[% setvar title Arrays: merge() and unmerge() %]

This file is part of the Perl 6 Archive

Note: these documents may be out of date. Do not use as reference!

To see what is currently happening visit http://www.perl6.org/

TITLE

Arrays: merge() and unmerge()

VERSION

  Maintainer: Jeremy Howard <j@howard.fm>
  Date: 10 Aug 2000
  Last Modified: 21 Sep 2000
  Mailing List: perl6-language-data@perl.org
  Number: 90
  Version: 4
  Status: Frozen

DISCUSSION

Two major issues were discussed. One was that merge and unmerge were not deserving of being placed in the core. In the end this is a judgement call, but merge() in particular is fundamental to programming in a functional style, to manipulating multidimensional matrices, and for iterating through multiple arrays simultaneously. Furthermore, implementing optimised aliasing behaviour as proposed may not be possible in a module (depending on what extension mechanisms are in Perl 6).

The other issue discussed was whether the aliasing behaviour is appropriate, and achievable. A section has been added to this RFC discussing this.

ABSTRACT

It is proposed that two new functions, merge, and unmerge, be added to Perl. merge(@list1, @list2, ...) would return a list that interleaved its arguments. unmerge($num_lists, @list) would reverse this operation. Both functions would return an alias into the original list, not a copy of the elements.

CHANGES

Since v3

Since v2

Since v1

DESCRIPTION

It is proposed that Perl implement a function called merge that interleaves the arguments of arrays together, and is evaluated lazily. For instance:

  @a = (1,3,5);
  @b = (2,4,6);
  @merged_list = merge(@a,@b);   # (1,2,3,4,5,6)

This makes it easy to operate on multiple lists using flexible reduction functions:

  $sum_xy = sub {reduce ^last+^x*^y, merge($_[0], $_[1])};
  print $sum_xy->(@a, @b);   # Prints '44', i.e. 1*2+3*4+5*6

In order to reverse this operation we need an unmerge function:

  @merged_list = merge(@a,@b);   # (1,2,3,4,5,6)
  @unmerged_list = unmerge(2, @merged_list);   # ([1,3,5], [2,4,6])

The second argument to unmerge is the number of lists that are to be created (i.e. the number of lists that would have been merged for this to reverse the operation).

If the list to be unmerged is not an exact multiple of the partition size, the final list references are not padded--their length is one less than the list size. For example:

  @list = (1..7);
  @unmerged_list2 = unmerge(3, @list);   # ([1,4,7], [2,5], [3,6])

Both merge and <unmerge> do not make a copy of the elements of their arguments; they simply create an alias to them:

  @a = (1,3,5);
  @b = (2,4,6);
  @merged_list = merge(@a,@b);   # (1,2,3,4,5,6)
  $merged_list[1] = 0;
  @b == (0,4,6);                 # True
    

IMPLEMENTATION

The merge and unmerge functions should be evaluated lazily.

merge and unmerge return an alias into the original list, not a copy of the elements.

Effectively, merge creates an iterator over multiple lists. If used as part of a reduction, the actual interleaved list need never be created. For instance:

  $sum_xy = sub {reduce ^last+^x*^y, merge($_[0], $_[1])};
  $answer = $sum_xy->(@a, @b);
  

should be evaluated as if it read:

  $answer = 0;
  $answer += $a[$_] * $b[$_] for (0..$#a-1));
  

which does not need to create an intermediate list.

Aliasing implementation

The proposed aliasing behaviour is also proposed for part()/flatten() (see "RFC 91") and reshape() (see "RFC 148"). This requires some more thought. Perl's current slicing operation @a[$x1, $x2] only aliases when used in an lvalue context:

  @a = (4,5,6);
  @b = @a[1,2];
  @b[0] = 9;       # @a == (4,5,6)
  @a[1,2] = (8,9); # @a == (4,8,9)

We could do the same for merge() and friends. The downside is that:

  @transpose = part(
    # Find the size of each column
    scalar @list_of_lists,
    # Interleave the rows
    merge(@list_of_lists);
  )

and similar expressions would do an awful lot of copying. Ideally if merge() didn't alias in an rvalue context, Perl would still optimise away multiple merge()s, part()s, slices, and so forth so that only one copy occurred.

If the aliasing behaviour is implemented, then assigning an aliased array to another array should result in a copy being created:

  @a = (1,3,5);
  @b = (2,4,6);
  @merged_list = merge(@a,@b);   # Just an alias into @a and @b
  @copied_list = @merged_list;   # Does an actual array copy

Alternatively, a copy-on-write optimisation could allow some of the efficiency of full aliasing to be combined with the simplicity of Perl 5's alias-in-lvalue behaviour.

REFERENCES

RFC 23: Higher order functions

RFC 76: Builtin: reduce

RFC 91: Arrays: part() and flatten()

RFC 148: Arrays: Add reshape() for multi-dimensional array reshaping