This file is part of the Perl 6 Archive

Note: these documents may be out of date. Do not use as reference!

To see what is currently happening visit http://www.perl6.org/

TITLE

Arrays: Apply operators element-wise in a list context

VERSION

  Maintainer: Jeremy Howard <j@howard.fm>
  Date: 10 Aug 2000
  Last Modified: 21 Sep 2000
  Mailing List: perl6-language-data@perl.org
  Number: 82
  Version: 4
  Status: Frozen

DISCUSSION

The first source of discussion was whether array operations have any consistent meaning. This arose because some felt that other people might want * to mean something other than element-wise multiplication by default (e.g. matrix inner product). However, no-one actually said that they wanted it to work this way themselves, only that others, particularly mathematicians, might prefer it. The standard use of element-wise operations in mathematical programming languages such as Mathematica and J suggests that this is unlikely to be a source of confusion in practice.

The second source of discussion was around whether || and && should be an exception, as specified in RFC 45. This is discussed in detail in the CONFLICTS section.

ABSTRACT

It is proposed that in a list context, operators are applied element-wise to their arguments. Furthermore, it is proposed that this behaviour be extended to functions that do not provide a specific list context.

CHANGES

Since v3

Since v2

Since v1

DESCRIPTION

Currently, operators applied to lists in a list context behave counter-intuitively:

  @b = (1,2,3);
  @c = (2,4,6);
  @d = @b * @c;   # Returns (9) == scalar @b * scalar @c

This RFC proposes that operators in a list context should be applied element-wise to the elements of their arguments:

  @d = @b * @c;   # Returns (2,8,18)

If the lists are not of equal length, an error is raised.
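For comparison, the proposed behaviour can be approximated in Perl 5 today. The following is only a sketch: elementwise() is a hypothetical helper named here for illustration, and is not part of this RFC or of any existing module.

```perl
use strict;
use warnings;

# Hypothetical helper: apply a binary operation element-wise to two arrays
# (passed by reference), raising an error if their lengths differ.
sub elementwise {
    my ($op, $x, $y) = @_;
    die "Lists are not of equal length\n" unless @$x == @$y;
    return map { $op->($x->[$_], $y->[$_]) } 0 .. $#$x;
}

my @b = (1, 2, 3);
my @c = (2, 4, 6);
my @d = elementwise(sub { $_[0] * $_[1] }, \@b, \@c);
print "@d\n";   # prints "2 8 18"

# Mismatched lengths raise an error rather than silently truncating.
my $ok = eval { elementwise(sub { $_[0] * $_[1] }, \@b, [1, 2]); 1 };
print "error: $@" unless $ok;
```

Under this RFC the helper and the explicit index loop would both disappear, leaving only @b * @c.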

Multidimensional array operations

RFC 202 describes multidimensional arrays in Perl 6. Element-wise list operations also apply to multidimensional arrays:

  my int @mat1 = ([1,2],
                  [3,4]);
  my int @mat2 = ([2,2],
                  [1,1]);
  my @mat3 = @mat1 * @mat2;   # ([2,4],[3,4])
  

An error is raised if the two arrays do not have equal dimensions.
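A rough Perl 5 approximation, assuming the multidimensional arrays are represented as nested array references (RFC 202 may choose a different representation). elementwise_nd() is a hypothetical name used only for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: recurse through two nested array references in
# parallel, applying $op to each pair of leaves.
sub elementwise_nd {
    my ($op, $x, $y) = @_;
    if (ref $x eq 'ARRAY' && ref $y eq 'ARRAY') {
        die "Arrays do not have equal dimensions\n" unless @$x == @$y;
        return [ map { elementwise_nd($op, $x->[$_], $y->[$_]) } 0 .. $#$x ];
    }
    # One side is nested deeper than the other.
    die "Arrays do not have equal dimensions\n" if ref $x || ref $y;
    return $op->($x, $y);
}

my $mat1 = [ [1, 2], [3, 4] ];
my $mat2 = [ [2, 2], [1, 1] ];
my $mat3 = elementwise_nd(sub { $_[0] * $_[1] }, $mat1, $mat2);

my $shown = join ',', map { '[' . join(' ', @$_) . ']' } @$mat3;
print "$shown\n";   # prints "[2 4],[3 4]"
```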

Broadcasting

If an operator is used in a list context with one list (or multidimensional array), and one or more scalars, the scalars are treated as if they were an array of that scalar with the same dimensions as the array:

  my int @mat1 = ([1,2],
                  [3,4]);
  @e = @mat1 * 2;      # ([2,4],[6,8])
  @f = @mat1 * ([2,2],[2,2]); # Same thing

If one operand is a vector of the same bounds as the equivalent dimension of the other operand, the vector's elements are 'broadcast' across every other dimension of the other operand:

  my int @mat1 = ([1,2],
                  [3,4]);
  my int @vec1 = (2,3);     # 1st dimension
  @g = @mat1 * @vec1;       # ([2,4],[9,12])
  my int @vec2 = ([2],[3]); # 2nd dimension
  @h = @mat1 * @vec2;       # ([2,6],[6,12])

If the operands are a column vector and a row vector, the elements of each vector are combined into a two dimensional array:

  my int @vec1 = (2,3);     # 1st dimension
  my int @vec2 = ([2],[3]); # 2nd dimension
  @i = @vec1*@vec2;         # ([2*2,3*2],[2*3,3*3]) == ([4,6],[6,9])

Equivalent combinatorial broadcasting occurs if the operands are perpendicular planes (creating a cube), and so forth for higher dimensional arrays.
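Two of the broadcasting cases above can be sketched in Perl 5, again assuming nested array references. The helpers scale(), outer() and show() are hypothetical names for this illustration only, and outer() follows the row and column convention used in the examples of this section.

```perl
use strict;
use warnings;

# Hypothetical helper: multiply every leaf of a nested array by a scalar,
# as if the scalar were an array of the same dimensions.
sub scale {
    my ($x, $k) = @_;
    return ref $x eq 'ARRAY' ? [ map { scale($_, $k) } @$x ] : $x * $k;
}

# Hypothetical helper: combine a flat vector with a nested ([..],[..])
# vector into a two-dimensional array, one row per nested element.
sub outer {
    my ($flat, $nested) = @_;
    return [ map { my $n = $_->[0]; [ map { $_ * $n } @$flat ] } @$nested ];
}

# Hypothetical helper: render a two-dimensional array for printing.
sub show {
    my ($m) = @_;
    return join ',', map { '[' . join(' ', @$_) . ']' } @$m;
}

my $e = scale([ [1, 2], [3, 4] ], 2);
my $i = outer([2, 3], [ [2], [3] ]);

my ($e_str, $i_str) = (show($e), show($i));
print "$e_str\n";   # prints "[2 4],[6 8]"
print "$i_str\n";   # prints "[4 6],[6 9]"
```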

Element-wise functions

Functions that do not return a list should be treated in the same way:

  @e = (-1,1,-3);
  @f = abs(@e);   # Returns (1,1,3)

EXAMPLES

Text processing

If @first_names contains a list of people's first names, and @surnames contains their surnames, this creates a new list that concatenates the elements of the two lists:

  @full_names = @first_names . @surnames;

To quote a number of lines of a message by prefixing them all with '> ':

  @quoted_lines = '> ' . @raw_lines;

To create a histogram for a list of scores:

  @people = ('adam', 'eve ', 'bob ');
  @scores = (7,9,5);          # Score for each person
  @histogram = '#' x @scores; # Returns ('#######','#########','#####')
  print join("\n", @people . ' ' . @histogram);
  
  adam #######
  eve  #########
  bob  #####
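For reference, a Perl 5 version of the histogram needs an explicit map for each element-wise step. This is a sketch of the equivalent behaviour, not a proposed implementation; it uses '#' as the bar character.

```perl
use strict;
use warnings;

my @people = ('adam', 'eve ', 'bob ');
my @scores = (7, 9, 5);

# Element-wise 'x': repeat the bar character once per point scored.
my @histogram = map { '#' x $_ } @scores;

# Element-wise '.': join each name to its bar by index.
my @lines = map { $people[$_] . ' ' . $histogram[$_] } 0 .. $#people;
print join("\n", @lines), "\n";
```

The two map expressions correspond to the '#' x @scores and @people . ' ' . @histogram expressions of the proposed syntax.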
  

Number crunching

This snippet multiplies two arrays element-wise, adds a third, takes the absolute value of each result, and sums those values, in a very efficient way:

  @b = (1,2,3);
  @c = (2,4,6);
  @d = (-2,-4,-6);
  $sum = reduce ^_+^_, abs(@b * @c + @d);
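As a check on the arithmetic, the same calculation can be written in Perl 5 with reduce from List::Util. This is a sketch only; the arrays are renamed (an assumption of this example) to keep them visually distinct from the $a and $b variables that reduce uses.

```perl
use strict;
use warnings;
use List::Util qw(reduce);

# Arrays renamed from @b, @c, @d so they are not confused with the
# $a and $b variables that List::Util's reduce sets for its block.
my @xs = (1, 2, 3);
my @ys = (2, 4, 6);
my @zs = (-2, -4, -6);

# abs(@xs * @ys + @zs), written element by element.
my @terms = map { abs($xs[$_] * $ys[$_] + $zs[$_]) } 0 .. $#xs;

# reduce ^_+^_ corresponds to a reduce block that adds its two arguments.
my $sum = reduce { $a + $b } @terms;
print "@terms => $sum\n";   # prints "0 4 12 => 16"
```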

Lists can be reordered or sliced with list generation functions (RFC 81) allowing flexible data manipulation:

  @a = (3,6,9);
  @reverse = (2..0);
  @b = @a * @a[@reverse];   # (3*9, 6*6, 9*3) = (27,36,27)

Slicing plus array operations makes matrix algebra easy:

  @a = (1,2,3,
        2,4,6,
        3,6,9);
  @column1of3 = (0..6:3);   # (0,3,6) - every 3rd index from 0 to 6
  @row1of3 = (0..2);        # (0,1,2)
  $sum_col1_by_row1 = 
    sum ( @a[@column1of3] * @a[@row1of3] );  # (1*1+2*2+3*3)=14
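The slicing example can be approximated in Perl 5 as follows. This sketch assumes Perl 5's zero-based array indexing and uses sum from List::Util; the index lists are built with map because the stepped range syntax of RFC 81 does not exist in Perl 5.

```perl
use strict;
use warnings;
use List::Util qw(sum);

my @a = (1, 2, 3,
         2, 4, 6,
         3, 6, 9);

# Perl 5 arrays are indexed from 0, so the first column of this row-major
# 3x3 matrix sits at indices 0, 3, 6 and the first row at 0, 1, 2.
my @col_idx = map { $_ * 3 } 0 .. 2;
my @row_idx = (0 .. 2);

my @col = @a[@col_idx];   # (1, 2, 3)
my @row = @a[@row_idx];   # (1, 2, 3)

# Element-wise product of the two slices, then summed.
my $sum_col1_by_row1 = sum(map { $col[$_] * $row[$_] } 0 .. $#col);
print "$sum_col1_by_row1\n";   # prints "14"
```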
  

CONFLICTS

RFC 45 specifies alternative semantics for && and || in a list context, which is to evaluate the operands in a scalar context and then propagate the result to the left-hand side in a list context. The authors have been unable to resolve this conflict.

This author does not support RFC 45, partly because it breaks consistency with RFC 82, and partly because it makes || and && act in a weird cross between scalar and list context (evaluates in scalar context, propagates in list context).

Also, boolean operations are frequently applied to lists of elements. There is really nothing about this operation that makes it only useful for scalars. For example:

  @mask = (1,0,1,0);
  @a = (4,3,2,1);
  @b = ('a','b','c','d');

  @mult_a = @mask * @a;   # (4,0,2,0)
  @mask_a = @mask && @a;  # (4,0,2,0)
  @mask_b = @mask && @b;  # ('a',0,'c',0)
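The masking results can be reproduced in Perl 5 by applying the scalar && to each pair of elements. This is a sketch of the intended element-wise semantics only. The closures in @lazy_a are an assumption of this example, added to count how many right-hand elements are actually evaluated, since scalar && skips its right operand when the left is false.

```perl
use strict;
use warnings;

my @mask = (1, 0, 1, 0);
my @a    = (4, 3, 2, 1);
my @b    = ('a', 'b', 'c', 'd');

# Scalar && returns the left operand if it is false, else the right operand.
my @mask_a = map { $mask[$_] && $a[$_] } 0 .. $#mask;   # (4, 0, 2, 0)
my @mask_b = map { $mask[$_] && $b[$_] } 0 .. $#mask;   # ('a', 0, 'c', 0)

# Right-hand elements wrapped in closures, to count how many get evaluated.
my $evaluated = 0;
my @lazy_a = map { my $v = $_; sub { $evaluated++; return $v } } @a;
my @mask_lazy = map { $mask[$_] && $lazy_a[$_]->() } 0 .. $#mask;

print "@mask_a | @mask_b | evaluated $evaluated\n";
# prints "4 0 2 0 | a 0 c 0 | evaluated 2"
```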

Finally, the control-flow/short-circuiting side effect of || and && would be very useful for array operations. In lazily generated lists, using || or && for an element-wise operation would avoid computing elements of the RHS where not necessary to do so. RFC 45's proposal really only talks about flow-control (rather than boolean operations), but flow control statements can be used directly:

  @a = @c unless @a = @b;

achieves the same thing as RFC 45's proposed

  @a = @b || @c

without the double evaluation of @b caused by

  @a = @b ? @b : @c

IMPLEMENTATION

These operators and functions should be evaluated lazily. For instance:

  @b = (1,2,3);
  @c = (2,4,6);
  @d = (-2,-4,-6);
  $sum = reduce ^_+^_, @b * @c + @d;
  

should be evaluated as if it read:

  $sum = 0;
  $sum += $b[$_] * $c[$_] + $d[$_] for (0..$#b);

That is, no temporary list is created, and only one loop is required.

The proposal to handle functions is tricky, since there is currently no obvious way to see whether a function is going to return a list. For instance, in the case:

  @b = abs(@a);
  

we either need some kind of more advanced prototyping (or other way of creating a signature) so that Perl knows to apply abs() to the elements of @a, or we need to manually change abs to check for list context and Do The Right Thing.
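The second option, checking for list context by hand, can be sketched in Perl 5 with the built-in wantarray. list_abs() is a hypothetical wrapper used for illustration, and its scalar-context behaviour (taking only the first argument) is an assumption of this sketch rather than something this RFC specifies.

```perl
use strict;
use warnings;

# Hypothetical wrapper: element-wise in list context, ordinary abs of the
# first argument otherwise.
sub list_abs {
    my @args = @_;
    return wantarray ? (map { abs($_) } @args) : abs($args[0]);
}

my @a = (-1, 1, -3);
my @b = list_abs(@a);    # list context:   (1, 1, 3)
my $s = list_abs(-5);    # scalar context: 5
print "@b $s\n";         # prints "1 1 3 5"
```

A prototype or signature based approach would avoid writing this check in every function, which is why the first option may scale better.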

REFERENCES

The Mathematica Navigator, Heikki Ruskeepää, Academic Press, ISBN 0-12-603640-3, p383.

Expression Templates (C++ Implementation): extreme.indiana.edu eldhui/papers/techniques/techniques01.html#l32

Implementation in Perl Data Language: pdl.perl.org

Array operations in Numerical Python: starship.python.net#SEC10

RFC 76: reduce

RFC 23: Higher-order functions

RFC 81: Lazily evaluated list generation functions

RFC 202: Overview of multidimensional array RFCs