Note: these documents may be out of date. Do not use as reference!
To see what is currently happening visit http://www.perl6.org/
Arrays: Apply operators element-wise in a list context
    Maintainer: Jeremy Howard <j@howard.fm>
    Date: 10 Aug 2000
    Last Modified: 21 Sep 2000
    Mailing List: perl6-language-data@perl.org
    Number: 82
    Version: 4
    Status: Frozen
The first source of discussion was whether array operations have any consistent meaning. Some felt that other people may want * to mean something other than element-wise multiplication by default (e.g. matrix inner product). However, no one actually said that they wanted it to work this way, only that others, particularly mathematicians, may prefer it. The standard use of element-wise operations in mathematical programming languages such as Mathematica and J suggests that this is unlikely to be a source of confusion in practice.
The second source of discussion was around whether || and && should be an exception, as specified in RFC 45. This is discussed in detail in the CONFLICTS section.
It is proposed that in a list context, operators are applied element-wise to their arguments. Furthermore, it is proposed that this behaviour be extended to functions that do not provide a specific list context.
Currently, operators applied to lists in a list context behave counter-intuitively:
    @b = (1,2,3);
    @c = (2,4,6);
    @d = @b * @c;   # Returns (9) == scalar @b * scalar @c
This RFC proposes that operators in a list context should be applied element-wise to the elements of their arguments:
@d = @b * @c; # Returns (2,8,18)
If the lists are not of equal length, an error is raised.
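The proposed syntax cannot be run, so the rule is sketched here in Python. This is only an emulation under stated assumptions: the helper name `elementwise` is hypothetical and is not part of this RFC.

```python
import operator

def elementwise(op, left, right):
    """Apply a binary operator pairwise; unequal lengths are an error."""
    if len(left) != len(right):
        raise ValueError(f"length mismatch: {len(left)} vs {len(right)}")
    return [op(x, y) for x, y in zip(left, right)]

b = [1, 2, 3]
c = [2, 4, 6]
d = elementwise(operator.mul, b, c)
print(d)  # [2, 8, 18]
```

Calling `elementwise(operator.mul, [1, 2], [1])` raises `ValueError`, mirroring the error the RFC requires for lists of unequal length.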
RFC 202 describes multidimensional arrays in Perl 6. Element-wise list operations also apply to multidimensional arrays:
    my int @mat1 = ([1,2], [3,4]);
    my int @mat2 = ([2,2], [1,1]);
    my @mat3 = @mat1 * @mat2;   # ([2,4],[3,4])
An error is raised if the two arrays do not have equal dimensions.
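One way to picture the multidimensional case is a recursion over nested lists. The following Python sketch assumes arrays are represented as nested lists; `elementwise_nd` is a hypothetical name used only for illustration.

```python
import operator

def elementwise_nd(op, left, right):
    """Recurse through nested lists, applying op to matching leaves."""
    left_is_list = isinstance(left, list)
    right_is_list = isinstance(right, list)
    if left_is_list != right_is_list:
        raise ValueError("operands differ in number of dimensions")
    if not left_is_list:
        return op(left, right)  # both operands are scalars here
    if len(left) != len(right):
        raise ValueError("operands differ in size along a dimension")
    return [elementwise_nd(op, x, y) for x, y in zip(left, right)]

mat1 = [[1, 2], [3, 4]]
mat2 = [[2, 2], [1, 1]]
mat3 = elementwise_nd(operator.mul, mat1, mat2)
print(mat3)  # [[2, 4], [3, 4]]
```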
If an operator is used in a list context with one list (or multidimensional array), and one or more scalars, the scalars are treated as if they were an array of that scalar with the same dimensions as the array:
    my int @mat1 = ([1,2], [3,4]);
    @e = @mat1 * 2;                 # ([2,4],[6,8])
    @f = @mat1 * ([2,2],[2,2]);     # Same thing
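Scalar promotion does not require materialising an array of copies. As a rough Python sketch (nested lists stand in for arrays, and `broadcast_scalar` is a hypothetical helper name), the scalar is simply applied at every leaf:

```python
import operator

def broadcast_scalar(op, array, scalar):
    """Apply op between every leaf of a nested list and a single scalar."""
    if isinstance(array, list):
        return [broadcast_scalar(op, item, scalar) for item in array]
    return op(array, scalar)

mat1 = [[1, 2], [3, 4]]
e = broadcast_scalar(operator.mul, mat1, 2)
print(e)  # [[2, 4], [6, 8]]
```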
If one operand is a vector of the same bounds as the equivalent dimension of the other operand, the vector's elements are 'broadcast' across every other dimension of the other operand:
    my int @mat1 = ([1,2], [3,4]);
    my int @vec1 = (2,3);       # 1st dimension
    @g = @mat1 * @vec1;         # ([2,4],[9,12])
    my int @vec2 = ([2],[3]);   # 2nd dimension
    @h = @mat1 * @vec2;         # ([2,6],[6,12])
If the operands are a column vector and a row vector, the elements of each vector are combined into a two dimensional array:
    my int @vec1 = (2,3);       # 1st dimension
    my int @vec2 = ([2],[3]);   # 2nd dimension
    @i = @vec1*@vec2;           # ([2*2,3*2],[2*3,3*3]) == ([4,6],[6,9])
Equivalent combinatorial broadcasting occurs if the operands are perpendicular planes (creating a cube), and so forth for higher dimensional arrays.
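The row-by-column combination can be sketched in Python as follows. This assumes the representation used in the example above (a flat list for the first vector, a list of one-element lists for the second); `combine` is a hypothetical name.

```python
import operator

def combine(op, flat_vec, nested_vec):
    """Pair every element of flat_vec with every element of nested_vec."""
    return [[op(x, y[0]) for x in flat_vec] for y in nested_vec]

vec1 = [2, 3]        # flat vector, as in @vec1
vec2 = [[2], [3]]    # one-element rows, as in @vec2
i = combine(operator.mul, vec1, vec2)
print(i)  # [[4, 6], [6, 9]]
```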
Functions that do not return a list should be treated in the same way:
    @e = (-1,1,-3);
    @f = abs(@e);   # Returns (1,1,3)
If @first_names contains a list of people's first names, and @surnames contains their surnames, this creates a new list that concatenates the corresponding elements of the two lists:
@full_names = @first_names . @surnames;
To quote a number of lines of a message by prefixing them all with '> ':
@quoted_lines = '> ' . @raw_lines;
To create a histogram for a list of scores:
    @people = ('adam', 'eve ', 'bob ');
    @scores = (7,9,5);              # Score for each person
    @histogram = '#' x @scores;     # Returns ('#######','#########','#####')
    print join("\n", @people . ' ' . @histogram);

    adam #######
    eve  #########
    bob  #####
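For comparison, the same histogram can be built today with explicit element-wise steps. This Python sketch is only an illustration of what the proposed operators would do implicitly:

```python
people = ["adam", "eve ", "bob "]
scores = [7, 9, 5]

# '#' x @scores: repeat the scalar once per score, element by element
histogram = ["#" * n for n in scores]

# @people . ' ' . @histogram: the scalar ' ' is applied to every pair
lines = [name + " " + bar for name, bar in zip(people, histogram)]
print("\n".join(lines))
```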
This snippet multiplies two arrays element-wise, adds a third, takes the absolute value of each resulting element, and sums the results in a single pass:
    @b = (1,2,3);
    @c = (2,4,6);
    @d = (-2,-4,-6);
    $sum = reduce ^_+^_, abs(@b * @c + @d);
Lists can be reordered or sliced with list generation functions (RFC 81) allowing flexible data manipulation:
    @a = (3,6,9);
    @rev = (2..0);          # (2,1,0)
    @b = @a * @a[@rev];     # (3*9, 6*6, 9*3) = (27,36,27)
Slicing plus array operations makes matrix algebra easy:
    @a = (1,2,3,
          2,4,6,
          3,6,9);
    @column1of3 = (0..6:3);   # (0,3,6) - every 3rd index from 0 to 6
    @row1of3 = (0..2);        # (0,1,2)
    $sum_col1_by_row1 = sum ( @a[@column1of3] * @a[@row1of3] );   # (1*1+2*2+3*3)=14
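The same first-column by first-row product can be checked with ordinary slicing. In this Python sketch the 3x3 matrix is stored row by row in a flat list and indexed from zero; the variable names are illustrative only.

```python
a = [1, 2, 3,
     2, 4, 6,
     3, 6, 9]

column1 = a[0:7:3]   # indices 0, 3, 6: the first column
row1 = a[0:3]        # indices 0, 1, 2: the first row

# Element-wise product followed by a sum
sum_col1_by_row1 = sum(x * y for x, y in zip(column1, row1))
print(sum_col1_by_row1)  # 14
```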
RFC 45 specifies alternative semantics for && and || in a list context, which is to evaluate the operands in a scalar context and then propagate the result to the left-hand side in a list context. The authors have been unable to resolve this conflict.
This author does not support RFC 45, partly because it breaks consistency with RFC 82, and partly because it makes || and && act in a weird cross between scalar and list context (evaluates in scalar context, propagates in list context).
Also, boolean operations are frequently applied to lists of elements. There is really nothing about this operation that makes it only useful for scalars. For example:
    @mask = (1,0,1,0);
    @a = (4,3,2,1);
    @b = ('a','b','c','d');
    @mult_a = @mask * @a;    # (4,0,2,0)
    @mask_a = @mask && @a;   # (4,0,2,0)
    @mask_b = @mask && @b;   # ('a',0,'c',0)
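Python's `and` returns the last operand it evaluates, much as Perl's && does, so the masking behaviour can be approximated with a comprehension. This is a sketch of the intended results, not of the proposed syntax:

```python
mask = [1, 0, 1, 0]
a = [4, 3, 2, 1]
b = ["a", "b", "c", "d"]

mult_a = [m * x for m, x in zip(mask, a)]
# `m and x` yields m when m is false, otherwise x
mask_a = [m and x for m, x in zip(mask, a)]
mask_b = [m and x for m, x in zip(mask, b)]

print(mult_a)  # [4, 0, 2, 0]
print(mask_a)  # [4, 0, 2, 0]
print(mask_b)  # ['a', 0, 'c', 0]
```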
Finally, the control-flow/short-circuiting side effect of || and && would be very useful for array operations. In lazily generated lists, using || or && for an element-wise operation would avoid computing elements of the RHS where not necessary to do so. RFC 45's proposal really only talks about flow-control (rather than boolean operations), but flow control statements can be used directly:
@a = @c unless @a = @b;
achieves the same thing as RFC 45's proposed
@a = @b || @c
without the double evaluation of @b caused by
@a = @b ? @b : @c
These operators and functions should be evaluated lazily. For instance:
    @b = (1,2,3);
    @c = (2,4,6);
    @d = (-2,-4,-6);
    $sum = reduce ^_+^_, @b * @c + @d;
should be evaluated as if it read:
    $sum = 0;
    $sum += $b[$_] * $c[$_] + $d[$_] for (0..$#b);
That is, no temporary list is created, and only one loop is required.
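Generator expressions give a feel for this kind of fused evaluation. In the Python sketch below (an analogy, not the proposed implementation), each term is produced on demand and consumed by the reduction, so no intermediate list for `b * c` or `b * c + d` is ever built:

```python
from functools import reduce
import operator

b = [1, 2, 3]
c = [2, 4, 6]
d = [-2, -4, -6]

# Lazily yields 1*2-2, 2*4-4, 3*6-6 one at a time
terms = (x * y + z for x, y, z in zip(b, c, d))

# A single loop drives both the arithmetic and the summation
total = reduce(operator.add, terms, 0)
print(total)  # 16
```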
The proposal to handle functions is tricky, since there is currently no obvious way to see whether a function is going to return a list. For instance, in the case:
@b = abs(@a);
we either need some kind of more advanced prototyping (or other way of creating a signature) so that Perl knows to apply abs() to the elements of @a, or we need to manually change abs to check for list context and Do The Right Thing.
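The second option, teaching a scalar function to detect a list argument, can be sketched as a wrapper. This Python illustration is an assumption about one possible shape of the idea; `list_aware` is a hypothetical name and nothing here is prescribed by the RFC.

```python
import functools

def list_aware(func):
    """Wrap a scalar function so that list arguments are mapped element-wise."""
    @functools.wraps(func)
    def wrapper(arg):
        if isinstance(arg, list):
            # Recurse so that nested (multidimensional) lists also work
            return [wrapper(item) for item in arg]
        return func(arg)
    return wrapper

abs_elementwise = list_aware(abs)

print(abs_elementwise([-1, 1, -3]))  # [1, 1, 3]
print(abs_elementwise(-5))           # 5
```

A signature-based approach would move the same decision out of the function body and into the language's dispatch.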
The Mathematica Navigator, Heikki Ruskeepää, Academic Press, ISBN 0-12-603640-3, p383.
Expression Templates (C++ Implementation): extreme.indiana.edu eldhui/papers/techniques/techniques01.html#l32
Implementation in Perl Data Language: pdl.perl.org
Array operations in Numerical Python: starship.python.net#SEC10
RFC 76: reduce
RFC 23: Higher-order functions
RFC 81: Lazily evaluated list generation functions
RFC 202: Overview of multidimensional array RFCs