[% setvar title Lazily evaluated list generation functions %]

This file is part of the Perl 6 Archive

Note: these documents may be out of date. Do not use as reference!

To see what is currently happening visit http://www.perl6.org/

TITLE

Lazily evaluated list generation functions

VERSION

  Maintainer: Jeremy Howard <j@howard.fm>
  Date: 10 Aug 2000
  Last Modified: 22 Sep 2000
  Mailing List: perl6-language-data@perl.org
  Number: 81
  Version: 4
  Status: Frozen

DISCUSSION

Not surprisingly, the controversial point of discussion for this RFC was about viability and efficiency of implementation. These points were more about the use of lazy evaluation in general, rather than generated lists in particular. The viability of lazy evaluation has been proven in other languages (both functional and procedural). The efficiency of generated lists will obviously depend on the implementation, but this RFC suggests some obvious optimisations for frequently used constructs (e.g. stepped slices).

The second major point of discussion was around syntax. Other languages that provide list comprehension do so with quite different syntax (for example, Haskell and Python v2). However, the syntax is these languages in not at all Perlish. The proposed syntax incorporates Perl 5 syntax and extends it using minimal additional notation.

ABSTRACT

This RFC proposes that the existing .. operator produce a lazily evaluated list. In addition, a new operation : is proposed that allows for the generation of lazily evaluated lists based on any Perl expression.

This proposal only discusses these operators in a list context. The current meaning of '..' in a scalar context is not affected.

CHANGES

Since v3

Since v2

Since v1

DESCRIPTION

This RFC proposes that Perl incorporate a broader tool box of list generation techniques:

These techniques would allow programs written in Perl to follow a structure familiar to programmers used to numerical programming environments. It would provide a more compact notation for many common mathematical algorithms, and give Perl important information to make key optimisations.

Lazy evaluation of generated lists

The .. of previous Perls is a list generation operator, which creates a list based on its parameters:

  ($start..$stop); # ($start, $start+inc, $start+2*inc, ... $stop)

where 'inc' is 1 if $start<$stop, or -1 otherwise. The list is generated as soon as it is declared. These makes some code rather inefficient:

  @a = (1..1000000);   # One million element list generated here
  print $a[999999];

creates a one million element list despite only using one element of it. Under lazy evaluation, elements of the list are only created when they are required, and saved for later use. In the previous example only $a[999999] would be calculated by interpolation (not sequentially) and stored when using lazy evaluation.

Lists, whether generated lazily or not, are assumed to be stable. That is, the value of $a[999999] will be the same everywhere in a program, unless @a itself is modified. This means that lazily evaluated lists provide a handy notation for memoization, as we will see later. It is proposed that once an element has been calculated in a list, that it is cached for use later rather than recalculated each time.

Lazy list elements get calculated when they are output, or used in an expression that is output. If list elements are not output then they are never calculated.

Introduction of : to generate arbitrary lists

It is proposed that a new operator be added to Perl's list generation arsenal, :. The ':' character is chosen because it reflects standard notation for array slicing, which is an important use of this operator.

: is only meaningful when called in a list context, generating a lazily evaluated list in one of 3 ways.

EXTENSIONS

If infinite lists are available in Perl 6 (see RFC 24), the $num_steps and $end arguments to the list generation operators can be null. This indicates that there are infinite elements in the list:

  @column1of3 = (1.. :3);   # (1,4,7,...)
  @all_even_numbers = (1: ^0 * 2:);   # (2,4,6,8,...)

JUSTIFICATION

One particularly important use of generated lists is for slicing arrays. This is difficult in Perl 5, where it is tackled by Perl Data Language (PDL). Note that currently (perl5) we have to say

  $n1 = $n-1;  # since we need to stringify
  $y = $x->slice("0:$n1:4");

This should be contrasted with the less cluttered syntax offered by numerical Python and commercial systems such as Matlab and IDL:

  y = x[0:n-1:4];
  

The syntax in this RFC would provide notation that is familiar to users of standard numerical programming tools, but is also a natural (and compatible) step from the existing Perl .. operator.

IMPLEMENTATION

It will be common in numerical programming to see multiple layers of indirection:

  @a = (1:5:100000);
  @b = getBigImage();
  @c = @a[@b];
  print $c[5];

In these cases Perl should consolidate as much as possible at compile time to avoid too much overhead.

When a lazy list is passed to a function it is not evaluated. The function can then access only the elements it needs, which are calculated as required. Furthermore, the arguments that generated the list are available as attributes of the list, and can therefore be used directly without actually accessing the list:

  @a = (1:100000:5);
  ($gen_type) = attributes::get(@a);   # 'increment'
  ($start_val, $end_val, $increment) = attributes::get(@a)[1..3];  # (1, 100000, 5)
  
  @first5PowersOf2 = (1..sub {$_[1]*2}:5);   # (1,2,4,8,16)
  ($gen_type) = attributes::get(@a);   # 'recursive'
  ($start_val, $gen_func, $num_steps) = attributes::get(@a)[1..3];
  
  @even_numbers = (1: ^0 * 2: 5);   # (2,4,6,8,10)
  ($gen_type) = attributes::get(@a);   # 'function'
  ($start_val, $gen_func, $num_steps) = attributes::get(@a)[1..3];  

For lists that are a combination of these:

  @b = (1:10:2 , 10:^0*3:30);

there are two potential ways of allowing introspection of the arguments that generated the list:

However, if lazy list elements are calculated efficiently on demand, the need for a programmer to access the list generation parameters at all should be limited. They are provided to make sure that 'hard things are possible'.

REFERENCES

RFC 23: Higher-order functions

RFC 24: Semi-finite (lazy) lists

RFC 82: Apply operators component-wise in a list context

RFC 205: New operator ';' for creating array slices

RFC 231: Data: Multi-dimensional arrays/hashes and slices

Memoization in Perl: perl.plover.com

List comprehension in Haskell: www.numeric-quest.com#List comprehension

Fethi Rabhi and Guy Lapalme: Algorithms: A functional programming approach, Addison-Wesley, 235 pages, paperback, 1999. ISBN 0-201-59604-0

ACKNOWLEDGEMENTS

Damian Conway: Reviewed first draft

Karl Glazebrook: Suggested splitting from infinite lists proposal

Christian Soeller: Original 'slice' RFC; suggested argument reordering

Dan Sugalski: Implementation ideas