[% setvar title Pattern matching on perl values %]
Note: these documents may be out of date. Do not use as reference! |
To see what is currently happening visit http://www.perl6.org/
Pattern matching on perl values
Maintainer: Steve Fink <steve@fink.com> Date: 19 Sep 2000 Mailing List: perl6-language@perl.org Number: 261 Version: 1 Status: Developing
A pattern matching primitive that operates on perl values would allow sophisticated data slicing and dicing to be specified concisely.
Best illustrated by examples:
# Nearly identical to current behavior my ($a, $b, $c) = @_; # Changes current behavior. $b is set only if ! defined $_[0] my (undef, $b) = @_; # Nearly equivalent to the present my (undef, $b) = @_ my (?, $b) = @_ # Equiv to my $name = $hashref->{name}; my @kids = @{$hashref->{children}} my { 'name' => $name, children => [ @kids ] } = $hashref; # Operationally equiv to my $b = $_[1]->{food}; # my ($a) = grep { $_[1]->{$_} eq 'fish' } keys %{$_[1]} # even if $_[1]->{food} eq 'fish' my (undef, { $a => 'fish', food => $b }) = @_; # $a is an arbitrarily chosen key of %$h, and $b is its value. # $c is a key whose value is a hash ref containing 'needle' as a key. # In this and all other examples, if any pattern match fails, # the whole match fails and does not assign any variables. my { $a => $b, $c => { needle => ? } } = $h; # If you are not declaring variables, 'match' performs the same operation # for existing variables. This one sets $a to $foo[0] unless @foo==0. match ($a) = @foo; # Constants can be matched against too. match { 'Joe' => ? } = $h or die "Hash does not contain Joe"; # Equiv to scalar(grep { $_ == 1 } @list) match (..., 1, ...) = @list; # Pretty close to ($idx) = grep { $_[$_] == 1 } @_; $b = $_[$idx+1]; match (..., 1, $b) = @_; # It gets worse! This gives the value associated with a key matching the # regular expression a*b: match { /a*b/ => $value } = \%h; # And if you want to know what the key was: match { $key = /a*b/ => $value } = \%h; # What if you want to grab out the index? This is like # ($i) = grep { $list[$_] =~ /foo/ } 0..$#list match ( $i => /foo/ ) = @list; # If you don't want to require the trailing parts of the pattern to match: match ($a, 1, $b, :optional $c, 2, $d) = @_; # Equivalent to my Dog $puppy = $h->{puppy}, whatever my Dog $puppy # is chosen to mean. my { 'puppy' => Dog $puppy } = $h; # Equiv to my Dog $temp = $h->{puppy}; $puppy = $temp; # (Performs whatever assertion checking my Dog $var would normally do.) match { 'puppy' => Dog $puppy } = $h; # Depending on the final meaning of my Dog $puppy, this may search for # something satisifying the my Dog $puppy assertions. So it would be # similar to # while (($k,$v) = each %$h) { # ($slot,$puppy) = ($k,$v), last if UNIVERSAL::isa($v, 'Dog'); # } my { $slot => Dog $puppy } = $h;
The syntax is
MATCH-EXPR : (my|match) PATTERN = EXPR PATTERN : '{' HASH-PATTERN-LIST '}' | '(' LIST-PATTERN-LIST ')' | '[' LIST-PATTERN-LIST ']' | 'undef' | '?' | '...' | VARIABLE | CONSTANT | REGEX | VARIABLE '=' PATTERN HASH-PATTERN : PATTERN '=>' PATTERN HASH-PATTERN-LIST : /* empty */ | HASH-PATTERN ',' HASH_PATTERN-LIST | ':optional' HASH-PATTERN ',' HASH_PATTERN-LIST LIST-PATTERN-LIST : /* empty */ | LIST-PATTERN ',' LIST-PATTERN-LIST | ':optional' LIST-PATTERN ',' LIST-PATTERN-LIST LIST-PATTERN : PATTERN | PATTERN => PATTERN
Both my
and match
return the number of variables successfully
matched. The behavior of
my $a = 1 if cond();
is changed to mean
my $a; $a = 1 if cond();
So that you can easily test the success of pattern matching:
my { $a => 'Bob' } = \%h or warn "Bob ain't here!";
For the case of match (...)
, each PATTERN is matched against the
corresponding element of the right hand side (rhs). If the rhs runs
out of elements, the whole match fails and the return value is
undefined. For example, match (@a, $b) = @_
will always leave both
@a and $b unaffected because @a will have eaten all of @_.
For the case of match {...}
, the rhs must be a reference to a hash.
Conceptually, each (key,value) pair is matched against all of the
HASH-PATTERN key and value pattern pairs in turn. If multiple
(key,value) pairs match the same PATTERN=>PATTERN pair, which ones are
matched to which is undefined (multiple patterns may be considered as
matching the same (key,value) pair, and vice versa.)
undef
now means that the matching value MUST be undefined, rather
than serving as a "don't care" placeholder as it does now in some
contexts. The placeholder function is taken over by ?
. ...
is a
variable-length placeholder matching zero or more list elements.
When using :optional
and no variables are matched, the special
value "0 but true" is returned.
It might be preferable to just die() on a failed match, although that would make it harder to do some of the grep replacements.
Forcing patterns to match unique list or hash elements would make this more powerful, but would make it harder to implement, slower, and possibly more confusing.
It would be very useful to match against a hash (not just a hash reference). Suggestions for syntax? This would allow
sub new { my ( __PACKAGE__ $self, %{ PeerAddr => $addr, :optional LocalAddr => $localaddr } ) = @_;
It might be better to leave my
unchanged and either only provide
match
or provide another keyword for both matching and creating the
lexical variables. If that were done, then we might also consider using
EXPR =~ PATTERN
rather than PATTERN = EXPR
.
my (undef, $a, undef) = @_;
would be rewritten
my (:optional ?, $a, ?) = @_;
In other words, all undef
's in my()
lists would be changed to
?
, and :optional
would be placed at the beginning of the list.
This is still a buggy migration, because pattern matching returns the
number of variables assigned, while perl5 returns the rhs list padded
to the number of elements on the lhs, or something like that. Avoiding
the use of my
for pattern matching would solve this, as would
making a perl5_my
operator.
Fairly straightforward. I long ago prototyped a simplified version of this in perl5, though it required you to quote the pattern and would only assign to global variables. In order to be useful, though, this needs to be implemented in C.
RFC 22: Control flow: Builtin switch statement
This feels similar to Damian Conway's switch
operator, and some
unification might be useful. I doubt the fully generalized union would
be as useful as two specialized primitives.
RFC 156: Replace first match function (?...?
) with a flag to the
match command
This is necessary to avoid parsing ambiguity with the ? placeholder.
RFC 164: Replace =~, !~, m//, s///, and tr// with match(), subst(), and trade()
This also uses the match
keyword, though either RFC could easily
switch to another.
RFC 170: Generalize =~ to a special "apply-to" assignment operator
This would provide the EXPR =~ match PATTERN syntax, similar to EXPR =~ PATTERN above.
RFC 218: my Dog $spot
is just an assertion
I've been assuming this or something similar in the semantics described.