This file is part of the Perl 6 Archive

Note: these documents may be out of date. Do not use as reference!

To see what is currently happening visit http://www.perl6.org/

TITLE

Alternative lists and quoting of things

VERSION

  Maintainer: Richard Proctor <richard@waveney.org>
  Date: 27 Aug 2000
  Last Modified: 1 Oct 2000
  Mailing List: perl6-language-regex@perl.org
  Number: 166
  Version: 4
  Status: Frozen

ABSTRACT

Expand Alternate Lists from Arrays and Quote the contents of things inside regexes.

DESCRIPTION

These are a couple of constructs to make it easy to build up regexes from other things.

Alternative Lists from arrays

The basic idea is to expand an array as a list of alternatives. There are two possible syntaxes: (?@foo) and just plain @foo. Plain @foo might conceivably have existing uses, so I prefer the (?@foo) syntax.

(?@foo) is just syntactic sugar for (?:(??{ join('|',@foo) })), a bracketed list of alternatives. If it is built at regex compile time, it may be closer to @{[ join('|',@foo) ]}.
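As an illustration, the expansion described above can be written by hand in Perl 5. This is only a sketch of the intended behaviour, not the proposed implementation; the contents of @foo and the names $alternatives and $re are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical sample data; any list of plain words would do.
my @foo = qw(cat dog bird);

# Hand-written equivalent of what (?@foo) is meant to expand to:
# the elements joined with '|' inside a non-capturing group.
my $alternatives = join '|', @foo;
my $re = qr/\A(?:$alternatives)\z/;

for my $word (qw(dog fish)) {
    my $verdict = $word =~ $re ? 'matches' : 'does not match';
    print "$word $verdict\n";
}
```

The non-capturing group matters here: without it, the anchors would bind only to the first and last alternatives.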

Quoting the contents of things

If a regex uses $foo or @bar, there are problems if the contents of the variables contain special characters. What is needed is a way of \Quoting the contents of scalars $foo or arrays (?@foo).

Suggested syntax:

(?Q$foo) Quotes the contents of the scalar $foo - equivalent to (??{ quotemeta $foo }).

(?Q@foo) Quotes each item in a list (as above); this is equivalent to (?:(??{ join('|', map quotemeta, @foo) })).
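To show why the quoting matters, here is a small Perl 5 sketch comparing an unquoted alternation with a quoted one. The data in @foo and the variable names are assumptions chosen for the example; only join, map and quotemeta are existing Perl built-ins.

```perl
use strict;
use warnings;

# Hypothetical data containing regex metacharacters.
my @foo = ('a.b', 'c+d', '1|2');

# Unquoted: '.', '+' and '|' keep their special regex meaning.
my $raw    = join '|', @foo;
# Quoted: each element is escaped first, roughly what (?Q@foo) asks for.
my $quoted = join '|', map { quotemeta } @foo;

my $raw_re    = qr/\A(?:$raw)\z/;
my $quoted_re = qr/\A(?:$quoted)\z/;

for my $text ('axb', 'c+d', '1|2') {
    my $r = $text =~ $raw_re    ? 'yes' : 'no';
    my $q = $text =~ $quoted_re ? 'yes' : 'no';
    print "$text: unquoted=$r quoted=$q\n";
}
```

With the unquoted pattern, 'axb' matches although it is not in the list, and the literal strings 'c+d' and '1|2' fail to match; the quoted pattern matches exactly the three list elements.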

In this syntax the Q is used as it represents a more intelligent \Quot\E.

It is recognised that (?Q$foo) is equivalent to \Q$foo\E, but that does not make it a bad idea to add it at the same time as (?Q@foo), for reasons of symmetry and Perl DWIM.

It is recognised that (?Q might be reserved for control of a hypothetical Q flag, but this does feel "appropriate" as it is about \Quoting.
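For reference, the existing scalar forms can be compared in Perl 5 as follows. This is a minimal sketch; the value of $foo and the variable names are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical scalar containing regex metacharacters.
my $foo = '3.14 (approx)';

# Existing Perl 5 spelling: \Q...\E quotes the interpolated value.
my $via_escape = qr/\A\Q$foo\E\z/;

# Explicit spelling with the quotemeta built-in.
my $escaped      = quotemeta $foo;
my $via_function = qr/\A$escaped\z/;

for my $text ('3.14 (approx)', '3x14 (approx)') {
    my $e = $text =~ $via_escape   ? 'yes' : 'no';
    my $f = $text =~ $via_function ? 'yes' : 'no';
    print "$text: \\Q..\\E=$e quotemeta=$f\n";
}
```

Both patterns match only the literal string held in $foo, which is the behaviour (?Q$foo) is meant to share.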

Comments

Hugo:

  > (?@foo) and (?Q@foo) are both things I've wanted before now. I'm
  > not sure if this is the right syntax, particularly if RFC 112 is
  > adopted: it would be confusing to have (?@foo) to have so
  > different a meaning from (?$foo=...), and even more so if the
  > latter is ever extended to allow (?@foo=...).

  > I see no reason that implementation should cause any problems
  > since this is purely a regexp-compile time issue.

Me: I can't see any reasonable meaning for (?@foo=...). This seems an appropriate syntax, but I am open to other suggestions.

CHANGES

V1 of this RFC had three ideas, one has been dropped, the other is now part of RFC 198.

V2 Expands the list expansion and quoting with quoting of scalars and implementation issues.

V3 In error, what should have been RFC 165 V2 was issued as 166 V2, so this is V3 with a change in (?Q$foo). This is in a pre-frozen state.

V4 Added a couple of minor changes from Hugo and frozen.

MIGRATION

(?@foo) and (?Q...) are pure additions without any compatibility issues.

The option of just @foo for list expansion might represent a small problem if people already use the construct.

IMPLEMENTATION

Both of these changes are regex compile-time issues.

Generating lists from arrays almost works by localising $" as '|' for the regex and just using @foo.
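A minimal Perl 5 sketch of this approach follows, assuming made-up data in @foo and @tricky. It also shows why it only "almost" works: the group has to be written by hand and the elements are not quoted.

```perl
use strict;
use warnings;

my @foo = qw(cat dog bird);   # hypothetical sample data

my $re = do {
    local $" = '|';           # list separator used when @foo is interpolated
    qr/\A(?:@foo)\z/;         # interpolates as cat|dog|bird
};

print "dog: ",  ('dog'  =~ $re ? 'yes' : 'no'), "\n";
print "fish: ", ('fish' =~ $re ? 'yes' : 'no'), "\n";

# The elements are interpolated unquoted, so metacharacters stay live.
my @tricky = ('a.b');
my $tricky_re = do {
    local $" = '|';
    qr/\A(?:@tricky)\z/;
};

print "axb: ", ('axb' =~ $tricky_re ? 'yes' : 'no'), "\n";
```

Wrapping the pattern in a do block keeps the change to $" local to the construction of that one regex.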

MJD has demonstrated implementing (?@foo) as (?\@foo) by means of an overload of regexes. This slight change was necessary because of the expansion of @foo - see below.

Both of these changes are currently affected by the expansion of variables in the regex before the regex compiler gets to work on the regex. This problem also affects several other RFCs. For these (and other RFCs), the expansion of variables in regexes needs to be driven from within the regex compiler, so that the regex can expand them as and where appropriate. Changing this should not affect any existing behaviour.

REFERENCES

RFC 198: Boolean Regexes