[% setvar title Single quotes don't interpolate \' and \\ %]

This file is part of the Perl 6 Archive

Note: these documents may be out of date. Do not use as reference!

To see what is currently happening visit http://www.perl6.org/

TITLE

Single quotes don't interpolate \' and \\

VERSION

  Maintainer: Nicholas Clark <nick@talking.bollo.cx>
  Date: 28 Sep 2000
  Last Updated: 30 Sep 2000
  Mailing List: perl6-language@perl.org
  Number: 328
  Version: 3
  Status: Frozen

CHANGES

Reissued on perl6-language@perl.org - I goofed the list.

Clarified the description slightly; by single quoted string I mean '' and q()

Updated discussion section

Frozen not withdrawn (see discussion section)

DISCUSSION

I'm in two minds as to whether to freeze or retract this RFC.

Reaction was strongly polarised: three strongly against and one strongly for. People valued being able to use single quotes to make strings containing single quotes easily. Michael Fowler wrote:

    Whew.  Disallowing escapes in a single-quote string does not make easy
    things easier and hard things possible.

    I'm not arguing that we should keep it simply because people are used
    to it, but instead we should keep it because it's useful.

My view was that the majority are against the change, but that those views came from existing perl users [who do you expect as the majority on perl6 lists? :-)]. The change would penalise existing perl users, but benefit new perl users (and presumably people teaching perl).

However, I'm wrong on that. Hildo Biersma states:

   Now, I have been teaching perl for a number of years, and nobody's ever
   had trouble with understanding how single quotes and the two escapes
   work.  Plenty of people find double-quotes either too powerful or too 
   limited (see the various RFCs), but I think single quotes are fine

However, there was no comment on the secondary issue of how single quotes treat unrecognised escapes. To me the following seems wrong:

   $  perl -lwe "print q(Quoted \( \\ \) Not \' \t \z)"
   Quoted ( \ ) Not \' \t \z

And from the archives:

   Does it strike anyone else as odd that 'foo\\bar' eq 'foo\bar'?

(Steve Fink, www.mail-archive.com)

I conclude that escaping as it stands should remain, but that it would be possible to change the unrecognised escape behaviour (say \z maps to z, as in a "" string) without causing pain, if this change were deemed sensible. My view is that this second point, the treatment of unrecognised escapes, should be considered, hence I freeze rather than withdraw the RFC.

ABSTRACT

Remove all interpolation within single quotes and the q() operator, to make single quotes 100% shell-like. \ rather than \\ gives a single backslash; use double quotes or q() if you need a single quote in your string.

DESCRIPTION

Camel III (page 7) says "Double quotation marks (double quotes) do variable interpolation and backslash interpolation while single quotes suppress interpolation." Page 60 qualifies this with "except for \' and \\".

In perl single quotes are used to generate strings. Double quotes also generate strings.

In C single quotes are used to make character constants. Double quotes are used to make string constants. Backslash interpretation is performed in single quotes in C. While multi-character constants are allowed by C, they are strongly discouraged as they are non-portable, and a character constant in C is a type distinct from a string constant. Hence double quotes and single quotes signify different things.

In shell, single quotes are used to make strings. Double quotes also make strings. Within single quotes backslashes are ordinary characters, and do not quote anything. As one can't quote a ' with a \ there is no way to interpolate a single quote within a single quoted string, but a workaround such as 'don'\''t' relying on the concatenation of 'don' \' and 't' achieves the desired results.

Hence perl's single quoted strings are analogous to shell's single quoted strings, not C's. However, they're not identical, as perl allows \\ to mean an embedded \, \' to mean an embedded '.

This RFC argues that the exception is confusing and proposes to remove it. This makes perl more regular in shell terms, and slightly easier to learn for the shell programmer.

It also makes perl internally simpler and more regular. Currently the behaviour for q() strings is that \( \) and \\ map to 1 character, while \? for all other ? maps to 2 characters. qq() differs, as \? maps to 1 character both when ? is recognised as a backslash escape and when it is unrecognised. A further irregularity is that currently single quoted here docs don't interpolate \\ or \'. The consequence of this is that currently

	'foo\\bar' eq 'foo\bar'

which sure looks odd.
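
The irregularities listed above can be checked with a short script run under Perl 5. This is only an illustrative sketch; the variable names are made up for the example and are not part of the RFC.

```perl
use strict;
use warnings;

# In q() only the delimiters and the backslash itself can be escaped;
# any other backslash sequence is kept as two characters.
my $paren     = q(\();    # one character: (
my $backslash = q(\\);    # one character: a single backslash
my $unknown   = q(\z);    # two characters: backslash, z

# qq() drops the backslash of an unrecognised escape. Perl warns about
# this under "use warnings", so the warning is silenced for this line.
my $unknown_qq = do { no warnings 'misc'; qq(\z) };

# A single quoted here doc leaves backslashes completely alone.
my $heredoc = <<'EOT';
\\
EOT
chomp $heredoc;

printf "%-22s %d\n", @$_ for (
    [ 'escaped paren in q',   length($paren)      ],
    [ 'backslash pair in q',  length($backslash)  ],
    [ 'unknown escape in q',  length($unknown)    ],
    [ 'unknown escape in qq', length($unknown_qq) ],
    [ 'backslash pair in <<', length($heredoc)    ],
);

print "single quoted strings compare equal\n" if 'foo\\bar' eq 'foo\bar';
```

The printed lengths show which constructs collapse a backslash sequence and which keep it.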

With this RFC it is proposed that in a single quoted string and the q() operator \ is not special. Hence \? always maps to 2 characters (\ then ?) unless ? is the closing terminator, in which case the string terminates with that \ . Single quoted strings behave like single quoted here docs, and like shell single quoted strings.

You don't lose any functionality, as

   'don\'t implement this RFC, the benefits don\'t outweigh the confusion'

can still be written

   q(don't implement this RFC, the benefits don't outweigh the confusion)

which is actually less typing.
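
As a quick check under Perl 5, the spellings below all denote the same string. Only the first relies on the \' escape that this RFC would remove; the variable names are illustrative.

```perl
use strict;
use warnings;

my $escaped = 'don\'t';            # the Perl 5 escape this RFC would remove
my $q_op    = q(don't);            # q() with a different delimiter
my $double  = "don't";             # double quotes (nothing here to interpolate)
my $joined  = 'don' . "'" . 't';   # concatenation, in the spirit of the shell idiom

print "all four spellings agree\n"
    if $escaped eq $q_op && $q_op eq $double && $double eq $joined;
```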

IMPLEMENTATION

Modify the tokeniser/lexer not to treat \ as special, hence the first end delimiter ends the string.
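
The difference between the two rules can be modelled in a few lines of Perl. This is a sketch of the rules only, not of the code in toke.c: scan_current and scan_proposed are hypothetical names, both take the source text that follows an opening ' and both handle only the ' delimiter.

```perl
use strict;
use warnings;

# Current Perl 5 rule: \\ and \' each yield one character, any other
# backslash is kept. Returns the string value, or nothing if unterminated.
sub scan_current {
    my ($src) = @_;
    my $value = '';
    while ($src =~ /\G(?:\\([\\'])|(')|(.))/gcs) {
        if    (defined $1) { $value .= $1 }      # escaped backslash or quote
        elsif (defined $2) { return $value }     # closing quote
        else               { $value .= $3 }      # ordinary character
    }
    return;
}

# Proposed rule: backslash is not special, so the first quote ends the string.
sub scan_proposed {
    my ($src) = @_;
    my $end = index($src, "'");
    return $end < 0 ? () : substr($src, 0, $end);
}

# Raw source text, held in a single quoted here doc so that nothing in it
# is altered by this script's own quoting.
my @samples = split /\n/, <<'EOT';
foo\\bar' . $rest
don\'t' . $rest
EOT

for my $raw (@samples) {
    printf "%-20s current: %-10s proposed: %s\n",
        $raw, scan_current($raw), scan_proposed($raw);
}
```

Under the proposed rule the second sample ends at the first quote, so its value ends with the backslash, as the description says.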

For 5.7's toke.c this doesn't appear that simple. It looks like modifications would be needed to S_tokeq, scan_str and Perl_yylex (for a quoted string at the start of curlies). There are probably more; the code that makes single quoted strings interpolate \' and \\ appears to be deeply ingrained into the core.

The perl5 to perl6 converter would need to convert single quoted strings and q() operators containing \' to the shortest (clearest?) equivalent of:

Single quoted strings containing \\ but no quoting of delimiters would need to have \\ converted to \
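
One possible shape for that conversion step is sketched below. Everything in it is an assumption made for illustration: perl5_value and requote are hypothetical helpers, the order in which alternatives are tried is arbitrary, and q() is only used when neither bracket character occurs in the string, because this RFC does not say how nested delimiters would be treated.

```perl
use strict;
use warnings;

# Hypothetical helper: the value denoted by the body of a Perl 5 '...'
# literal, where only \\ and \' are escapes.
sub perl5_value {
    my ($body) = @_;
    (my $value = $body) =~ s/\\([\\'])/$1/g;
    return $value;
}

# Hypothetical helper: source text for $value under the proposed rules,
# where backslash is never special inside '' or q().
sub requote {
    my ($value) = @_;

    # No quote inside: plain single quotes work, backslashes stay literal.
    return "'" . $value . "'" if index($value, "'") < 0;

    # Otherwise try q() with a bracket pair that does not occur in the value.
    for my $pair ('()', '[]', '{}', '<>') {
        my ($open, $close) = split //, $pair;
        next if index($value, $open) >= 0 || index($value, $close) >= 0;
        return 'q' . $open . $value . $close;
    }

    # Last resort: concatenate the pieces around each quote.
    my @pieces;
    push @pieces, "'" . $_ . "'" for split /'/, $value, -1;
    return join q( . "'" . ), @pieces;
}

# Bodies of Perl 5 single quoted literals, kept raw in a here doc.
my $bodies = <<'EOT';
don\'t
foo\\bar
EOT

for my $body (split /\n/, $bodies) {
    printf "'%s' -> %s\n", $body, requote(perl5_value($body));
}
```

Working from the decoded value rather than the source text means \\ is converted to \ as a side effect of requoting.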

REFERENCES

Camel III - Programming Perl (3rd Edition)

perlop manpage for interpolation

RFC 226: Selective interpolation in single quotish context.

I believe that Larry Wall once made a comment about \' and \\ in single quoted strings being a mistake, but I can't find any reference. The idea certainly isn't mine, but I feel it worthy of consideration, even if considered opinion is that any gain is too small to outweigh the upheaval.