
This file is part of the Perl 6 Archive

Note: these documents may be out of date. Do not use as reference!

To see what is currently happening visit http://www.perl6.org/

TITLE

Organization and Rationalization of Perl State Variables

VERSION

  Maintainer: Steve Simmons <scs@ans.net>
  Date: 3 Aug 2000
  Mailing List: perl6-language@perl.org
  Number: 17
  Version: 2
  Status: Developing

ABSTRACT

Perl currently contains a large number of state and status variables with names that resemble line noise (henceforth called $[linenoise] variables). Some of these are (or should be) deprecated, and the naming methods for the rest are obscure. Since we are (potentially) adding, removing, and changing the functionality of these variables with Perl6, we should seize the opportunity to rationalize the names and organization of these variables as well. These variables need to be made available with mnemonic names, categorized by module, and capable of introspective use, with proper preparation made for deprecation and translation.

DESCRIPTION

Perl allows for the runtime examination and setting of a number of variables. These existing $[linenoise] variables have horrible names and need cleanup, re-organization, and syntactic sugar. Different variables have different problems, usually one or more of the following:

In the pre-RFC discussion of this issue, it was also pointed out that these variables are hard to deprecate without nagging the crap out of users running the programs. The proposed solution was broadly applicable, and has been spun off into RFC 3.

The use of the English module is an attempt to solve the anti-mnemonic features of these variables. A better solution is to do it right in the first place, with a number of attendant wins:

In addition, many features which are now (re)set by other calls should set appropriate state variables as well. Thus a perl script which contains:

    use strict foo;

might set a var $PERL_STRICT{foo}, and so forth (this is probably a poor example).

Credit where it is due: the idea of putting related values together into an appropriately tagged hash is shamelessly ripped off from common tcl usage.

Caveat - Global State Variables Are Dangerous

Having the setting of simple variables modify the function of a broad set of things is an inherently dangerous and inelegant way of doing things. Mark-Jason Dominus <mjd@plover.com> has proposed that they simply be removed whenever possible. I tend to agree with his argument, and join in urging that the problems he addresses in message 20000805171023.24408.qmail@plover.com be addressed by the core teams working in the areas that use such variables. The thread started by Mark-Jason in perl6-language has a good discussion of the issues.

Notwithstanding, should the core team continue to allow global variables for some purposes, the names and categorization should be improved.

Advantages and Non-Losses

Clean Backwards Compatibility

To promote backward compatibility, one could write a use antiEnglish module which would alias the old names to the new ones. That Would Be Wrong, but someone will probably do it - so it may as well be us. Obviously we cannot provide backwards compatibility for a variable whose meaning has changed or which has vanished, but most of the rest can be captured cleanly.

Promoting Removable Core Modules

It has been strongly proposed that `core perl' be broken down internally into a number of modules such that one could build smaller or larger custom perls. This feature would ease that work, and possibly set a standard for how well-behaved non-core modules should implement such things.

Provide Possible Guidelines To Core-able Modules

The discussion of removable core modules has strongly implied (sometimes explicitly stated) that sites could take modules which are not currently in the core and move them there. Having a standard which those modules could follow for variable setting and exposure would be a major win.

Disadvantages

Backwards Compatibility

Existing perl code which makes use of $[linenoise] variables which have not changed meaning in perl6 will fail if (as proposed) the variables are completely replaced by the names proposed here. This is, at best, a minor objection because

Increased Typing Considered Painful

A number of people have expressed great joy at the brevity of the current names.

Loss of Distinctiveness

Currently when one sees a $[linenoise] variable, they notice its specialness and have a visual clue to take care. $[linenoise] stands out. Since some of these variables have wide-reaching side-effects, any solution should try to maintain that distinctiveness.

Other Possible Features

It has been suggested by Alan Burlison <Alan.Burlison@uk.sun.com> that this allow for localizable settings. Thus module A might turn warnings off when its features are in use, while module B is unaffected by the setting done by module A. This is a change in the functionality of such settings, and has the potential for broadly changing what happens in perl. I suggest that this issue is actually independent of how the variables are named, and should be taken to another RFC if anyone is interested.

IMPLEMENTATION

Internal Implementation

The internal representation of these is largely irrelevant(!). This RFC prescribes what the external interface looks like; the implementation team should select whatever mechanism they prefer for internal use. It's probably a maintenance win if the same mechanism is used internally and externally, but the proposers don't intend to dictate to the developers.

External Implementation - Proposals

Overall

Variables should be sorted into functional areas which may or may not have a one-to-one correspondence with internal (core) modules. There have been four suggestions for how the variables should be named.

In the examples below, I have capitalized the higher-level portions of the names. This capitalization is not a requirement of the RFC, and is done purely to make the new names stand out in this document.

Well-Named Global Hashes And Keys

For each collection of variables, a well-named pseudo-hash with well-named keys:

  $PERL_CORE{warnings}          vs      $^W
  $PERL_CORE{version}           vs      $^V
  $PERL_FORMATS{name}           vs      $^
  $PERL_FORMATS{lines_left}     vs      $-
  $PSEUDO_HASHES{strict}        vs      (none)

An additional variable should be provided,

  $PERL_CORE{variables}

which contains a list of all the settable hash names, e.g.:

  $PERL_CORE{variables} = [ qw(PERL_CORE PERL_FORMATS PSEUDO_HASHES ...) ]

Advantages: Only a single point in the namespace is used for each variable. The hashes can be handed around en masse efficiently via references. Pseudo-hash member access is efficient. Names are free from module dependency. Introspection with hashes is more powerful (see below).

Disadvantages: Pseudo-hashes are new to most programmers.

Well-Named Module Variable Sets

This has an almost one-to-one naming match to the first suggestion, but uses module naming:

  $PERL::CORE::Warnings         vs      $^W
  $PERL::CORE::Version          vs      $^V
  $PERL::FORMATS::Name          vs      $^
  $PERL::FORMATS::LinesLeft     vs      $-
  $PSEUDOHASH::CONTROL::Strict       vs      (none)

Advantages: Looks like individual variables again. Modules could be moved (compiled) in and out of the perl core and, so long as a script did the appropriate use/require statement, the script would require no other changes.

Disadvantages: Locks variables into given modules. The second-level names may become difficult to do sensibly.

Well-Named Per-Module Hashes And Keys

This is a hybrid of the first two:

  $PERL::CORE{warnings}          vs      $^W
  $PERL::CORE{version}           vs      $^V
  $PERL::FORMATS{name}           vs      $^
  $PERL::FORMATS{lines_left}     vs      $-
  $PSEUDOHASH::CONTROL{Strict}   vs      (none)

Advantages: Looks like individual variables again. Modules could be moved (compiled) in and out of the perl core and, so long as a script did the appropriate use/require statement, the script would require no other changes. Introspection and variable discovery via keys %PERL::CORE is still a win.

Disadvantages: Locks variables into given modules. The second-level names may become difficult to do sensibly. Introspection becomes more difficult because one must find the second-level name(s) for each class of item (eg, PERL::CORE and PERL::FORMATS). Sometimes there should not be More Than One Way To Do It - a single mechanism makes it easier to notice, find, use, and understand such variables.

Introspection With Hashes

The use of hashes and pseudo-hashes leads to a straightforward and `natural' mechanism by which programmers can discover all the relevant variables for a given item:

   foreach my $key ( keys %PERL_CORE ) {
      # iterate over the keys only; the value is looked up per key
      print "\$PERL_CORE{$key} is $PERL_CORE{$key}\n";
   }
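The same loop can be extended to every state hash at once. The following is a minimal sketch, assuming that $PERL_CORE{variables} holds a reference to an array of hash names and that the hashes are ordinary package variables in main; the sample values and the helper name all_state_names are placeholders invented for illustration, not part of the proposal.

```perl
use strict;
use warnings;

# Stand-in state hashes; under the proposal perl itself would supply these.
# All values here are placeholders.
our %PERL_CORE = (
    warnings  => 0,
    variables => [ qw(PERL_CORE PERL_FORMATS) ],
);
our %PERL_FORMATS = ( name => 'STDOUT_TOP', lines_left => 60 );

# Hypothetical helper: return a "HASH{key}" string for every state
# variable reachable from $PERL_CORE{variables}.
sub all_state_names {
    my @names;
    for my $hash_name ( @{ $PERL_CORE{variables} } ) {
        no strict 'refs';                      # look the hash up by name
        my $hash = \%{"main::$hash_name"};
        push @names, map { sprintf '%s{%s}', $hash_name, $_ } sort keys %$hash;
    }
    return @names;
}

print "$_\n" for all_state_names();
```

Because each hash is reached through a reference, the same code works unchanged if a module adds a new hash name to the list.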

Summary

The RFC maintainer thinks the first suggestion, Well-named Global Hashes and Keys, is the best choice. While it has potential problems for module removal should a hash be shared between several modules, these are (IMHO) worth the flexibility of not locking a given hash to a given module and providing a single, consistent mechanism programmers would use to obtain value settings.

The community has not (as of version 1.1) expressed a strong preference in either direction.

Value Protection

A value can be set and reset, but its existence should be protected. Thus one could set the variables, but not undefine or delete them. If hashes are chosen, the user should not be allowed to add or remove key/value pairs from the hash.

Further, values such as $PERL_CORE{version} should be read-only.
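One way to picture these rules is a tied hash that vets every modification. This is only a sketch under stated assumptions: the class name ProtectedState, the choice of which keys are read-only, and the initial values are all invented here, and the real mechanism is left to the implementation team.

```perl
use strict;
use warnings;

package ProtectedState;    # hypothetical class name

# Keys that may be read but never assigned (an assumption of this sketch).
my %READ_ONLY = ( version => 1 );

sub TIEHASH {
    my ($class, %initial) = @_;
    return bless { %initial }, $class;
}

sub FETCH  { my ($self, $key) = @_; return $self->{$key} }
sub EXISTS { my ($self, $key) = @_; return exists $self->{$key} }

# Assignment is allowed only to existing, writable keys, and never undef.
sub STORE {
    my ($self, $key, $value) = @_;
    die "no such state variable '$key'\n"             unless exists $self->{$key};
    die "state variable '$key' is read-only\n"        if $READ_ONLY{$key};
    die "state variable '$key' cannot be undefined\n" unless defined $value;
    $self->{$key} = $value;
}

# Removing one key or emptying the whole hash is always refused.
sub DELETE { die "state variables cannot be deleted\n" }
sub CLEAR  { die "state variables cannot be deleted\n" }

sub FIRSTKEY { my ($self) = @_; my $reset = keys %$self; return each %$self }
sub NEXTKEY  { my ($self) = @_; return each %$self }

package main;

tie my %PERL_CORE, 'ProtectedState', warnings => 0, version => 'placeholder';

$PERL_CORE{warnings} = 1;                  # setting an existing value is fine
eval { delete $PERL_CORE{warnings} };      # removing it is not
print "refused: $@" if $@;
```

Reads and key enumeration behave like a normal hash, so the introspection loop shown earlier would still work against such a hash.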

Backwards Compatibility Issues and Suggestions

Backwards compatibility is an issue in two ways:

  • Moving old scripts to perl6
  • Moving old coders to perl6

There are two solution sets for this, both of which should be implemented as part of perl6:

    Variable Translation

    The proposed script translator should catch all uses of pre-perl6 $[linenoise] variables and translate them to the new names according to the guidelines below.

    Variable Remapping

    An antiEnglish module could be provided. Like the current English module, this would attempt to map the new names into the old $[linenoise] variables for those programmers absolutely addicted to the old names. It should handle the variables according to the guidelines below.
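As a rough sketch of how such a mapping could work, a short name can be made an alias for a hash element by assigning a scalar reference to a glob, similar in spirit to the glob assignments the English module uses. The names $old_name and $old_lines_left below are stand-ins for the real punctuation variables, and the hash contents are placeholders; none of this is prescribed by the RFC.

```perl
use strict;
use warnings;

# Stand-in for the proposed state hash; the values are placeholders.
our %PERL_FORMATS = ( name => 'STDOUT_TOP', lines_left => 60 );

# Hypothetical short names standing in for the old $[linenoise] names.
our ($old_name, $old_lines_left);

# Assigning a scalar reference to a glob makes the short name an alias
# for the very same scalar that lives in the hash.
*old_name       = \$PERL_FORMATS{name};
*old_lines_left = \$PERL_FORMATS{lines_left};

$old_lines_left = 42;                      # write through the old name
print "lines_left is now $PERL_FORMATS{lines_left}\n";

$PERL_FORMATS{name} = 'REPORT_TOP';        # write through the new name
print "old name sees $old_name\n";
```

The alias stays valid only while the hash element itself is never deleted or replaced wholesale, which fits the requirement under Value Protection that keys cannot be removed.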

    Permanent Aliasing?

    Several comments have been made (including one claimed to be second-hand from Larry Wall) that they just don't like typing long variable names. If this is a sufficiently strong problem, the contributors to this RFC would have no objection to having both variable namesets present simultaneously.

    This would have the advantage that the variables would still be reorganized and grouped, and some additional introspection possible, but the folks who are really married to the ultra-short names would still have them available with no typing overhead.

    Guidelines for variable translation and remapping

    The perl6 conversion script should silently translate old variable names to the new names when the meaning of the variable is essentially unchanged.

    The antiEnglish module should provide the old name as an alias or map to the new name.

    Variables with changed meaning

    The perl6 conversion script should translate old variable names to the new names when there seems to be a reasonable but not perfect mapping between the meanings of variables. A one-time warning should be issued at compile time for each use of such variables, similar to the manner in which the current perl -w warns of first use of undeclared variables. A comment should be inserted into the code above the use of the variable saying something like:

        # Comment inserted by PERL6_trans on <date>
        # Variable <foo> does not exist in perl6.  We have translated
        # it to <bar> below, but <foo> and <bar> are not completely
        # equivalent.  Please check your usage and make the appropriate
        # changes.

    In the best of all worlds, the comment might give hints as to what to think about when changing that var.

    The antiEnglish module should provide no mapping for these variables.
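The two translation rules above can be sketched as a line-by-line rewrite. This is a deliberately naive illustration that uses plain text substitution rather than real parsing. The function name translate_line and both mapping tables are hypothetical, and placing $^V in the "changed meaning" table is purely for demonstration; the RFC does not say which variables fall into which category.

```perl
use strict;
use warnings;

# Hypothetical mapping tables; their contents are illustrative only.
my %UNCHANGED = ( '$^W' => '$PERL_CORE{warnings}' );
my %CHANGED   = ( '$^V' => '$PERL_CORE{version}' );

# Translate one line of source. Returns the new text, preceded by an
# explanatory comment block if an imperfect mapping was applied.
sub translate_line {
    my ($line, $date) = @_;
    my $comment = '';

    # Meaning unchanged: rename silently.
    for my $old (sort keys %UNCHANGED) {
        $line =~ s/\Q$old\E/$UNCHANGED{$old}/g;
    }

    # Meaning changed: rename and insert a comment above the line.
    for my $old (sort keys %CHANGED) {
        my $new = $CHANGED{$old};
        next unless $line =~ s/\Q$old\E/$new/g;
        $comment .= "# Comment inserted by PERL6_trans on $date\n"
                  . "# Variable $old does not exist in perl6.  We have translated\n"
                  . "# it to $new below, but $old and $new are not completely\n"
                  . "# equivalent.  Please check your usage and make the appropriate\n"
                  . "# changes.\n";
    }
    return $comment . $line;
}

print translate_line("\$^W = 1;\n",   '3 Aug 2000');
print translate_line("print \$^V;\n", '3 Aug 2000');
```

A real translator would need to parse the source, since a textual match cannot tell a variable from the same characters inside a string or regex.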

    Variables which have been removed

    The translator should leave variables which have been removed in place, but a well-tagged warn statement should be placed above each use.

    The antiEnglish module should provide no mapping for these variables.

    New variables

    Should the translator notice use of an existing variable with the same name as a new variable, it should ?print a warning and stop? ?print a warning and go on? ?automatically rename the prior var to l_? TBD.

    The antiEnglish module should provide no mapping for these variables.

    Deprecation Warnings In Usage

    Programmers using hand-translated scripts or who attempt to run untranslated scripts with perl6 should get compile-time warnings and errors, depending on severity of mis-use.

    At a minimum, use strict and perl -w should warn of the use of deprecated variable names. That deprecation warning should be specific, not simply "var $foo is deprecated". Messages should identify the old var, new var, and at least imply an action. Some possible samples include:

      changing var $foo no longer has any effect, see <doc module>
    
      the value of var $bar no longer has any meaning, see <doc module>
    
      var $baz has been replaced by $FOO_BAR{baz}

    and so forth.

    In addition, it should be possible to have the programmer detect and control these messages on a case by case basis. RFC 3 proposes a mechanism by which programmers could take more direct control of this; should that RFC be accepted then that mechanism should be used. Should it fail, we will make a more specific recommendation here.
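For illustration, a specific message of the kind listed above can be produced at run time with a tied scalar that forwards to the new variable and warns on first use. This is only a sketch: the RFC asks for compile-time warnings, which a tie cannot provide, and the class name DeprecatedVar and its parameters are invented here.

```perl
use strict;
use warnings;

package DeprecatedVar;    # hypothetical helper class

sub TIESCALAR {
    my ($class, %arg) = @_;           # expects old, new, target
    return bless { %arg, warned => 0 }, $class;
}

# Emit the specific message once, naming both the old and the new variable.
sub _nag {
    my ($self) = @_;
    return if $self->{warned}++;
    warn "var $self->{old} has been replaced by $self->{new}\n";
}

# Reads and writes are forwarded to the scalar the new name refers to.
sub FETCH { my ($self) = @_;         $self->_nag; return ${ $self->{target} } }
sub STORE { my ($self, $value) = @_; $self->_nag; ${ $self->{target} } = $value }

package main;

our %FOO_BAR = ( baz => 1 );          # stand-in for a new state hash
our $baz;                             # stand-in for a deprecated name
tie $baz, 'DeprecatedVar',
    old => '$baz', new => '$FOO_BAR{baz}', target => \$FOO_BAR{baz};

# Collect warnings so they can be inspected instead of going to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

$baz = 7;                             # first use: warns, and sets $FOO_BAR{baz}
my $copy = $baz;                      # later uses stay quiet

print @warnings;
```

Installing a handler in $SIG{__WARN__}, as done here, is one existing way a programmer can detect such messages case by case; the mechanism proposed in RFC 3 would presumably replace it.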

    CONTRIBUTIONS

    Contributions ranging from voluminous commentary to pretty decent wisecracks have been received from:

        Ted Ashton <ashted@southern.edu>
        "J. David Blackstone" <jdavidb@dfw.net>
        Corwin Brust <cbrust@alldata.net>
        Alan Burlison <Alan.Burlison@uk.sun.com>
        Piers Cawley <pdcawley@bofh.org.uk>
        Tom Christiansen <tchrist@chthon.perl.com>
        Mark-Jason Dominus <mjd@plover.com>
        Ken Fox <kfox@vulpes.com>
        Chaim Frenkel <chaimf@pobox.com>
        Jarkko Hietaniemi <jhi@iki.fi>
        Tim Jenness <timj@jach.hawaii.edu>
        Bart Lateur <bart.lateur@skynet.be>
        Peter Scott <Peter@PSDT.com>
        William Setzer <William_Setzer@ncsu.edu>
        Dan Sugalski <dan@sidhe.org>
        "Bryan C. Warnock" <bwarnock@gtemail.net>
        Nathan Wiger <nate@wiger.org>

    and others who I probably missed. In addition, it's been pointed out that this suggestion was raised in the past; unfortunately the fingerprints have long since worn away.

    REFERENCES

    RFC 3 on Run-Time Error Message Control