[% setvar title Organization and Rationalization of Perl State Variables %]
Note: these documents may be out of date. Do not use as reference! |
To see what is currently happening visit http://www.perl6.org/
Organization and Rationalization of Perl State Variables
Maintainer: Steve Simmons <scs@ans.net> Date: 3 Aug 2000 Mailing List: perl6-language@perl.org Number: 17 Version: 2 Status: Developing
Perl currently contains a large number of state and status variables
with names that resemble line noise (henceforth called $[linenoise]
variables).
Some of these are (or should be) deprecated, and the naming methods for
the rest are obscure. Since we are (potentially) adding, removing, and
changing the functionality of these variables with Perl6, we should seize the
opportunity to rationalize the names and organization of these variables
as well. These variables need to be made available with mnemonic names,
categorized by module, be capable of introspective use, and proper
preparation made for deprecation and translation.
Perl
allows for the runtime examination and setting of a number of
variables. These existing $[linenoise]
variables are horrible names and
need cleanup, re-organization, and syntactic sugar. Different variables
have different problems, usually one or more of the following:
In the pre-RFC discussion of this issue, it was also pointed out that these variables are hard to deprecate without nagging the crap out of users running the programs. The proposed solution was broadly applicable, and has been spun off into RFC 3.
The use of the English
module is an attempt to solve the anti-mnemonic
features of these variables. A better solution is to do it right in
the first place, with a number of attendant wins:
$[linenoise]
variable proliferationIn addition, many features which are now (re)set by other calls should
set appropriate state variables as well. Thus a perl
script which
contains:
use strict foo;
might set a var $PERL_STRICT{foo}
, and so forth (this is probably
a poor example).
Credit where it is due: the idea of putting related values together into
an appropriately tagged hash is shamelessly ripped off from common tcl
usage.
Having the setting of simple variables modify the function of a
broad set of things is an inherently dangerous and inelegant way of
doing things. Mark-Jason Dominus <mjd@plover.com> has proposed that
they simply be removed whenever possible. I tend to agree with his
arguement, and join in urging that the problems he addresses in message
20000805171023.24408.qmail@plover.com
be addressed by the core teams
working in the areas that use such variables. The thread started my
Mark-Jason in perl6-language
has a good discussion of the issues.
Notwithstanding, should the core team continue to allow global variables for some purposes, the names and categorization should be improved.
To promote backward compatibility, one could write a use antiEnglish
module which would alias the old names to the new ones. That Would Be
Wrong, but someone will probably do it - so it may as well be us.
Obviously we cannot provide backwards compatibility for a variable
whose meaning has changed or which has vanished, but most of the
rest can be captured cleanly.
It has been strongly proposed that `core perl
' be broken down internally
into a number of modules such that one could build smaller or larger
custom perls. This feature would ease that work, and possibly set a
standard for how well-behaved non-core modules should implement such
things.
The discussion of removable core modules has strongly implied (sometimes explicitly stated) that sites could take modules which are not currently in the core and move them there. Having a standard which those modules could follow for variable setting and exposure would be a major win.
Existing perl
code which makes use of $[linenoise]
variables
which had not changed meaning in perl6
will fail if (as proposed)
the variables are completely replaced by the names proposed here.
This is, at best, a minor objection because
A number of people have expressed great joy at the brevity of the current names.
Currently when one sees a $[linenoise]
variable,
they notice its specialness and have a visual clue to take care.
$[linenoise]
stands out.
Since some of these variables have wide-reaching side-effects,
any solution should try to maintain that distinctiveness.
perl5
to perl6
translation tool should catch this
and do the appropriate renaming.It has been suggested by Alan Burlison <Alan.Burlison@uk.sun.com>
that this allow for localizable settings. Thus module A might
turns warnings off when its features are in use, while module
B is unaffected by the setting done by module A. This is a change
in the functionality of such settings, and has the potential for
broadly changing what happens in perl
. I suggest that this
issue is actually independent of how the variables are named,
and should be taken to another RFC if anyone is interested.
The internal representation of these is largely irrelevant(!). This RFC prescribes what the external interface looks like; the implementation team should select whatever mechanism they prefer for internal use. It's probably a maintenance win if the same is used, but the proposers don't intend to dictate to the developers.
Variables should be sorted into functional areas which may or may not have a one-to-one correspondence with internal (core) There have been four suggestions for how the variables should be named.
In the examples below, I have capitalized the higher-level portions of the names. This capitalization is not a requirement of the RFC, and is done purely to make the new names stand out in this document.
For each collection of variables, a well-named pseudo-hash with well-named keys:
$PERL_CORE{warnings} vs $^W $PERL_CORE{version} vs $^V $PERL_FORMATS{name} vs $^ $PERL_FORMATS{lines_left} vs $- $PSEUDO_HASHES{strict} vs (none)
An additional variable should be provided,
$PERL_CORE{variables}
which contains a list of all the settable hash names (eg,
$PERL_CORE{variables} = qw(PERL_CORE PERL_FORMATS PSEUDO_HASHES ...)
Advantages: Only a single point in the namespace is used for each variable. The hashes can be handed around en masse efficiently via references. Pseudo-hash member access is efficient. Names are free from module dependency. Introspection with hashes is more powerful (see below).
Disadvantages: Pseudo-hashes are new to most programmers.
This has an almost one-to-one naming match to the first suggestion, but used module naming:
$PERL::CORE::Warnings vs $^W $PERL::CORE::Version vs $^V $PERL::FORMATS::Name vs $^ $PERL::FORMATS::LinesLeft vs $- $PSEUDOHASH::CONTROL::Strict vs (none)
Advantages: Looks like individual variables again. Modules could
be moved (compiled) in and out of the perl core and, so long as
a script did the appropriate use/require
statement, the script
would require no other changes.
Disadvantages: Locks variables into given modules. The second-level names may become difficult to do sensibly.
This is a hybrid of the first two:
$PERL::CORE{warnings} vs $^W $PERL::CORE{version} vs $^V $PERL::FORMATS{name} vs $^ $PERL::FORMATS{lines_left} vs $- $PSEUDOHASH::CONTROL{Strict} (none)
Advantages: Looks like individual variables again. Modules could
be moved (compiled) in and out of the perl core and, so long as
a script did the appropriate use/require
statement, the script
would require no other changes. Introspection and variable
discovery via keys %PERL::CORE
is still a win.
Disadvantages: Locks variables into given modules. The second-level names may become difficult to do sensibly. Introspection becomes more difficult because one must find the second-level name(s) for each class of item (eg, PERL::CORE and PERL::FORMATS). Sometimes there should not be More Than One Way To Do It - a single mechanism makes it easier to notice, find, use, and understand such var.
The use of hashes and pseudo-hashes leads to a straightforward and `natural' mechanism by which programmers can discover all the relevant variables for a given item:
foreach my $key ( %PERL_CORE ) { print "\%PERL_CORE{$key} is $PERL_CORE{$key}\n"; }
The RFC maintainer thinks the first suggestion, Well-named Global Hashes and Keys, is the best choice. While it has potential problems for module removal should a hash be shared between several modules, these are (IMHO) worth the flexibility of not locking a given hash to a given module and providing a single, consistent mechanism programmers would use to obtain value settings.
The community has not (as of version 1.1) expressed a strong preference in either direction.
A value can be set and reset, but it's existence should be protected. Thus one could set the variables, but not undefine or delete them. If hashes are chosen, the user should not be allowed to add or remove key/value pairs from the hash.
Further, values such as $PERL_CORE{version}
should be read-only.
Backwards compatibility is an issue in two ways:
=over4
perl6
perl6
There are two solution sets for this, both of which should be implemented
as part of perl6
The proposed script translator should catch all uses of pre-perl6
$[linenoise]
variables and translate them to the new names according
to the guidelines below.
An antiEnglish
module could be provided.
Like the current English
module, this would attempt to map the new names
into the old $[linenoise]
variables for those programmers absolutely
addicted to the old names.
It should handle the variables according to the guidelines below.
Several comments have been made (including one claimed to be second-hand from Larry Wall) that they just don't like typing long variable names. If this is a sufficently strong problem, the contributors to this RFC would have no objection to having both variable namesets present simultaneously.
This would have the advantage that the variables would still be reorganized and grouped, and some additional introspection possible, but the folks who are really married to the ultra-short names would still have them available with no typing overhead.
The perl6
conversion script should silently translate old
variable names to the new names when the meaning of the variable
is essentially unchanged.
The antiEnglish
module should provide the old name as an
alias or map to the new name.
The perl6
conversion script should translate old
variable names to the new names when there seems to be a reasonable
but not perfect mapping between the meanings of variables.
A one-time warning should be issued at compile time for each
use of such variables, similar to the manner in which the current
perl -w
warns of first use of undeclared variables.
A comment should be inserted into the code above the use of the
variable saying something like:
# Comment inserted by PERL6_trans on <date> # Variable <foo> does not exist in perl6. We have translated # it to <bar> below, but <foo> and <bar> are not completely # equivalent. Please check your usage and make the appropriate # changes.
In the best of all worlds, the comment might give hints as to what to think about when changing that var.
The antiEnglish
module should provide no mapping for these
variables.
The translator should leave removed variables which have been removed
in place, but a well-tagged warn
statement should be placed above
The antiEnglish
module should provide no mapping for these
variables.
Should the translator notice use of an existing variable with the same name, it should ?print a warning and stop? ?print a warning and go one? ?automaticly rename to prior var to l_? TBD.
The antiEnglish
module should provide no mapping for these
variables.
Programmers using hand-translated scripts or who attempt to run
untranslated scripts with perl6
should get compile-time warnings
and errors, depending on severity of mis-use.
At a minimum, use strict
and perl -w
should warn of the
use of deprecated variable names. That deprecation warning should
be specific, not simply var $foo is deprecated
. Messages
should identify the old var, new var, and at least imply an action.
Some possible samples include:
changing var $foo no longer has any effect, see <doc module> the value of var $bar no longer has any meaning, see <doc module> var $baz has been replaced by $FOO_BAR{baz}
and so forth.
In addition, it should be possible to have the programmer detect and control these messages on a case by case basis. RFC 3 proposes a mechanism by which programmers could take more direct control of this; should that RFC be accepted then that mechanism should be used. Should it fail, we will make a more specific recommendation here.
Contributions ranging from voluminous commentary to pretty decent wisecracks have been received from:
Ted Ashton <ashted@southern.edu> "J. David Blackstone" <jdavidb@dfw.net> Corwin Brust <cbrust@alldata.net> Alan Burlison <Alan.Burlison@uk.sun.com> Piers Cawley <pdcawley@bofh.org.uk> Tom Christiansen <tchrist@chthon.perl.com> Mark-Jason Dominus <mjd@plover.com> Ken Fox <kfox@vulpes.com> Chaim Frenkel <chaimf@pobox.com> Jarkko Hietaniemi <jhi@iki.fi> Tim Jenness <timj@jach.hawaii.edu> Bart Lateur <bart.lateur@skynet.be> Peter Scott <Peter@PSDT.com> William Setzer <William_Setzer@ncsu.edu> Dan Sugalski <dan@sidhe.org> "Bryan C. Warnock" <bwarnock@gtemail.net> Nathan Wiger <nate@wiger.org>
and others who I probably missed. In addition, it's been pointed out that this suggestion was raised in the past; unfortunately the fingerprints have long since worn away.
RFC 3 on Run-Time Error Message Control