[% setvar title Keyed arrays %]

This file is part of the Perl 6 Archive

Note: these documents may be out of date. Do not use as reference!

To see what is currently happening visit http://www.perl6.org/

TITLE

Keyed arrays

VERSION

  Maintainer: Glenn Linderman <glenn@linderman.com>
  Date: 20 Sep 2000
  Last Modified: 28 Sep 2000
  Mailing List: perl6-language@perl.org
  Number: 268
  Version: 3
  Status: Withdrawn

ABSTRACT

The idea here is to merge the concepts of array and hash, not to replace either one, but to provide some middle ground with data structures having some characteristics of each. The resultant data structure is based on array, may be more costly than hash, for some operations, may be more efficient than hashes for some operations, and may provide a more compatible solution for some types of problems than other RFCs.

Notes on withdrawal

There are some nice ideas in here, if only they could work. At least it provided a bit of inspiration for RFC 273, so it wasn't a total waste of time.

CHANGES

In Version 2:

Clarify wording in various places. Prohibit use of splice on keyed arrays that actually contain keys. Change the default array type to :nokey, since splice is prohibited on keyed arrays.

DESCRIPTION

These are not pseudo-hashes. Michael Schwern, author of RFC 241: Pseudo-hashes must die! after some lengthy discussion, said: <quote> Ya know, I'm going to say that $aref->[string] might be made to work where pseudo-hashes failed. </quote>. However, that is not intended to mean that Mr. Schwern endorses this proposal, but I did take it as encouragement to develop this RFC to squeak in before the deadline and hope that it lends just a bit credence to the proposal. That said...

I think that Mr. Schwern's comment does imply that a proposal along these lines might be able to solve problems that pseudo-hashes were invented to solve, in ways that might be less problematical to implement; this would allow pseudo-hashes to not need to coexist with keyed arrays, so the proposal doesn't discuss such coexistance. See also RFC 241.

Definition syntax

An array could have an optional namespace. Probably best to call these keys, because it really is the same concept as hash keys--we look up values using them. Probably there could be several flavors, as described below. I'm not yet certain if all these flavors are useful, or if all these flavors, even if useful, are worth the pain of implementation, since I don't know the cost of implementation. There may be flavors I haven't thought of yet, that would be worth implementing. All but the first are generically called keyed arrays. Here's my current concept list:

    my/our @array :nokey;
    my/our @array :key;
    my/our @array :initialkey ( key0, key1, ... );
    my/our @array :keyonly;
    my/our @array :hashsyntax;

With the :nokey attribute, we would have a familiar perl<6 array. Use numeric indexes, and [] indexing syntax. This would be the default, if no attribute is specified, for compatibility.

With the :key attribute, it would be allowed, but not required, for keys to be added to point to particular array elements. Numeric indexing could still be used for speed. Not all array elements would necessarily have keys.

The :initialkey variation would specify in the definition a list of keys which would correlate to the array indexes 0, 1, ...

The :keyonly variation would be less efficient, and would require use of keys for lookup. Of course, the compiler could translate fixed keys into numeric indexes under the covers for performance.

The :hashsyntax variation is identical to the :keyonly variation, except for the syntax, which is like hashes instead of like arrays. This variation could unify the concepts of hash and array. The definition of a :hashsyntax array should probably reserve a special pointer in the symbol table so that the similarly named hash would be automatically defined for access, but would actually refer to the :hashsyntax keyed array. In other words, the definition of

    my/our @array :hashsyntax;

would hide the definition of %array in the same way that

    my/our %array

would hide a prior definition of %array. And references to %array would thenceforth actually be references to the keyed array @array.

Reference syntax

The syntaxes

    $foo['element']
    $foo[element]

or for :hashsyntax arrays, either the above or

    $foo{'element'}
    $foo{element}

would be interpreted as a reference to @foo. If the namespace for @foo contains 'element', that member of @foo is the interpretation. If the namespace for @foo does not contain 'element', then 'element' is added to the namespace for @foo, the size of @foo is increased by 1, and the member 'element' refers to the newly added item in @foo.

So, starting with

   my @foo:key; # empty array
   $foo ['month'] = 10;  #  $#foo == 1, $foo[0] == 10
   $foo ['day'] = 20;   # $#foo == 2, $foo [1] == 20
   $foo ['year'] = 30;   # $#foo = 3, $foo [2] == 30

We achieve an array with 3 elements. There is a clear parallel between this and

   my %foo;
   $foo{'month'} = 10;
   $foo{'day'} = 20;
   $foo{'year'} = 30;

However, the lookups for @foo are done at compile time, the lookups for %foo are done at runtime.

For :key and :initialkey arrays, the syntax

    $foo[$bar]

would inspect $bar to determine if it is convertable to numeric. If it is, the value is used as the numeric index of the array. If it is not, it is treated as a key for the array, and is looked up in the namespace of the array.

For :keyonly and :hashsyntax arrays, all indexes are considered to be string keys just like hash keys, requiring lookup in the namespace of the array.

Perl uses some heuristic to decide whether a bareword within the {} of a hash key reference is a function or a string, the same heuristic should be applied within [] for keyed arrays.

Push, pop, shift, and unshift are not valid for :keyonly and :hashsyntax arrays, because they access data by index, which is prohibited for those types of keyed arrays. Splice is prohibited on all keyed arrays, to allow compile time name lookup.

The operations keys, values, and each would make sense for keyed arrays. Their syntax should be extended to accept keyed arrays as parameters, and the semantics should be similar to that for hashes. It should be noted that values is not necessarily the same order as the original keyed array, since it would iterate via the keys. For :key and :initialkey arrays, it would not even necessarily be the complete array, because not all array entries would necessarily have keys.

Usages

Garrulous builtins

(I cloned this heading from RFC 259, thanks Damian) (I cloned the list of functions from RFC 37, thanks Jarkko) (I cloned the list of standardized key names from RFC 259, thanks again, Damian)

The idea here is that some functions return an array of values that are hard to keep track of. It would be nice to retain compatibility, but also be able to access the members of the array by name, rather than number. Since these arrays are of fixed size, and known values, we can apply the :initialkeys variation to these functions. The functions remain fully compatible with current definitions and usage of their results, however, new code could be written using the names instead of the numbers to access the results. One example, and then the cloned lists of details.

   my ( @stat_array ) = stat ( $filename );
   print "File $filename has a size of $stat_array[size] bytes.\n";

List of garrulous builtins

        caller [1]
        getgrent
        getgrnam
        getgrgid
        gethostbyaddr
        gethostbyname
        gethostent
        getnetbyaddr
        getnetbyname
        getnetent
        getprotobyname
        getprotobynumber
        getprotoent
        getpwent
        getpwnam
        getpwuid
        getservbyname
        getservbyport
        getservent
        gmtime
        localtime
        stat

Some changes to this list may be necessitated by other changes to Perl6.

Standardized keys

The standardized keys for these functions would be: