[% setvar title IO: Standardization of Perl IO Functions to use Indirect Objects %]

This file is part of the Perl 6 Archive

Note: these documents may be out of date. Do not use as reference!

To see what is currently happening visit http://www.perl6.org/

TITLE

IO: Standardization of Perl IO Functions to use Indirect Objects

VERSION

  Maintainer: Nathan Wiger <nate@wiger.org>
  Date: 15 Sep 2000
  Last Modified: 26 Sep 2000
  Mailing List: perl6-language-io@perl.org
  Number: 239
  Version: 2
  Status: Frozen

ABSTRACT

Currently, Perl IO functions follow a C-like style, twiddling values passed to them and then returning 1 or 0.

This RFC takes after RFC 14's modifications to open() and proposes similar modifications to other IO operations, in an attempt to make them both more internally consistent and also more flexible. These modifications allow increased modularity and function namespace as well. In most cases the changes are relatively minor, simply requiring the user to drop the "," after the first argument.

DESCRIPTION

I'm running short on time here, so here goes. The following functions should be modified into the new syntaxes shown below:

  $FH      =  open $file, [@args];
  $ret     =  seek $FH $pos;                        # $FH->seek
  $ret     =  read $FH $scalar, $len, $offset;      # $FH->read
  $ret     =  tell $FH;                             # $FH->tell
  $ret     =  ioctl $FH $fun, $scalar;              # $FH->ioctl
  $ret     =  flock $FH $op;                        # $FH->flock
  $ret     =  fnctl $FH $func, $scalar;             # $FH->fcntl

  $DH      =  open dir $dir;                        # dir->open
  $ret     =  seek $DH $pos;                        # $DH->seek
  $ret     =  tell $DH;                             # $DH->tell
  $ret     =  rewind $DH;                           # $DH->rewind

  $FH      =  sysopen  $file, $mode, $mask;         # "open sys"?
  $ret     =  sysread  $FH $scalar, $len, $offset;  # $FH->sysread
  $ret     =  syswrite $FH $scalar, $len, $offset;  # $FH->syswrite
  $ret     =  sysseek  $FH $pos, $whence;           # $FH->sysseek
   
  $SH      =  open socket $dom, $type, $proto;      # socket->open
  $ret     =  connect $SH $name;                    # $SH->connect
  $ret     =  recv $SH $scalar, $len, $flags;       # $SH->recv
  $ret     =  setsockopt $SH $lev, $opt, $val;      # $SH->setsockopt
  $ret     =  socketshut $SH $how;                  # $SH->socketshut
 ($S1,$S2) =  open socket $dom, $type, $proto, 2;   # (socketpair)

 ($R, $W)  =  pipe;                                 # "open pipe"?

If you read your Camel, most of these changes simply involve dropping the "," after the first argument to take advantage of the indirect object syntax.

This is not pure sugar. It buys us many important benefits:

   1. Fewer functions. As RFC 14 starts to note, gone are the *dir
      functions, as well as many specialized functions. They simply
      become member functions, increasing function namespace. So,
      tell() can now be used on files, directories, and web docs,
      without having to invent new function names.

   2. Seemless integration with extended file types. Using this
      syntax, we can now do this:

         use http;
         $WEB = open http "www.yahoo.com", POST;
         flock $WEB $op;

      And $WEB->flock can die as "unimplemented", without this
      having to be anywhere even remotely in core. It can exist
      as an external module, but to the user it looks like the
      same function used here:

         $FH = open "/etc/motd" or die;
         flock $FH $op;

      Even though it's vastly different.  Users seen a clean,
      coherent interface, without having to worry about the
      nuts and bolts or calling special methods.

   3. More consistent syntax. The most frequently-used IO function,
      print, already uses an indirect object syntax for its handles.
      This RFC simply follows the lead and extends this to other
      IO functions as well.

   4. More modular design. This tightly integrates with the idea
      of moving certain functions out of core, like socket(). Now
      you can simply say:

         use socket;      # import socket class
         $SH = open socket $dom, $type, $proto;
         recv $SH $scalar, $len, $offset;      

      Bingo. It walks and talks like an extensible open() thanks
      to the lovely indirect object syntax, but all the functions
      are actually member functions of $SH. 

   5. Less stuff in core. Hand in hand with the above, stuff like 
      recv() doesn't even have to be in the same zip code as core
      anymore. But it walks and talks just like it was.

      *Plus*, there's less namespace pollution because everything's
      a member function. As such, no extensive checks like "Arg 1
      not a socket handle" have to be done. Running a bad function
      yields a standard error:

         recv $not_a_socket, $bad, $args;
         Can't find object method "recv" via "$not_a_socket" ...

      See below for suggestions on a better error message.

   6. Can extend use of the default filehandle. Thanks to this
      approach, we can actually now use all these methods on the
      default filehandle if we choose to, by implementing simple
      core stubs like the following:

         sub CORE::syswrite { $main::DEFOUT->syswrite(@_) }
         sub CORE::sysread  { $main::DEFOUT->sysread(@_) }
         sub CORE::sysseek  { $main::DEFOUT->sysseek(@_) }

      Plus the core is now far more compact.

   7. The ability to chain function calls, if so desired:

         @f = open("</etc/motd")->readline;

      Python, anyone? :-)

   8. Relatively minor changes. The benefits are very large, but
      most users will simply have to stop typing a "," after the
      first argument. The only function that includes a major
      change is the open() function desribed in RFC 14, but even
      this change is a step towards a more Perlish approach.

A mechanism like this could, conceivably, improve performance, flexibility, and consistency all at the same time if implemented correctly. And that's what Perl 6 is all about, right? ;-)

MIGRATION

As mentioned above, the impact on users is actually relatively small. The translation script would simply have to get rid of the "," after the first argument in the named functions, whose syntax otherwise remains the same as Perl currently overall.

Note that while this RFC suggests getting rid of socket() and just using socket->open, since this is a more extensible framework (to http->open, ftp->open, and so on), this is not strictly required. socket() could remain as a module or even builtin function. The remaining syntax changes suggested above would remain; just rather than socket->open() returning a socket handle, socket() would.

IMPLEMENTATION

Big revisions to existing functions. Most will become object member functions, which is actually a good thing since it means we can just stick them in the modules that support them, and not in the ones that don't.

Increased namespace means we can drop historic C artifacts like "readdir" in favor of a simple "read", since it will be called via the indirect object on directories automatically.

Finally, one thing to consider is more descriptive indirect object error messages. We might consider changing this:

   Can't locate object method "foo" via package "Bar"

To something like this:

   Object method "foo" not implemented by package "Bar"

This is clearer and, IMO, a little more descriptive of what's actually happening, especially when seen in this context:

   flock $WEB $op;

   Object method "flock" not implemented by package "http"

Finally, once RFC 174 gets firmed up a little more, it is quite conceivable that ()'s could remain around these functions, making them look that much more like Perl 5:

   Perl 5                         Perl 6 w/ RFC 174
   -----------------------------  -----------------------------
   sysread(F, $sc, $len, $off)    sysread($F $sc, $len, $off)
   recv(S, $sc, $len, $flags)     recv($S $sc, $len, $flags)

This could, of course, extend to all of the above functions. Getting a comma in after that first arg might be nice too, but it's tricky. See discussions on RFC 174; this syntax is covered there.

REFERENCES

RFC 14: Modify open() to support FileObjects and Extensibility

RFC 174: Improved parsing and flexibility of indirect object syntax

RFC 129: Replace default filehandle/select with $DEFOUT, $DEFERR, $DEFIN