[% setvar title Transaction-enabled variables for Perl6 %]

This file is part of the Perl 6 Archive

Note: these documents may be out of date. Do not use as reference!

To see what is currently happening visit http://www.perl6.org/

TITLE

Transaction-enabled variables for Perl6

VERSION

  Maintainer: Szabó, Balázs <dlux@kapu.hu>
  Date: 17 Aug 2000
  Last Modified: 13 Sep 2000
  Mailing List: perl6-internals@perl.org
  Number: 130
  Version: 6
  Status: Developing

ABSTRACT

Transactions are quite important in a database-enabled application. Professional database systems have transaction handling built in, but only a few languages out there support transactions at the variable level.

In perl6, we can support transaction-enabled variables (including objects and tied variables), and with them we can control transaction-enabled perl modules (this also includes modules that do external I/O). We can use our perl program to tie together several transaction-enabled data sources, and we can use perl to easily maintain consistency between them.

In this RFC we will look at what these variables would look like in perl6.

STATUS

The idea and the implementation issues are mainly clarified: we now have a tidy and simple design.

Some questions are still open, these are the following:

WHAT'S NEW IN VERSION 6

  • Added STATUS section
  • Added some Implementation issues
  • Added some more explanation to the ABSTRACT section.

DESCRIPTION

    In short, Perl has the "local" keyword, which changes the value of a variable only for the runtime of the current scope. A transaction-enabled variable should start out like a "local" one, but if execution reaches the end of the current scope, its value is then copied into the global one.

    We need a keyword to mark a variable as transaction-enabled. I chose "trans" for this document, but other suggestions are welcome. Possible alternatives are:

      transaction
      transactional
      acid
      atomic
      onsuccess
      consistent

    The final decision will be made by the porters; I use "trans" in this document.

    Preferred syntax:

      sub trans_test { my ($self,@params)=@_;
        trans $self->{value}=$new_value;
      
        # ...
      
        die "Error occurred" if $some_error;
      
        function_call(...);
      
        # ...
      
      } ;

    Meaning (in semi perl5 syntax):

      sub trans_test { my ($self,@params)=@_;
        local $self->{value}=$new_value;
      
        # ...
      
        die "Error occurred" if $some_error;
      
        function_call(...);
      
        # ...
      
        global $self->{value}=$self->{value};
      };
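
    The behaviour described above can be approximated in today's Perl 5. The following is only a sketch under stated assumptions: the helper name with_trans is hypothetical and not part of this RFC, and instead of working on a private copy it writes the new value in place and restores the old one when the block dies. Without threads the visible result is the same.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the RFC): run $code with $$ref set to
# $new_value; keep the new value on success, restore the old one on die.
sub with_trans {
    my ($ref, $new_value, $code) = @_;
    my $saved = $$ref;               # remember the pre-transaction value
    $$ref = $new_value;
    my $ok = eval { $code->(); 1 };  # true only if the block did not die
    return 1 if $ok;                 # "commit": the new value stays
    my $err = $@;
    $$ref = $saved;                  # "rollback": copy the old value back
    die $err;                        # re-raise the original error
}

my $self = { value => 'old' };

# Successful block: the new value is kept.
with_trans(\$self->{value}, 'new', sub { 1 });
my $after_commit = $self->{value};

# Failing block: the value from before the call is restored.
eval { with_trans(\$self->{value}, 'broken', sub { die "Error occurred\n" }) };
my $error          = $@;
my $after_rollback = $self->{value};
```

    Here $after_commit and $after_rollback both hold 'new': the first call commits, the second one dies and is rolled back to the value it started from.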

    If we want to gain more control while keeping the syntax easy, we can use a pragma that sets up the isolation attributes of the transaction data. I think "transaction" could be a good name for this pragma:

      use transaction (mode => 'lock', timeout=>6000);

    Parameters for "use transaction":

    Two phase commit

    Two-phase commit is the common way to deal with distributed transactions. To become a reliable transaction handler, Perl needs an interface through which objects and tied variables can take part in it. You can choose to implement these features in your object or your tied variable. If you don't, perl will give you a rough default.

    At the end of the transaction, two different things can happen: rollback or commit. When a rollback occurs, all the transaction variables must be rolled back. On commit, a two-phase commit procedure is started.

    The first phase prepares for the commit: each participant checks its resources, allocates resources for the commit, flushes caches, and so on. After that, it can decide whether it can do a commit or not. If all participants send "yes", then the commit phase begins: the coordinator sends "commit" messages to the participants, and the transaction finishes. If any of the participants sends a false value in the "prepare" phase, then the whole transaction needs to be rolled back.

    What does it look like in perl?
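
    The two phases can be sketched as a small coordinator in Perl 5. This is an illustration only: the Demo::Participant class and the two_phase_commit function are hypothetical names, and the participants are plain in-memory objects that merely log which callbacks were invoked.

```perl
use strict;
use warnings;

# Hypothetical in-memory participant used only for this illustration.
package Demo::Participant;
sub new      { my ($class, %arg) = @_; return bless { vote => $arg{vote}, log => [] }, $class }
sub PREPARE  { my ($s) = @_; push @{ $s->{log} }, 'prepare';  return $s->{vote} }
sub COMMIT   { my ($s) = @_; push @{ $s->{log} }, 'commit';   return 1 }
sub ROLLBACK { my ($s) = @_; push @{ $s->{log} }, 'rollback'; return 1 }

package main;

# Coordinator: phase one asks every participant to prepare; phase two
# commits only if all of them voted yes, otherwise everyone rolls back.
sub two_phase_commit {
    my (@participants) = @_;
    my $all_yes = 1;
    for my $p (@participants) {
        $all_yes = 0 unless $p->PREPARE();   # any false vote aborts
    }
    if ($all_yes) {
        $_->COMMIT() for @participants;
        return 1;
    }
    $_->ROLLBACK() for @participants;
    return 0;
}

my @good      = map { Demo::Participant->new(vote => 1) } 1 .. 2;
my $committed = two_phase_commit(@good);

my @mixed   = (Demo::Participant->new(vote => 1), Demo::Participant->new(vote => 0));
my $aborted = two_phase_commit(@mixed);
```

    With two "yes" votes the call returns true and both participants are committed; a single false vote makes it return false and rolls back every participant, including the one that voted "yes".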

    You have objects. Objects can be transaction-enabled, and if you want that, you need to define the following functions as callbacks: COMMIT, ROLLBACK, PREPARE, BEGIN_TRANSACTION. If you have a tied variable, then you can define these callbacks for it: TIE_COMMIT, TIE_ROLLBACK, TIE_PREPARE, TIE_BEGIN_TRANSACTION. These can be used to make an object or a tied variable transaction-safe. If you don't define PREPARE or TIE_PREPARE, then it will be only a one-phase commit. If you don't define COMMIT (or TIE_COMMIT) and ROLLBACK (or TIE_ROLLBACK), then perl will use the simple "copy back the old value on rollback" mechanism, which works well when there is no multithreading and the data needs no special handling. If you don't define BEGIN_TRANSACTION or TIE_BEGIN_TRANSACTION, then no special initialization is performed on the "trans" call.
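
    The fallback rules above could be checked with the can() method that every Perl 5 object already has. The sketch below is hypothetical: the commit_mode helper and the Demo:: classes are invented for illustration, and it assumes that an object missing either COMMIT or ROLLBACK gets the copy-back default (the text does not say what happens when only one of them is defined).

```perl
use strict;
use warnings;

# Hypothetical demo classes: one with all callbacks, one without PREPARE,
# and one without any transaction callbacks at all.
package Demo::Full;     sub new { return bless {}, shift } sub PREPARE {1} sub COMMIT {1} sub ROLLBACK {1}
package Demo::OnePhase; sub new { return bless {}, shift } sub COMMIT {1} sub ROLLBACK {1}
package Demo::Plain;    sub new { return bless {}, shift }

package main;

# Hypothetical helper: decide which protocol applies to an object, based
# on the callbacks it defines. Assumption: a missing COMMIT or ROLLBACK
# selects the default copy-back mechanism.
sub commit_mode {
    my ($obj) = @_;
    return 'copy-back' unless $obj->can('COMMIT') && $obj->can('ROLLBACK');
    return $obj->can('PREPARE') ? 'two-phase' : 'one-phase';
}

my %mode = map { $_ => commit_mode($_->new) }
           qw(Demo::Full Demo::OnePhase Demo::Plain);
```

    For the three demo classes, %mode maps Demo::Full to 'two-phase', Demo::OnePhase to 'one-phase' and Demo::Plain to 'copy-back'.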

    Tie interface

    Adding a transaction-enabled property to a tied variable is not straightforward. Imagine you have tied a hash to a (not transaction-enabled) dbm file. When you fetch, you need to put a shared lock on (or version-control) the dbm file or key; when you store, you need to put an exclusive lock on it; and when the transaction ends, you need to release the lock. For this reason, we can add two callbacks: TIE_COMMIT and TIE_ROLLBACK.

    If we don't want to use locking, or want to do advanced transaction management, we can provide a transaction ID to the callbacks. This can be done with a new package global variable (which is localized in every call); its name can be $Package::TRANSACTION_ID. We could use a new parameter instead, but that is not so neat, because some of the callbacks (PUSH, POP, UNSHIFT, PRINT, PRINTF, etc.) expect LISTs as arguments, and this could force an unnecessary rewrite of the tie interface.
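
    A minimal Perl 5 sketch of the package-global approach follows. The dispatcher call_with_transaction_id and the Demo::Tie package are hypothetical; only the variable name TRANSACTION_ID comes from the text. The point is that the callback's argument list stays exactly as it is in the current tie interface.

```perl
use strict;
use warnings;

package Demo::Tie;
our $TRANSACTION_ID;    # the package global proposed in the text
sub FETCH {
    my $id = defined $TRANSACTION_ID ? $TRANSACTION_ID : 'none';
    return "fetched in transaction $id";
}

package main;

# Hypothetical dispatcher: localize the package's $TRANSACTION_ID for the
# duration of one callback, leaving the callback's arguments untouched.
sub call_with_transaction_id {
    my ($package, $id, $method, @args) = @_;
    no strict 'refs';    # needed to reach the variable by package name
    local ${"${package}::TRANSACTION_ID"} = $id;
    return $package->$method(@args);
}

my $inside  = call_with_transaction_id('Demo::Tie', 42, 'FETCH');
my $outside = Demo::Tie->FETCH;
```

    $inside is "fetched in transaction 42", while $outside is "fetched in transaction none", because the localized value is gone again once the dispatcher returns.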

    Following is the description of the modifications of the tie interface:

    If a package used in "tie" has one of the above callbacks, then perl _must_ emulate the transaction in every call, so a simple FETCH in a non-transaction environment must be the sequence TIE_BEGIN_TRANSACTION, FETCH, TIE_PREPARE ? TIE_COMMIT : TIE_ROLLBACK, and a simple STORE must be: TIE_BEGIN_TRANSACTION, STORE, TIE_PREPARE ? TIE_COMMIT : TIE_ROLLBACK.
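
    The emulated sequence for a lone FETCH can be written out in Perl 5 as follows. This is a sketch only: emulated_fetch and Demo::LoggedScalar are hypothetical names, and the callbacks are invoked by hand here, whereas under this proposal perl itself would call them.

```perl
use strict;
use warnings;

package Demo::LoggedScalar;
our @CALLS;    # records the order in which the callbacks run
sub TIESCALAR             { my ($class, $v) = @_; return bless { value => $v }, $class }
sub TIE_BEGIN_TRANSACTION { push @CALLS, 'TIE_BEGIN_TRANSACTION'; return 1 }
sub FETCH                 { my ($s) = @_; push @CALLS, 'FETCH'; return $s->{value} }
sub TIE_PREPARE           { push @CALLS, 'TIE_PREPARE';  return 1 }
sub TIE_COMMIT            { push @CALLS, 'TIE_COMMIT';   return 1 }
sub TIE_ROLLBACK          { push @CALLS, 'TIE_ROLLBACK'; return 1 }

package main;

# Hypothetical wrapper showing the callback order for a FETCH made
# outside any explicit transaction.
sub emulated_fetch {
    my ($tied) = @_;
    $tied->TIE_BEGIN_TRANSACTION();
    my $value = $tied->FETCH();
    $tied->TIE_PREPARE() ? $tied->TIE_COMMIT() : $tied->TIE_ROLLBACK();
    return $value;
}

my $obj   = Demo::LoggedScalar->TIESCALAR('Perl');
my $value = emulated_fetch($obj);
my $order = join ' ', @Demo::LoggedScalar::CALLS;
```

    Because TIE_PREPARE returns true in this demo class, $order ends up as "TIE_BEGIN_TRANSACTION FETCH TIE_PREPARE TIE_COMMIT" and TIE_ROLLBACK is never called.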

    Object interface

    The object interface is similar to the tie interface: you will need the callbacks PREPARE, COMMIT, ROLLBACK and BEGIN_TRANSACTION. These do the same as described in the Tie interface section. $Package::TRANSACTION_ID will be set in this case as well.

    Note that if you declare an object as "trans", this means that it is localized for the runtime of the transaction and that PREPARE, COMMIT or ROLLBACK will be called at the end of the block containing the declaration. It doesn't mean that the whole data structure under it is transaction-safe. That cannot be guaranteed, and you need to explicitly declare those parts as "trans" variables.

    TRANSACTION-ENABLED TIED VARIABLE EXAMPLE

    This is an example of a transaction-enabled tie interface.

    The following package can be tied to any scalar variable, and can be used as persistent, transaction-enabled data.

    Usage:

      tie $scalar, "Transaction::ScalarFile", $filename;
    
      sub my_transaction {
        trans $scalar;
        $scalar="Perl" x 1024;
    
        ...
    
      };

    The data in the file referred to by $filename can be accessed and modified as $scalar. $scalar can be used in a transaction, supports subtransactions and two-phase commits, and locks the accessed file with flock(), so it can be used in multithreaded and multiprocess environments.

    Here is the code:

      package Transaction::ScalarFile;
      use transaction (mode => 'lock', timeout => 30);
      use strict;
      use Fcntl qw( :flock );
      use FileHandle;  # needed for "new FileHandle" below
    
      # constant declaration
      sub FILENAME        { 0; };
      sub FILEHANDLE      { 1; };
      sub VALUE           { 2; };
      sub LOCKED          { 3; };
      sub PARENT_TRANS    { 4; };
      sub TEMP_FILENAME   { 5; };
      sub TEMP_FILEHANDLE { 6; };
    
      sub TIE_BEGIN_TRANSACTION { my ($s,$parent)=@_;
        trans $s->[VALUE];  # The value is transaction-enabled
        local $s->[PARENT_TRANS]=$parent;
      };
    
      sub TIESCALAR { my ($class,$filename)=@_;
        my $s=[$filename];
        bless $s,ref($class) || $class;
        $s;
      };
    
      sub FETCH { my ($s)=@_;
        return $s->[VALUE] if defined $s->[VALUE];
        $s->open or return undef;
        flock $s->[FILEHANDLE], LOCK_SH;
        local $/=undef;
        $s->[LOCKED]=1;
        return $s->[VALUE]=readline($s->[FILEHANDLE]);
      };
    
      sub STORE { my ($s,$value)=@_;
        if ($s->[LOCKED]<=1) {
          $s->open or return undef;
          flock $s->[FILEHANDLE], LOCK_EX;
          $s->[LOCKED]=2;
        };
        $s->[VALUE]=$value;
      };
    
      sub TIE_PREPARE { my ($s)=@_;
        # If it is a subtransaction: do nothing
        return 1 if $s->[PARENT_TRANS]; 
        # If the value is not modified: do nothing
        return 1 if $s->[LOCKED]<2;     
        # Transaction failed if I cannot make the temp file
        $s->create_and_lock_temp_file or return 0;
        $!=0; # reset ERRNO
        # Write the new value to the tempfile
        print {$s->[TEMP_FILEHANDLE]} $s->[VALUE];
        # Transaction failed if an error occurred
        unlink($s->[TEMP_FILENAME]),return 0 if $!;
        return 1;
      };
    
      sub TIE_COMMIT { my ($s)=@_;
        return if $s->[PARENT_TRANS];
        # This is the weakest point of the transaction, we cannot make those two
        # operations atomic ...
        rename $s->[FILENAME], $s->[FILENAME].".old.$$";
        rename $s->[TEMP_FILENAME],$s->[FILENAME];
        unlink $s->[FILENAME].".old.$$";
        flock $s->[FILEHANDLE],LOCK_UN;
        flock $s->[TEMP_FILEHANDLE],LOCK_UN;
        $s->[LOCKED]=0;
        $s->[TEMP_FILEHANDLE] = $s->[TEMP_FILENAME] = undef;
      };
    
      sub TIE_ROLLBACK { my ($s)=@_;
        return if $s->[PARENT_TRANS]; # We don't care if it has parent trans.
        flock $s->[TEMP_FILEHANDLE], LOCK_UN;
        flock $s->[FILEHANDLE], LOCK_UN;
        unlink $s->[TEMP_FILENAME];
        $s->[TEMP_FILEHANDLE] = $s->[TEMP_FILENAME] = undef;
      };
    
      sub open { my ($s)=@_;
        return $s->[FILEHANDLE]=new FileHandle("<".$s->[FILENAME]);
      };
    
      sub create_and_lock_temp_file { my ($s)=@_;
        $s->[TEMP_FILENAME]=$s->[FILENAME].".trans.$$";
        $s->[TEMP_FILEHANDLE]=new FileHandle(">".$s->[TEMP_FILENAME]) or return 0;
        flock $s->[TEMP_FILEHANDLE],LOCK_EX;
      };

    IMPLEMENTATION

    Transaction handling methods

    Use 'trans' for non-transaction-enabled tied vars and objects

    If we use the 'trans' keyword on a value that is a tied variable or an object but does not implement the transaction interface, our transaction-safe environment is no longer guaranteed to be consistent, and we cannot make the system 100% transaction-safe. All we can do is emulate the transactional behaviour with our current tools.

    If we take a look at the TIE interface, we can emulate the transaction behaviour only with STORE and FETCH. That is not a problem with a simple scalar, but it is a problem for a tied array or a tied hash. Alternatively, we could simply throw a fatal exception if someone tries this. We must think about it.

    One point is clear: the transaction is as weak as the weakest transaction-enabled variable in it, no matter how we emulate the transaction behaviour.

    CHANGES IN PREVIOUS VERSIONS

    Version 5

    Version 4

    Version 3

    Version 2

    REFERENCES

    PostgreSQL Multi-version concurrency control www.postgresql.org

    Two phase commit: (Google found that :-) oradoc.photo.net

    RFC 19: Rename the local operator

    RFC 119: object neutral error handling via exceptions

    perldoc perlthread: the perl5 threading interface

    perldoc perltie: the perl5 tie interface