[% setvar title Interfaces for linking C objects into perlsubs %]

This file is part of the Perl 6 Archive

Note: these documents may be out of date. Do not use as reference!

To see what is currently happening visit http://www.perl6.org/

TITLE

Interfaces for linking C objects into perlsubs

VERSION

  Maintainer: David Nicol <perl6rfc@davidnicol.com>
  Date: 7 Aug 2000
  Mailing List: perl6-internals@perl.org
  Number: 61
  Version: 2
  Status: Developing

ABSTRACT

XS, the set of C macros and structures used to extend Perl, is a considerable barrier to writing Perl extensions in the C language. Without replacing XS, this document specifies easier ways to integrate a C object file (.o) into a perl program. It takes the point of view of a perl user who wants to use a well-defined, well-behaved C program more directly than through a system call.

XS is an excellent medium for sharing the glory of the internals of Perl with the C programming community. It is hoped that the interface described herein will become an excellent medium for sharing the glory of the internals of well-written C libraries with the Perl programming community.

This document is not precisely concerned with the details of the implementation of the interfaces it specifies, beyond a general attempt to restrict itself to the possible.

The use redirect pragma is introduced so that external code inherits only those file handles of the running program that the programmer wants it to inherit.

SUMMARY

	use externalh @h_files or die;

	sub callmain{

	   my use redirect STDIN < $InputData;

	   my use redirect STDOUT > PREVIOUSLY_OPENED_HANDLE;

	   $value = use externalc @object_files;

	};

	use externalspace $name_of_the_space;

	tie %cstruct, externalc, $StructDefinition;
	
	%cstruct = unpack( $StructDefinition, $Structure );

	$Structure = pack( $StructDefinition, %cstruct );

DESCRIPTION

This document proposes a set of standard interfaces for linking compiled (or assembled) object code directly into one's Perl program.

Currently, XS is a large enough additional layer, another ocean of knowledge, that it is easier to package one's C code as well-behaved utilities and invoke them in backticks than to convert a stand-alone utility into an XS module. Rewriting XS is part of the perl6 mandate. This RFC is an attempt to suggest what one alternative would look like.

default to the main(int argc, char* argv[]) interface in C

C programs accept the knowledge of their run-time command line arguments via two arguments that are passed to the program's main function.

In the absence of other data provided via the externalh method, the @_ array will be converted for use with this linkage.
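The conversion of @_ for main linkage can be pictured with a short perl5 sketch. The helper name to_argc_argv is hypothetical, and prepending a program name as argv[0] is an assumption based on C convention; this RFC does not say what argv[0] would hold.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten a Perl argument list into the
# (argc, argv) pair that a C main() expects. Prepending the program
# name as argv[0] is an assumption borrowed from C convention.
sub to_argc_argv {
    my ($progname, @args) = @_;

    # Every value is stringified; undef becomes the empty string.
    my @argv = map { defined $_ ? "$_" : '' } ($progname, @args);

    return (scalar @argv, \@argv);
}

my ($argc, $argv) = to_argc_argv('accesstool', 'lookup', 42, 3.5);
print "argc=$argc argv=@$argv\n";
```

Numbers are passed as their string forms, which is what a C program reading its command line would receive anyway.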

example of using use externalc

Through the externalc pragma, a set of compiled .o files is linked into the program at compile time. If and when the execution path reaches that point, the linked code is offered the current scope's @_, converted into scalars, as a (count, values) pair.

A mechanism must also be selected for supplying the input and gathering the output of a C program, something like shell redirects. The use redirect pragma is a placeholder for a system to be announced by the group working on file handle object issues.

	sub DatabaseAccess{

		# place to hold the output
		my $tooloutput;	

		# my: redirection will only last until end of scope
			# are not all "use" inside things thusly scoped?
		# no, what if you want a subroutine that redirects?
			# Oh.  Right.  Must require the scoping operator.
		# use redirect: very similar to shell redirection
		my use redirect STDOUT > $tooloutput ;   

		my $retval= use externalc qw(
			/opt/database/accesstool/src/authen.o
			/opt/database/accesstool/src/author.o
			/opt/database/accesstool/src/base64.o
			/opt/database/accesstool/src/choose_authen.o
			/opt/database/accesstool/src/daves.o
			/opt/database/accesstool/src/do_acct.o
			/opt/database/accesstool/src/encrypt.o
			/opt/database/accesstool/src/hash.o
			/opt/database/accesstool/src/hyperauth.o
			/opt/database/accesstool/src/programs.o
			/opt/database/accesstool/src/pw.o
			/opt/database/accesstool/src/pwlib.o
			/opt/database/accesstool/src/smbencrypt.o
			/opt/database/accesstool/src/smbutil.o
		);

		# with "main" linkage we have "0 is success" semantics
		$retval and die "database access tool failed. Exit status: $retval";

		$tooloutput;

	};
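For comparison, the scoped effect of my use redirect STDOUT > $tooloutput can be approximated in perl5 for output written by Perl code. This is a sketch only: the helper name capture_stdout is hypothetical, and an in-memory handle like this does not capture output written by C code or child processes, which is the gap the proposed pragma is meant to close.

```perl
use strict;
use warnings;

# Hypothetical helper: run a code reference with STDOUT redirected
# into a scalar, restoring the real STDOUT when the scope ends.
sub capture_stdout {
    my ($code) = @_;
    my $buffer = '';
    {
        local *STDOUT;    # redirection lasts only until end of scope
        open STDOUT, '>', \$buffer
            or die "cannot redirect STDOUT: $!";
        $code->();
        close STDOUT;
    }
    return $buffer;
}

my $tooloutput = capture_stdout(sub {
    print "record found\n";     # implicit STDOUT
    print STDOUT "1 row\n";     # explicit STDOUT
});
print "captured: $tooloutput";
```

Because the glob is localized, the original STDOUT comes back even if the called code dies.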

why not use backticks?

Current practice in this kind of situation is to fully compile the C program, then write something like this:

	sub DatabaseAccess{
		`/opt/database/accesstool @_`;
	};

There is nothing wrong with this, and it will continue to work reliably.
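The backtick version passes @_ through the shell and ignores the exit status. A slightly more careful perl5 sketch of current practice keeps both under control. The helper name run_tool is hypothetical, and the running perl interpreter ($^X) stands in for the external tool so that the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: run an external program without a shell,
# returning its exit status and everything it wrote to STDOUT.
sub run_tool {
    my ($program, @args) = @_;

    # List-form pipe open bypasses the shell, so arguments need no quoting.
    open(my $pipe, '-|', $program, @args)
        or die "cannot start $program: $!";
    my $output = do { local $/; <$pipe> };
    close $pipe;                 # waits for the child and sets $?

    return ($? >> 8, $output);   # "0 is success" semantics, as with main
}

# Stand-in for /opt/database/accesstool: echo the arguments, exit 3.
my ($status, $output) =
    run_tool($^X, '-e', 'print "args: @ARGV\n"; exit 3', 'alice', 'secret');
print "status=$status output=$output";
```

The cost that remains is a new process per call, which is the overhead that linking the object files directly would avoid.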

Finer control over importing C functions is provided using the use external function pragma

Instead of importing whole programs, we would like to be able to access arbitrary functions inside arbitrary pieces of external code, and let perl keep track of a lot of the housekeeping involved in doing that with a C program.

the use external mainroutine pragma

use external* can offer more flexibility than merely extending perl's virtual forking to programs which would ordinarily run in a process space defined by the operating system (that is, programs one would otherwise invoke by calling system).

The name of the main routine could be changed to something other than main through manipulation of the use external mainroutine pragma:

	use externalh qw(
		/opt/database/accesstool/src/accesstool.h
	);
	use external mainroutine pwverify;
	use externalc;

Since we have not changed the external name space, externalc will be linking with the same pile of .o files from last time.

the use externalh pragma

The use externalh pragma reads in C header files containing the declarations of the functions used here, so that externalc can perform reasonable transformations between the perl data in @_ and the arguments those declarations specify.
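To give a feel for the information externalh would need to extract, here is a deliberately naive perl5 sketch that parses a single C prototype. The function name parse_prototype and the prototype shown for pwverify are hypothetical; the real signature is not given in this document. A real implementation would need a proper C parser rather than regular expressions.

```perl
use strict;
use warnings;

# Parse one simple C prototype such as "int f(const char *a, int b);".
# Naive on purpose: every parameter must be named, and function
# pointers, arrays, macros and comments are not handled.
sub parse_prototype {
    my ($line) = @_;
    my ($returns, $name, $params) =
        $line =~ /^\s*(.+?)\s*\b(\w+)\s*\(([^()]*)\)\s*;\s*$/
        or return;

    my @args;
    unless ($params =~ /^\s*(?:void)?\s*$/) {
        for my $param (split /\s*,\s*/, $params) {
            my ($type, $arg) = $param =~ /^\s*(.*?)\s*(\w+)\s*$/
                or die "cannot parse parameter '$param'\n";
            push @args, { type => $type, name => $arg };
        }
    }
    return { returns => $returns, name => $name, args => \@args };
}

# Hypothetical declaration, as it might appear in accesstool.h.
my $proto = parse_prototype(
    'int pwverify(const char *user, const char *password);');
printf "%s returns %s and takes %d argument(s)\n",
    $proto->{name}, $proto->{returns}, scalar @{ $proto->{args} };
```

With the argument types in hand, the values in @_ could be checked and converted one by one before the call is made.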

externalc tie classnames

Data structures in C, and in other languages that use offsets into fixed-size records to differentiate size-limited fields (which are, of course, vulnerable to being overrun), are defined with language-specific data structure definition notations.

Using the perl5 tie operator it would not be difficult to create a class which would allow access into a particular structure, using substr and unpack to extract the fields, and returning undef or even dying (tie-time option) when an attempt is made to access a field which is not named in the structure definition.

This process could even be automated, and a C-Header-File-to-perl-tiable-class-definition package would be right at home on the CPAN scripts listing.
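A minimal perl5 sketch of such a class follows. The class name CStructView, the field list format and the record layout are all assumptions made for illustration. For brevity it unpacks the whole record on every access rather than using substr, and it always dies on an unknown field rather than offering the tie-time option.

```perl
use strict;
use warnings;

package CStructView;    # hypothetical class name

# $fields: [name, pack code] pairs in record order.
# $bufref: reference to the scalar holding the packed record.
sub TIEHASH {
    my ($class, $fields, $bufref) = @_;
    my @names = map { $_->[0] } @$fields;
    my %index;
    @index{@names} = 0 .. $#names;
    return bless {
        names    => \@names,
        index    => \%index,
        template => join(' ', map { $_->[1] } @$fields),
        buf      => $bufref,
    }, $class;
}

# Map a field name to its position, dying on unknown fields.
sub _slot {
    my ($self, $key) = @_;
    exists $self->{index}{$key}
        or die "no field '$key' in structure\n";
    return $self->{index}{$key};
}

sub FETCH {
    my ($self, $key) = @_;
    my $slot   = $self->_slot($key);
    my @values = unpack $self->{template}, ${ $self->{buf} };
    return $values[$slot];
}

sub STORE {
    my ($self, $key, $value) = @_;
    my $slot   = $self->_slot($key);
    my @values = unpack $self->{template}, ${ $self->{buf} };
    $values[$slot] = $value;
    ${ $self->{buf} } = pack $self->{template}, @values;
}

sub EXISTS   { exists $_[0]{index}{ $_[1] } }
sub FIRSTKEY { $_[0]{cursor} = 0; $_[0]{names}[0] }
sub NEXTKEY  { $_[0]{names}[ ++$_[0]{cursor} ] }

package main;

# Assumed layout: 8-byte name, two little-endian 32-bit integers.
my $record = pack 'A8 l< l<', 'alice', 7, 120;
tie my %cstruct, 'CStructView',
    [ [ name => 'A8' ], [ id => 'l<' ], [ points => 'l<' ] ], \$record;

$cstruct{points} += 5;    # FETCH, then STORE back into $record
print "$cstruct{name} has $cstruct{points} points\n";
```

Writes go straight back into the packed scalar, so $record can be handed to external code at any time without a separate packing step.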

	tie %cstruct, externalc, $StructDefinition;

would then tie %cstruct to the structure definition described in $StructDefinition. This could be, in the same way as regular expressions, either a text string or a pre-parsed structure definition in later releases of perl6.

After being so tied, %cstruct can accept value assignment from a scalar holding a defined structure and will have its fields instantly filled out:

	# New in perl 6 -- by popular demand, fixed length records!
	$FixedInputRecordLength = length($StructDefinition);

	while(%cstruct = <>){	# new in perl 6: "tie" traps assignment to hash
		$Totalpoints += $cstruct{points}
	};
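Perl5 can already read fixed-length records by setting $/ to a reference to the record length, which gives a rough picture of the loop above. This is a sketch under assumptions: the pack template and the sample data are invented, and an in-memory file stands in for the input so that the example is self-contained.

```perl
use strict;
use warnings;

# Assumed record layout: 8-byte name, two little-endian 32-bit integers.
my $template = 'A8 l< l<';
my $reclen   = length(pack($template, '', 0, 0));

# Two sample records held in an in-memory "file".
my $data = join '', map { pack $template, @$_ }
    [ 'alice', 1, 120 ], [ 'bob', 2, 80 ];
open my $in, '<', \$data or die "cannot open in-memory file: $!";
binmode $in;

my $Totalpoints = 0;
{
    local $/ = \$reclen;    # each read returns $reclen bytes
    while (my $record = <$in>) {
        my ($name, $id, $points) = unpack $template, $record;
        $Totalpoints += $points;
    }
}
close $in;
print "total points: $Totalpoints\n";
```

What the proposal adds over this is the direct assignment into the tied hash, so that the explicit unpack and the positional field list disappear.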

Perl6 variables will have type and range checking built into them, and a tied external variable will inherit the limitations of the context it is being imported from, something you don't get with a Perl5 unpack.

more fun with structure definitions

If we accept structure definitions into our language, as something a little stronger than pack templates, these two snippets of code make sense:

	%cstruct = unpack( $StructDefinition, $Structure );

	$Structure = pack( $StructDefinition, %cstruct );
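The two lines above can be emulated in perl5 once a structure definition is represented as an ordered list of field names and pack codes. The representation and the function names unpack_struct and pack_struct are hypothetical; they only illustrate what "a little stronger than pack templates" might mean.

```perl
use strict;
use warnings;

# Hypothetical structure definition: ordered [field name, pack code] pairs.
my @StructDefinition =
    ( [ name => 'A8' ], [ id => 'l<' ], [ points => 'l<' ] );

sub struct_template { join ' ', map { $_->[1] } @{ $_[0] } }
sub struct_names    { map { $_->[0] } @{ $_[0] } }

# Packed bytes in, field name => value pairs out.
sub unpack_struct {
    my ($def, $bytes) = @_;
    my %fields;
    @fields{ struct_names($def) } = unpack struct_template($def), $bytes;
    return %fields;
}

# Field name => value pairs in, packed bytes out, in definition order.
sub pack_struct {
    my ($def, %fields) = @_;
    return pack struct_template($def), @fields{ struct_names($def) };
}

my $Structure = pack_struct(\@StructDefinition,
    name => 'alice', id => 7, points => 120);
my %cstruct = unpack_struct(\@StructDefinition, $Structure);
print "$cstruct{name}: $cstruct{points}\n";
```

Keeping the names and the pack codes in one definition is what lets the hash keys and the byte layout stay in step.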

The information carried in a structure definition, regardless of what language it is imported from, will include:

Like other aspects of perl, as much as possible will be determined and optimised away at compile time, while general constructs that can handle arbitrary inputs will remain for run time binding.

type checking

So, a compile-time compatibility table could be maintained, minimizing the amount of run-time type checking that must be done. A paranoid mode would have to be maintained for those who are worried about ultrasneaky code that violates the type-checking rules.
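As a rough illustration of the paranoid mode, a run-time range check driven by a table might look like the sketch below. Everything here is an assumption: the table contents reflect a typical platform with 8-bit char, 16-bit short and 32-bit int, and the names %range_of, $PARANOID and check_assign are invented for the example.

```perl
use strict;
use warnings;

# Assumed ranges for some C integer types on a typical platform.
# Real values would come from the imported header and the compiler.
my %range_of = (
    'signed char'    => [ -128,        127 ],
    'unsigned char'  => [ 0,           255 ],
    'short'          => [ -32768,      32767 ],
    'unsigned short' => [ 0,           65535 ],
    'int'            => [ -2147483648, 2147483647 ],
);

our $PARANOID = 1;    # hypothetical switch for run-time checking

# Return the value if it fits the named C type, die otherwise.
sub check_assign {
    my ($ctype, $value) = @_;
    return $value unless $PARANOID;

    my $range = $range_of{$ctype}
        or die "unknown C type '$ctype'\n";
    die "not an integer: '$value'\n"
        unless $value =~ /\A-?\d+\z/;

    my ($min, $max) = @$range;
    die "$value out of range for $ctype\n"
        if $value < $min || $value > $max;
    return $value;
}

print check_assign('unsigned char', 200), "\n";
print "rejected\n" unless eval { check_assign('unsigned char', 300); 1 };
```

With the switch off the check costs a single test, which is the trade the compile-time table is meant to make routine.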

Instead of package space, external spaces: use externalspace

It may be possible to extend the boundaries of a perl5 package so that they insulate the different external packages imported into the same perl program from one another, so that each perl package can hold only one imported external package.

Or, external name spaces can be created with use externalspace which operates analogously to package in perl5.

Since the wrapping of external functions into perlsubs will be done in Perl, keeping one external name space per package forces all access functions for a set of object files to share the same module.

other languages

Adding additional external* pragmata to support other sorts of objects, such as system libraries, will happen.

IMPLEMENTATION

getting this to work

Getting this to work will require knowledge of how linkers link. I believe that all platforms support compilation in parts, and the perl program is itself compiled in parts, so this extension, although challenging to describe, may be trivial to implement, or may require perl carrying around its own portable C library, which it pretty much does already.

The automatic translation between Perl scalar data and C data structures will be necessarily partial. Advanced conversions may be done on the perl side with some to-be-determined packing functions, for instance.

Compatibility with legacy code

There is enough room in the perl5 syntax for including outside code that no new words will need to be reserved, not counting the use external* pragmata as reserved words.

The stuff about structure definitions and %TiedHash = $BinaryStructure would not be possible to emulate in perl5 as I understand it.

Backwards Compatibility

It is very possible that most of the extensions can be implemented with simple wrappers in XS, for integrating the new use pragmata into perl5. And we keep XS, too, for legacy modules and use by the courageous.

New Directions

With threads and the new internal fork (whatever it's called), perl seems to be subsuming more and more of the operating system. By bringing the linker inside perl, we may be getting closer to radical compatibility modes in which platform independence is offered by setting up portable emulation environments. Instead of BSD having a working Linux emulation mode, perl could have a working Linux emulation mode that would run anywhere, as long as the program running in the environment did not try to do something outside the realm of what perl can interact with. Perl might also aid projects such as WINE in new ways.

REFERENCES

/usr/lib/perl5/5.00503/pod/perlxs.pod was looked at to reverify that XS is frighteningly complex.