[% setvar title Interfaces for linking C objects into perlsubs %]
Note: these documents may be out of date. Do not use as reference! |
To see what is currently happening visit http://www.perl6.org/
Interfaces for linking C objects into perlsubs
Maintainer: David Nicol <perl6rfc@davidnicol.com> Date: 7 Aug 2000 Mailing List: perl6-internals@perl.org Number: 61 Version: 2 Status: Developing
XS
, the set of C macros and structures, is a considerable
barrier to writing extensions of Perl in the C language. While
not replacing XS, this document specifies easier ways to
integrate a C object file (.o) into a perl program, from the
point of view of a perl user wishing to integrate a well defined
and well behaved C program into a perl program in a more direct
way than a system
call.
XS
is an excellent medium for sharing the glory of the internals
of Perl with the C programming community. It is hoped that the
interface deescribed herein will become an excellent medium for
sharing the glory of the internals of well-written C libraries
with the Perl programming community.
This document is not precisely concerned with the details of the implementation of the interfaces it specifies, beyond a general attempt to restrict itself to the possible.
The use redirect
pragma is introduced to allow the external code
to only inherit the running programs file handles that the programmer
wants it to inherit.
use externalh @h_files or die; sub callmain{ my use redirect STDIN < $InputData; my use redirect STDOUT > PREVIOUSLY_OPENED_HANDLE; $value = use externalc @object_files; }; use externalspace $name_of_the_space; tie %cstruct, externalc, $StructDefinition; %cstruct = unpack( $StructDefinition, $Structure ); $Structure = pack( $StructDefinition, %cstruct );
This document proposes a set of standard interfaces for linking compiled (or assembled) object code directly into one's Perl program.
Currently, XS is a large enough additional layer, another ocean of knowledge, that it is easier to define one's C code into well behaved utilities and invoke them in backticks than to convert a stand-alone utility into an XS module. Rewriting XS is part of the perl6 mandate. This RFC is an attempt to suggest what one alternative would look like.
main(int argc, char* argv[])
interface in CC programs accept the knowledge of their run-time command line arguments
via two arguments that are passed to the program's main
function.
In the absence of other data provided via the externalh
method,
the @_
array will be converted for use with this linkage.
use externalc
Through use of the externalc
use pragma, a set of compiled .o files
are linked into the program at compile time, and offered
the current scope's @_
, converted into scalars, as a count,values pair,
if and when the execution path hits that point.
A mechanism must be selected to support gathering the output and
input of a C program as well, something like shell redirects. The
use redirect
is a placeholder for a system to be announced by the
group working on file handle object issues.
sub DatabaseAccess{ # place to hold the output my $tooloutput; # my: redirection will only last until end of scope # are not all "use" inside things thusly scoped? # no, what if you want a subroutine that redirects? # Oh. Right. Must require the scoping operator. # use redirect: very similar to shell redirection my use redirect STDOUT > $tooloutput ; my $retval= use externalc qw( /opt/database/accesstool/src/authen.o /opt/database/accesstool/src/author.o /opt/database/accesstool/src/base64.o /opt/database/accesstool/src/choose_authen.o /opt/database/accesstool/src/daves.o /opt/database/accesstool/src/do_acct.o /opt/database/accesstool/src/encrypt.o /opt/database/accesstool/src/hash.o /opt/database/accesstool/src/hyperauth.o /opt/database/accesstool/src/programs.o /opt/database/accesstool/src/pw.o /opt/database/accesstool/src/pwlib.o /opt/database/accesstool/src/smbencrypt.o /opt/database/accesstool/src/smbutil.o ); # with "main" linkage we have "0 is success" semantics $retval and die "database access tool failed. Erno: $retval"; $tooloutput; };
Current practice in this kind of situation is to fully compile the C program, then write something like this:
sub DatabaseAccess{ `/opt/database/accesstool @_`; };
There is nothing wrong with this, and it will continue to work reliably.
use external function
pragmaInstead of importing whole programs, we would like to be able to access arbitrary functions inside arbitrary pieces of external code, and let perl keep track of a lot of the housekeeping involved in doing that with a C program.
use external mainroutine
pragmause external*
can offer more flexibility than merely
extending perl virtual forking to programs which ordinarily would
run in an operating system defined process space. (that is, instead
of calling system
).
The name of the
main routine could be changed to something other than main
through manipulation of the use external mainroutine
pragma,
use externalh qw( /opt/database/accesstool/src/accesstool.h ); use external mainroutine pwverify; use externalc;
Since we have not changed the external name space, externalc
will be linking
with the same pile of .o files from last time.
use externalh
pragmaThe use externalh
pragma reads in C header files, with the function
definitions used here, so that externalc
can perform reasonable
transformations between the perl data in @_
and the arguments specified
in them.
externalc
tie classnamesData structures in C, and other languages that use offsets into fixed-size records to differentiate size-limited fields (which are, of course, vulnerable to being overrun) are defined with language-specific data structure definition notations.
Using the perl5 tie
operator it would not be difficult to create
a class which would allow access into a particular structure, using
substr
and unpack
to extract the fields, and returning undef or
even dying (tie-time option) when attempt is made to access a field
which is not named in the structure definition.
This process could even be automated, and a C-Header-File-to-perl-tiable-class-definition package would be right at home on the CPAN scripts listing.
tie %cstruct, externalc, $StructDefinition;
would then tie %cstruct to the structure definition described
in $StructDefinition
. This could be, in the same way of
regular expressions, either a text string or a pre-parsed structure
definition in later releases of perl6.
After being so tied, %cstruct can accept value assignment from a scalar holding a defined structure and will have its fields instantly filled out:
# New in perl 6 -- by popular demand, fixed length records! $FixedInputRecordLength = length($StructDefinition); while(%cstruct = <>){ # new in perl 6: "tie" traps assignment to hash $Totalpoints += $cstruct{points} };
Perl6 variables will have type and range checking built into them, and
a tied external variable will inherit the limitations of the context it
is being imported from, something you don't get with a Perl5 unpack
.
If we accept structure definitions into our language, as something
a little stronger than pack
templates, these two snippets of
code make sense:
%cstruct = unpack( $StructDefinition, $Structure ); $Structure = pack( $StructDefinition, %cstruct );
The information carried in a structure definition, regardless of what language it is imported from, will include:
Like other aspects of perl, as much as possible will be determined and optimised away at compile time, while general constructs that can handle arbitrary inputs will remain for run time binding.
So, a compile-time compatability table could be maintained, minimizing the amount of run-type type checking that must be done. A paranoid mode would have to be maintained for those who are worried about the ultrasneaky violating type check rules.
use externalnamespace
It may be possible to extend the lines around a perl5 package to work for insulating different external packages being imported into the same perl program, so that each package can only have one importable package.
Or, external name spaces can be created with use externalspace
which operates analogously to package
in perl5.
Since the wrapping of external functions into perlsubs will be done in Perl, keeping one external name space per package forces all access functions for a set of object files to share the same module.
Adding additional external* pragmata to support other sorts of objects, such as system libraries, will happen.
Getting this to work will require knowledge of how linkers link. I believe that all platforms support compilation in parts, and the perl program is itself compiled in parts, so this extension, although challenging to describe, may be trivial to implement, or may require perl carrying around its own portable C library, which it pretty much does already.
The automatic translation between Perl scalar data and C data structures will be necessarily partial. Advanced conversions may be done on the perl side with some to-be-determined packing functions, for instance.
There is enough room in the perl5 syntaces
for including outside code that no new words will need to be
reserved, not counting the use external*
pragmata as reserved words.
The stuff about structure definitions and %TiedHash = $BinaryStructure
would not be possible to emulate in perl5 as I understand it.
It is very possible that most of the extensions can be implemented with simple wrappers in XS, for integrating the new use pragmata into perl5. And we keep XS, too, for legacy modules and use by the courageous.
With threads and the new internal fork (whatever it's called) perl seems to be subsuming more and more of the operating system. By bringing the linker inside perl, we may be getting closer to radical compatibility modes in which platform independence could be offered by setting up portable emulation environments, so that instead of BSD having a working Linux emulation mode, perl could have a working linux emulation mode which would run anywhere (as long as the program running in the environment did not try to do something outside the realm of what perl can interact with) or perl might aid projects such as WINE in new ways.
/usr/lib/perl5/5.00503/pod/perlxs.pod was looked at to reverify that XS is frighteningly complex.