Friday, May 13, 2016

Perl/XS Hello World

The number one thing in Perl I've always found confusing is writing an XS extension. I don't write them very often, and when I do, I've completely forgotten how to get started. I end up copying and pasting something I wrote for a previous project, and then I've got a bunch of extra files that I'm not sure I need. If it turns out that I do need them, I'm not even sure what they're for. So I'm writing this as much as a future reference for myself as something to help others.

For a bare-bones "hello world" XS extension, we'll need four files:

  • HelloWorld.xs (contains the XS code)
  • lib/HelloWorld.pm (the package, which ends up being the glue between the driver script and the XS code)
  • Makefile.PL (to build the module)
  • bin/driver.pl (a test driver script)

An older method for generating these files (and more) was to use the h2xs utility. I prefer not to use h2xs if possible, partly because it generates a lot more cruft than we need at the moment, and partly because doing it by hand means we know precisely what files we're creating and, more importantly, why. Having said that, experimenting with the various h2xs options later on can help solve problems in our XS stuff if we get stuck and can't find any documentation for our problem.

HelloWorld.xs will contain one function (referred to as an XSUB) that simply prints some text to stdout.


#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"

#include <stdio.h>

MODULE = HelloWorld    PACKAGE = HelloWorld::handle

void
hello()
  CODE:
    printf("Hello, world!\n");

In this simple example, we're exposing one XSUB, hello(), to the Perl world, which will be available via the HelloWorld::handle package.

The resulting C code can be generated by running xsubpp over the file. Running xsubpp will generate a ton of code that won't make sense, but it can be interesting to see just how much code is generated for such a simple module.
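To get a feel for how much C comes out, here is a rough sketch that runs the same translation from Perl. It assumes ExtUtils::ParseXS (the core module that the xsubpp script wraps) rather than the xsubpp command itself, and it writes a copy of HelloWorld.xs into a temporary directory purely for illustration; the variable names are my own.

```perl
use strict;
use warnings;

use ExtUtils::ParseXS;
use File::Spec;
use File::Temp qw(tempdir);

# A copy of the HelloWorld.xs source shown above.
my $xs_source = <<'XS';
#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"

#include <stdio.h>

MODULE = HelloWorld    PACKAGE = HelloWorld::handle

void
hello()
  CODE:
    printf("Hello, world!\n");
XS

# Work in a throwaway directory so nothing in the project is touched.
my $dir     = tempdir(CLEANUP => 1);
my $xs_file = File::Spec->catfile($dir, 'HelloWorld.xs');
my $c_file  = File::Spec->catfile($dir, 'HelloWorld.c');

open my $out, '>', $xs_file or die "Cannot write $xs_file: $!";
print {$out} $xs_source;
close $out or die "Cannot close $xs_file: $!";

# Translate the XS file into C, as xsubpp would.
ExtUtils::ParseXS->new->process_file(
  filename => $xs_file,
  output   => $c_file,
);

# Read the generated C back in and compare sizes.
open my $in, '<', $c_file or die "Cannot read $c_file: $!";
my $generated = do { local $/; <$in> };
close $in;

my $xs_lines = () = $xs_source =~ /\n/g;
my $c_lines  = () = $generated =~ /\n/g;
printf "%d lines of XS became %d lines of C\n", $xs_lines, $c_lines;
```

In a normal build there's no need to do this by hand, since the Makefile generated from Makefile.PL runs the translation step for us.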

Looking at the code, it looks like C with some extra stuff tacked on. That extra stuff is the XS stuff. Any code that precedes the MODULE directive is purely C code. In this top section, we can write whatever C functions we want, and they can be referenced below in the "XS stuff". It's important to realise that any C functions you write at the top of the file are not automatically exposed as XSUBs. To do that, you'd have to write a corresponding XSUB further down (and there's some nice shorthand for that).

A common question at this point is "what's the difference between MODULE and PACKAGE?" The MODULE names the extension as a whole (it's the name we hand to XSLoader), while the PACKAGE names the Perl package that the XSUBs following it are installed into. A single MODULE can therefore group XSUBs under several different PACKAGE names. For example, we may write a ton of HTTP XS code under the MODULE HTTP::XS but split it up into a PACKAGE named HTTP::XS::HTTP1_0, another named HTTP::XS::HTTP1_1, and some other packages to deal with TLS, proxies, authentication, etc.

The name of your Perl package doesn't need to be the same as the name of your XS module either, so, if we really wanted to, we could have the Perl package FooBar, in lib/FooBar.pm, load the HelloWorld.xs extension.

Moving on, now HelloWorld.pm needs to tell Perl how to load the extension.


package HelloWorld;

use warnings;
use strict;

our $VERSION = '0.01';

require XSLoader;
XSLoader::load('HelloWorld');

sub say_hello {
  my ($self) = @_;
  HelloWorld::handle::hello();
}

1;

As per the XSLoader docs, XSLoader is a simplified version of DynaLoader. Use XSLoader. XSLoader works well.

The say_hello() function wraps the hello() XSUB from our XS module. The benefit of adding this extra layer (as opposed to having the client code call our XSUB directly) is that the module developer (us) can add something extra, like checking argument values/types with a type system such as Type::Tiny, without changing the interface to the XS extension and without adding any unnecessary complexity to the XSUB.
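As a rough illustration of that wrapper idea, here is a minimal sketch that validates an argument in Perl before handing it on. Everything in it is hypothetical: greet() is a pure-Perl stand-in for an XSUB that takes a name (our real hello() takes no arguments), say_hello_to() is an invented wrapper, and the check uses plain Perl with Carp rather than Type::Tiny so that it runs with core modules only.

```perl
use strict;
use warnings;

{
  package HelloWorld::handle;

  # Stand-in for an XSUB. In a real module this sub would be provided
  # by the compiled extension, not written in Perl.
  sub greet {
    my ($name) = @_;
    return "Hello, $name!";
  }
}

{
  package HelloWorld;

  use Carp qw(croak);

  # Public wrapper: validate in Perl, then delegate to the stand-in XSUB.
  sub say_hello_to {
    my ($class, $name) = @_;
    croak "name must be a non-empty string"
      unless defined $name && !ref $name && length $name;
    return HelloWorld::handle::greet($name);
  }
}

package main;

# A valid call goes straight through to the stand-in XSUB.
my $greeting = HelloWorld->say_hello_to('world');
print "$greeting\n";

# An invalid call is rejected before the XSUB is ever reached.
my $ok = eval { HelloWorld->say_hello_to(undef); 1 };
print "rejected: $@" unless $ok;
```

The point of the design is that the C side can assume it only ever sees sane input, which keeps the XSUB short.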

The next step is to build the extension. We use Makefile.PL for this (or Build.PL if you prefer Module::Build).


use 5.008009;
use ExtUtils::MakeMaker;

WriteMakefile(
  NAME         => "HelloWorld",
  VERSION_FROM => "lib/HelloWorld.pm",
);

There's nothing magical going on; it's a pretty stock-standard Makefile.PL. If we wanted to link against any external libraries, or if we wanted to use g++, llvm or clang to build our extension, the docs give a few hints on how to do that.

We now have enough pieces in place to build the module.


$ perl Makefile.PL
Generating a Unix-style Makefile
Writing Makefile for HelloWorld
Writing MYMETA.yml and MYMETA.json
$ make
cp lib/HelloWorld.pm blib/lib/HelloWorld.pm

 ... removed for brevity ...

At this point a ton of extra files have been generated, in particular the blib directory. This is the "build library" directory and it's the staging area for everything that is to be tested and, finally, installed onto the machine.

Since we don't want to install the module yet, and we just want the driver script to use what's in the blib directory, the driver script needs to make sure it retrieves its definition of the HelloWorld package from this directory and not from a version that may already be installed on the machine...


#!/usr/bin/env perl

use warnings;
use strict;

use ExtUtils::testlib;
use HelloWorld;

HelloWorld->say_hello();

... and that's what ExtUtils::testlib handles for us, by manipulating @INC to include the blib/lib and blib/arch directories. Once our module is installed, using ExtUtils::testlib would be unnecessary. Apart from that, the driver script is about as simple as it gets.
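To see the effect on @INC for yourself, a small sketch like the following prints the entries that ExtUtils::testlib adds. It makes no assumptions beyond the core module itself; the variable name is my own.

```perl
use strict;
use warnings;

# Loading the module is all it takes; it adjusts @INC at compile time.
use ExtUtils::testlib;

# Pick out the entries that point into a blib directory.
my @blib_dirs = grep { m{\bblib\b} } @INC;
print "$_\n" for @blib_dirs;
```

The paths are resolved against the current working directory at the moment the module is loaded, which is why the driver script should be run from the top of the project (where blib lives) and not from inside bin.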


$ perl bin/driver.pl
Hello, world!

Huzzah!

So what's next? What if I want to write my XSUBs in C++? What if I want to interface with some other C/C++ library? How do I return a list or a hash from my XSUB? How do I pass a list or a hash into my XSUB? That all goes beyond the scope of this post, and I may follow up with another post eventually, but until then here are some useful links:

  1. XS Fun. Sawyer X's XS tutorial. I actually found this after I'd pretty much finished writing this blog post. Definitely the most useful resource to read next before really getting into the perldocs.
  2. perlxs. Includes documentation on all of the XS keywords.
  3. perlguts. The section on variables is very useful.
  4. perlapi. The Perl API. Contains a bunch of functions and macros.