Friday, May 22, 2015

First Steps with AnyEvent

Nearly 10 years ago, while I was at university, I was first introduced to event-driven programming with Twisted, an event-driven framework for Python. At the time, I was thoroughly confused by the whole thing and didn't really get it (Not allowed to block? But I want my user input now, dammit!). I persisted for a while, but ultimately gave up.

The second time around it made a lot more sense and, since then, I've been exposed to libev (via the EV Perl module) and POE. It's a powerful programming paradigm, and super worthy of being a part of any programmer's toolbox. And this is where AnyEvent can be of service.

AnyEvent isn't exactly an event framework itself, but it can sit on top of other event frameworks; it describes itself as the "DBI of event loop programming". So, because we have a ton of legacy EV code at work, I can write AnyEvent code alongside it, and AnyEvent will integrate with EV's event loop and everything will just work. This is a huge benefit and an obvious reason why adopting AnyEvent into an existing system can be a good idea.
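To make that concrete, here is a minimal sketch of an EV watcher and an AnyEvent watcher sharing one loop. It assumes the EV module is installed and loaded before AnyEvent picks a backend; the timer delays and the @fired array are purely illustrative.

```perl
use strict;
use warnings;
use EV;
use AnyEvent;

my $cv = AE::cv;
my @fired;

# "Legacy" watcher, created directly through EV.
my $ev_timer = EV::timer 0.1, 0, sub { push @fired, 'EV' };

# Newer watcher, created through AnyEvent's backend-neutral API.
my $ae_timer = AE::timer 0.2, 0, sub {
    push @fired, 'AnyEvent';
    $cv->send;
};

# Blocks until send is called; both timers are serviced while waiting.
$cv->recv;

print "Backend: ", AnyEvent::detect(), "\n";
print "Fired:   @fired\n";
```

With EV loaded, AnyEvent::detect should report AnyEvent::Impl::EV, and both callbacks run from the same loop without any glue code.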

Experimenting at Home

I wrote some code to hit my wireless router's web server a bunch of times, and it wasn't long before I came across my first issue with AnyEvent::HTTP: my router (by Huawei, so take what you will from that) only accepts the content length header in exactly the form 'Content-Length'. AnyEvent::HTTP, however, sends 'Content-length' (note the lowercase 'l'), and my router responds to that with a 404. Diving into the AnyEvent::HTTP source, I found that there's no way to override this behaviour: it lowercases every header name you pass in (plus a couple of default ones) and then applies ucfirst to each of them on output.
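The effect is easy to reproduce in plain Perl. The snippet below is only a simplified model of the behaviour described above, not AnyEvent::HTTP's actual code, and normalise_header_name is a made-up name for illustration.

```perl
use strict;
use warnings;

# Simplified model: lowercase the whole header name, then uppercase only
# its first character. Not taken from the AnyEvent::HTTP source.
sub normalise_header_name {
    my ($name) = @_;
    return ucfirst lc $name;
}

for my $name ('Content-Length', 'USER-AGENT', 'x-requested-with') {
    printf "%-16s -> %s\n", $name, normalise_header_name($name);
}
# Content-Length   -> Content-length
# USER-AGENT       -> User-agent
# x-requested-with -> X-requested-with
```

Whatever casing goes in, only the first letter of the whole name comes out capitalised, which is why 'Content-Length' can never survive the round trip.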

So what was my workaround for this? To use AnyEvent::Handle and HTTP::Request to talk HTTP. I could have used AnyEvent::Socket alongside AnyEvent::Handle, but AnyEvent::Handle can make TCP connections as well, which is handy.
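Roughly, the workaround looks like the sketch below. This is a reconstruction under assumptions rather than the exact code I ran: build_request and send_request are hypothetical helper names, the POST method, path and body are placeholders, and the host and port are taken from the command line. The response is collected as raw text and not parsed.

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Handle;
use HTTP::Request;

# Hypothetical helper: serialise a POST request to raw bytes. HTTP::Request
# writes the header as 'Content-Length', which is the form the router wants.
sub build_request {
    my ($host, $path, $body) = @_;
    my $req = HTTP::Request->new(
        POST => $path,
        [
            'Host'           => $host,
            'Connection'     => 'close',
            'Content-Length' => length $body,
        ],
        $body,
    );
    $req->protocol('HTTP/1.1');
    return $req->as_string("\015\012");    # HTTP expects CRLF line endings
}

# Hypothetical helper: connect, send the bytes, and collect everything the
# server sends back until it closes the connection.
sub send_request {
    my ($host, $port, $raw, $done) = @_;
    my $response = '';

    my $handle;
    my $finish = sub {
        my ($error) = @_;
        $handle->destroy;                  # also drops the callbacks below
        undef $handle;
        $done->($response, $error);
    };

    $handle = AnyEvent::Handle->new(
        connect  => [$host, $port],        # AnyEvent::Handle opens the socket
        on_read  => sub {
            my ($h) = @_;
            $response .= $h->{rbuf};
            $h->{rbuf} = '';
        },
        on_eof   => sub { $finish->() },
        on_error => sub {
            my ($h, $fatal, $message) = @_;
            $finish->($message);
        },
    );
    $handle->push_write($raw);             # queued until the connect completes
    return;
}

# Usage: perl router.pl HOST [PORT]
my ($host, $port) = @ARGV;
if (defined $host) {
    my $cv = AE::cv;
    my $raw = build_request($host, '/', 'example=1');
    send_request($host, $port || 80, $raw, sub {
        my ($response, $error) = @_;
        print defined($error) ? "Failed: $error\n" : $response;
        $cv->send;
    });
    $cv->recv;
}
```

Sending 'Connection: close' keeps the sketch simple: the end of the response is just the server closing the socket, so there is no need to interpret Content-Length or chunked encoding on the way back.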

That worked! Making many many many HTTP requests? AnyEvent handles it like a champ. That's great!

But the server didn't handle the number of connections very well, so there were lots of failed and closed connections. That's bad.

My first instinct was to write code to retry a connection if it failed for any reason. This worked for a while, but eventually EV (the event backend that AnyEvent had chosen to use, since it was installed) would die with a critical error. That's also bad.

In the end, the solution was to throttle the number of connections I was making, and everything started working again. Great!

How do you throttle? Something like this:


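use strict;
use warnings;
use AnyEvent;
use AnyEvent::HTTP;

# Note: with AnyEvent's default settings, info-level messages are not
# printed; set PERL_ANYEVENT_VERBOSE=7 in the environment to see them.

my $max_connections = 2;
my $cv = AE::cv;
my $num_connections = 0;

# Hold the condvar open until every URL has been read, so recv cannot
# return early if the in-flight requests finish before DATA is exhausted.
$cv->begin;

# Fires whenever DATA is readable; each call starts at most one request.
my $w; $w = AE::io \*DATA, 0, sub {
    # At the limit: skip this round and wait for a request to finish.
    return if $num_connections >= $max_connections;

    my $url = <DATA>;

    unless (defined $url) {
        AE::log info => "Finished reading DATA";
        undef $w;     # stop watching DATA
        $cv->end;     # release the hold taken above
        return;
    }

    chomp $url;
    AE::log info => "Trying $url";

    $num_connections += 1;
    $cv->begin;

    # Keep the guard in $hw so the request is not cancelled prematurely.
    my $hw; $hw = http_head $url, sub {
        my ($data, $headers) = @_;
        AE::log info => "Result from $url: $headers->{Status}";

        $num_connections -= 1;
        $cv->end;
        undef $hw;
    };
};

# Returns once DATA is exhausted and every request has called end.
$cv->recv;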

__DATA__
http://google.com
http://reddit.com
http://facebook.com
http://twitter.com
http://gmail.com
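
One caveat with the I/O-watcher approach above: a regular file such as DATA is always reported as readable, so while the limit is reached the callback keeps firing and returning straight away, which keeps the loop spinning. An alternative, sketched below, is to start the next request from the completion callback of the previous one. check_urls is a hypothetical helper name, and the URLs are taken from the command line here rather than from DATA.

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::HTTP;

# Hypothetical helper (not part of AnyEvent): send a HEAD request to every
# URL given, with at most $limit requests in flight at any one time.
# Returns a hash reference mapping each URL to its status code.
sub check_urls {
    my ($limit, @queue) = @_;
    my $cv = AE::cv;
    my %status;

    my $start_next; $start_next = sub {
        my $url = shift @queue;
        return unless defined $url;

        $cv->begin;
        http_head $url, sub {
            my ($data, $headers) = @_;
            $status{$url} = $headers->{Status};
            AE::log info => "Result from $url: $headers->{Status}";
            $start_next->();    # a slot just freed up, so refill it
            $cv->end;
        };
        return;
    };

    $cv->begin;                 # keeps recv from hanging on an empty list
    $start_next->() for 1 .. $limit;
    $cv->end;

    $cv->recv;
    undef $start_next;          # break the closure's reference to itself
    return \%status;
}

# Usage: perl check_urls.pl http://google.com http://reddit.com ...
my $status = check_urls(2, @ARGV);
print "$_: $status->{$_}\n" for sort keys %$status;
```

The first $limit calls prime the pipeline, and after that a new request only starts when an old one finishes, so nothing runs while every slot is busy.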

Closing Thoughts

I've been wanting to use AnyEvent for something work-related recently, but I want to experiment a little more before I go down that rabbit hole, or perhaps just use it in something non-critical.

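Condition variables are great! The begin/end counting in the example above is what lets recv return only once every request has finished.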

I've heard that IO::Async (with futures) is another great way to go as far as Perl asynchronous programming frameworks go, so I might experiment with that some time soon, but for now I'll stick with AnyEvent.