Friday, May 4, 2018

tree

Cherie and I recently moved into our first home, and outside of the excitement of buying furniture, fixing broken stuff, unpacking boxes, and the rest of the mountain of work that goes into moving houses, I found myself on a Monday afternoon with a spare couple of hours to work on my NES emulator. But I ran into a small problem; on my personal MacBook, I don't have the tree command installed, and I didn't have an internet connection at the new house yet. So I decided to write it.

https://github.com/ltriant/tree

It was a fun way to burn a couple of hours, I got to have some fun writing some C for the first time in a while, and I was reintroduced to writing makefiles (I forgot about all the handy implicit rules). It's incredibly simple code for an incredibly simple problem, and it's not really portable at all, but it works on my MacBook.

Friday, April 27, 2018

Perl for DevOps: Mojo::UserAgent

There's no way anyone "doing devops" can work without needing to interact with products/systems remotely, and these days pretty much everything offers an API and that API is more than likely available over HTTP. The Mojolicious distribution provides a ton of useful modules for doing almost anything HTTP-related; both client- and server-side.

This post is going to mostly focus on writing client code with Mojo::UserAgent, and some utility modules that Mojolicious ships with. I will also touch on using Mojolicious::Lite for writing web interfaces and services, but deployment will be deferred until the next post, which will focus on Plack/PSGI, the various Plack servers available, and some of the web frameworks that support PSGI.

Mojo::UserAgent vs LWP::UserAgent

For the longest time, LWP::UserAgent has been the de-facto standard HTTP client library that the Perl community uses. But I've written before about why I prefer Mojo::UserAgent over LWP::UserAgent; it has both blocking and non-blocking interfaces, so you can write both kinds of applications and only need to know one API, should you get into writing non-blocking code in your devops journey.

But there's more.

Prototype with mojo

The mojo command is provided by the Mojolicious package, and is a great tool to query web services and pages. I wouldn't call it a curl or wget replacement (that's not its purpose), but it's a great tool for extracting data that is embedded in a response, for use in shell scripts, and for rapid prototyping before converting to Mojo::UserAgent in a Perl script.

As an example, if you wanted to use the MetaCPAN API to get the latest version of a package:


$ mojo get http://fastapi.metacpan.org/v1/package/Mojolicious
{
   "version" : "7.74",
   "module_name" : "Mojolicious",
   "dist_version" : "7.74",
   "file" : "S/SR/SRI/Mojolicious-7.74.tar.gz",
   "distribution" : "Mojolicious",
   "author" : "SRI"
}

This shows the JSON output of the API request. Mojolicious comes bundled with its own JSON decoding/encoding module, Mojo::JSON, which can be used directly in any application you want - perhaps as a replacement for the other JSON modules, if you so desired - but it's also integrated into Mojo::UserAgent, for decoding and easily extracting data from JSON responses.

The version is what I'm after. We can easily grab that with another parameter, utilising some simple notation.


$ mojo get http://fastapi.metacpan.org/v1/package/Mojolicious /version
7.74

But what if there is no nice JSON API, and we have to extract the same data from a web page? Well, we can do that too:


$ mojo get https://metacpan.org/pod/Mojolicious 'span[itemprop="softwareVersion"]' text
7.74

This fetches the Mojolicious documentation on MetaCPAN and looks for a span tag with an itemprop attribute value of softwareVersion and displays the text in the tag. In this case, we're pretty lucky that the MetaCPAN page gives us a friendly way to locate this data, but more complex queries can be used for less mojo-friendly websites.

The beauty of the mojo tool is that once you've prototyped how to extract the information that you want, you can either leave it in a bash script, or you can port the code to use the Mojo::UserAgent module and use it as part of a larger application.


#!/usr/bin/env perl

use v5.10;
use warnings;
use strict;

use Mojo::UserAgent;

my $ua = Mojo::UserAgent->new;
my $tx = $ua->get('http://fastapi.metacpan.org/v1/package/Mojolicious');

if ($tx->res->is_success) {
 say $tx->res->json->{version};
}

This is just a part of what the Mojolicious distribution has to offer; there's also an event loop and a promises implementation for writing non-blocking client and server code, and a whole lot more. Mojolicious wants to be a self-contained installation with as few external dependencies as possible, which makes it stable, and resilient to issues in the greater CPAN package ecosystem. Check out the other packages it provides.

Next in the series (whenever I get to it) I'll go through Mojolicious::Lite and Plack/PSGI, for when the time comes to write and deploy web sites and services.

Friday, March 9, 2018

NES Emulator, Part 1: I have no idea what I'm doing

For a very long time I've wanted to have a go at writing an emulator, and for one reason or another I never did it, but a few weeks ago I decided to pull the trigger and start writing a NES emulator. And I've chosen to write it in Rust, as I have a goal this year to achieve a moderate-to-high level of competency in the language.

This is the first of a few blog posts to briefly document my journey as someone who has never written an emulator before.

I have very limited technical knowledge in this space, so maybe this will be useful to someone in the future.

The Beginning

The NesDev wiki is full of useful information, and is easily the most useful resource I've found so far.

On the wiki there are a ton of links and pages describing the basic architecture of the NES; the CPU is a 6502 processor, and there's a PPU for rendering the graphics, and an APU for handling the audio. As far as I can tell (and I'll correct this in future posts if I need to), the CPU simply writes to certain memory addresses that the PPU and APU then read from to do stuff. And in a single emulator cycle, based on the clock rates of each component, X number of CPU cycles are executed, Y number of PPU cycles are executed, and Z number of APU cycles are executed.

The first place I decided to dive in, after reading various threads on the EmuDev subreddit, was with the CPU implementation. I have zero experience with the 6502 beyond reading about the early days of Apple and Steve Wozniak's stories from the Homebrew Computer Club, but it's a thoroughly documented processor and the NesDev wiki has plenty of resources for it. It's a pretty basic CPU to understand; there are three standard registers, a few status flags, a program counter, and a stack pointer. Nothing new if you've ever written any assembly, or had to debug stuff in gdb before.

Initially, I started from the bottom up, modelling the memory map and writing the CPU's interactions with the memory. However, because of the different things that the memory maps to (controllers, joysticks, the game ROM data, etc.), I realised that I'd have to write a mountain of code before I'd even know if the design I was rolling with would even work, something that can be fairly unforgiving to redo in a Rust project because of how strict the compiler is. So I changed track a little by going top down instead, and started writing a very simple and incomplete parser for the iNES file format, which is the format that most ROMs are available in. There's a wiki page for that too.

I then grabbed the nestest ROM from the emulator test page, and starting implementing new instructions and addressing modes every time I hit something my emulator didn't know how to do.

The disassembly output that my emulator prints isn't exactly the same as what the nestest log output is, and I'm not sure how worried I should be about that yet. Most posts that I find on the NesDev forums suggest that being mostly correct is good enough at the start, and to just use it as a guide. But it makes me feel all kinds of uncomfortable.

At this point in time, I'm still implementing instructions for the CPU.

It's alive! (sort of)

Addressing Modes

Addressing modes (which dictate how an opcode/instruction gets its operands/arguments) confused the hell out of me at first, but I understand it a lot better after following through the addressing modes page on the NES Hacker wiki.

Learn from Others

The last thing that has been super useful is reading the source of other emulators, and any blog posts people may have written.

Michael Fogleman's article about writing a NES emulator was great source of lower-level information to start with, and the code is super easy to follow, even if you're not overly familiar with Go.

This Rust emulator has also been useful, when I'm trying to figure out how best to model certain things in Rust.

Friday, February 9, 2018

Project Euler problem 67

I've used Project Euler a number of times in the past as a learning platform; whether it's to keep some skills sharp or to learn something new. A few months ago, I revisited some old problems when I was learning the basics of Rust.

Problem 67 is simple as its core; in a pyramid of numbers, find the maximum path sum of numbers from the top to the bottom. The example pyramid is:

      3
    7   4
  2   4   6
8   5   9   3

The earlier problem (problem 18) is a much simpler version of the problem, where the input is small and allows for very naive recursive solution:

fn max_sum_recursion_dumb(v: &Vec<Vec<u32>>, i: usize, j: usize) -> u32 {
    if i >= v.len() {
        return 0;
    }

    let left = max_sum_recursion_dumb(&v, i+1, j);
    let right = max_sum_recursion_dumb(&v, i+1, j+1);

    v[i][j] + left.max(right)
}

The basic idea is that to get the total sum of the current node, get the total sum of the left child node and the total sum of the right child node, and return the sum that is larger, with the current node's value added to it.

The function can be called easily, assuming the pyramid is represented as a two-dimensional vector and we start at the top (zero indexes):

println!("{}", max_sum_recursion_dumb(&input_data, 0, 0));

The problem with this approach is there is a ton of re-computation happening, which only gets worse as the input gets larger (as in problem 67).

If sticking with this solution, the technique to get around recomputing for the same input over and over again is to memoize (or cache) results of previous calls. This results in a top-down dynamic programming solution:

fn max_sum_recursion(v: &Vec<Vec<u32>>,
                     i: usize,
                     j: usize,
                     mut cache: &mut HashMap<(usize, usize), u32>) -> u32 {

    if i >= v.len() {
        return 0;
    }

    if let Some(rv) = cache.get(&(i,j)) {
        return *rv;
    }

    let left = max_sum_recursion(&v, i+1, j, &mut cache);
    let right = max_sum_recursion(&v, i+1, j+1, &mut cache);

    let rv = v[i][j] + left.max(right);
    cache.insert((i,j), rv);
    return rv;
}

This can then be called with an extra parameter for the cache:

let mut cache = HashMap::new();
println!("{}", max_sum_recursion(&input_data, 0, 0, &mut cache));

But to be honest, although the solution worked well, I don't like the look of the code at all. In the original Perl version of this solution that I'd written years ago, I just used the Memoize module, so the code remained very much like the first function, but with an extra line outside of the function to enable memoization.

But in this Rust version of the function, I have to pass around mutable references, which I don't want to do unless absolutely necessary, and I didn't feel like it was absolutely necessary.

Looking at the core of how the function finds the maximum sum - once you unravel the recursion - it actually starts at the bottom of the pyramid, compares pairs of numbers for the larger number, adds it to the parent number on the row above, and repeats, until we get to the top. So what if, instead of starting at the top, the code just started from the bottom and bubbled up to the top?

Using the example in the problem description, the pyramid starts like:

      3
    7   4
  2   4   6
8   5   9   3

Then we take the bottom row (8, 5, 9, 3), get the largest number of each pair, and add it to the number of the parent node in the row above. So, after this step, we end up with the following pyramid:

      3
    7   4
  10  13  15
8   5   9   3

And then we repeat for the row above:

      3
    20  19
  10  13  15
8   5   9   3

And then once more for the top row:

      23
    20  19
  10  13  15
8   5   9   3

Finally, the top of the pyramid contains the maximum path sum, 23.

fn max_sum_norecursion(v: &Vec<Vec<u32>>) -> u32 {
    let mut sums = v[v.len() - 1].clone();

    for i in (1 .. v.len()).rev() {
        for j in 1 .. v[i].len() {
            let max = sums[j].max(sums[j-1]);
            sums[j-1] = max + v[i-1][j-1];
        }
        sums.pop();
    }

    sums[0]
}

This bottom-up dynamic programming approach has resulted in - I think - better looking Rust code, without the need for recursion, and without the need to hand around a mutable reference for caching.

It was fun to come back and solve this problem in a different way, even if the original reason I even revisited it was just to learn some basic Rust.

Friday, January 19, 2018

Perl for DevOps: IO::All

A stupidly common task to perform is file and directory IO. In the Perl world, the IO::All module has wrapped up nearly every common IO task I can think of into a very expressive interface. With the many options available to perform the same task, it can fit into scripts in many different ways.

For example - as a sort of "hello world" of file IO - if I wanted to read the contents of a file, do some basic processing and then output to a new file, here is a very simple solution:

use IO::All;

my $contents < io("foo.txt");
$contents =~ s{foo}{bar}g;
$contents > io("bar.txt");

Or, if you're not a fan of operator overloading, that's cool too! Here's the same script, with a more explicit usage:

use IO::All;

my $contents = io("foo.txt")->slurp;
$contents =~ s{foo}{bar}g;
io("bar.txt")->print($contents);

And there are a bunch more options to do similar things in the documentation.

What about reading a file backwards? This is sometimes useful to look for the last instance of an event in a log file:

use v5.10;
use IO::All;

my $io = io("/var/log/maillog");
$io->backwards;
while (my $line = $io->getline) {
  if ($line =~ m{ dsn = 4\.5\.0 }xms) {
    say "last success: $line";
    last;
  }
}

What About Directories?

Perhaps we wanted to traverse /var/log recursively and list out anything that's gzip compressed:

use v5.10;
use IO::All;

my @files = io('/var/log')->deep->all_files;
foreach my $file (grep { $_->ext eq 'gz' } @files) {
  say $file->name;
}

Something I've had to do on more than one occasion - when bringing up and initialising a new VM - is create a directory structure and all of the parent directories with it:

use IO::All;

foreach my $a ('0' .. '9', 'a' .. 'f') {
  foreach my $b ('0' .. '9', 'a' .. 'f')
    io->dir("/var/my-application/tmp/$a/$b")->mkpath;
  }
}

So What?

The tendency is to just use bash scripts for a lot of these tasks. But bash scripts become unwieldy when the scope of a tiny script creeps, and it now needs to compress files, encrypt data, maybe upload stuff to S3, logging everything it does along the way to a centralised location, perhaps logging all errors to a Slack channel, or maybe just sending a notification to a Slack channel when the job is done. Perl is more than ready to handle those tasks.

I'll tackle some of these modules and tasks in more detail in future posts.

Worthy Mention: Path::Tiny

Although more can be accomplished with IO::All, the Path::Tiny module is also worth knowing about. There have been certain times where I've needed more specific control that IO::All doesn't provide. In those cases, Path::Tiny usually does what I want, so it's a handy backup tool and worth knowing about.

Between these two modules, pretty much all filesystem IO needs should be taken care of.

So Much More

I'd encourage anyone to look through the docs. There are tons of examples for all kinds of tasks that I haven't touched on at all in this post, even as far as being able to send emails via a plugin.

Unless - or until - you need to use the low-level IO functions for fine-grained control and/or better performance for a critical piece of functionality, the IO::All module (and Path::Tiny as its companion) should be more than enough almost all of the time.

Friday, January 12, 2018

Handy Skills: Start a Campfire

It's been a while since I've done one of these! I thought this was a good one to post, having just returned from a camping trip last week and already dying to go on the next one.

Being able to start a fire is a super handy skill to have, more so when camping, but even when at someone's house with a fire pit of some sort. Most people who've never really started one think it's dead simple, only to fail miserably when they have to actually do it on their own. I know I did, at least.

Even if you've got fire starters and a deep fire pit to protect from any wind, if you don't do it right, it can take a lot more effort to get the fire going than is necessary. Ideally, I wanted to get to a point where I could start a fire with just a lighter (or matches) and not have to rely on anything else that I may either forget to take camping, or that I may run out of. When you go camping with friends who smoke, you're never short on lighters :)

The video that gave me all of the info that I needed was The Bird Nest and Tinder Bundle from Dave Canterbury (below). This is a great video about creating a bird nest to start a fire. From this video, you can skip the use of the char cloth to ignite the bundle and just use a Bic lighter, and I don't necessarily create my bird nest as big as in the video, but the practice of gathering or creating a bunch of fine, dry material for the bird nest - that'll catch fire quickly and easily - and gathering small sticks for kindling to then build on top of with larger logs is probably 95% of what I needed to reliably and consistently get a fire going. Another quick example can be seen in the Basic Camp Overnighter series too.

The Upside Down Fire is another good idea for getting a campfire going without needing much attention to maintain (while you go do other things), and I've started these a couple of times with great success. The biggest benefit to using this style of fire in winter is that the fire starts on a dry piece of wood, rather than on the ground, which may be wet or damp, and the above method of starting a fire with a bird nest is still relevant.