
CasperJS: Learn, and become mighty.

Time and time again I find myself needing to automate some time-consuming task in a cumbersome web-UI. Time and time again, I dig out what has become one of my most trusted automation tools: CasperJS.


Now, if there is one thing that separates a good SysAdmin from a bad one, it’s ambitious laziness. Minimizing (or eliminating) the time necessary to do repetitive tasks is the hallmark of a good SA, and nothing fits the ‘automate web shit’ rung in the tool belt quite as well as CasperJS.

Headless WebKit with JavaScript interaction

I first came across CasperJS back in my days of heavy web-scraping. At the time I had developed a reliance on a horrible amalgamation of cURL and Regex to accomplish what I needed, but as websites got more sophisticated (and as their maintainers developed a diminishing tolerance for my shit), it became clear that I needed something better.

CasperJS is effectively an otherwise full, headless browser with an external interface for interacting with the pages via JavaScript – imagine GreaseMonkey without the browser. With CasperJS, you can automate endlessly complex interactions with dynamic web content directly from a command line.

I won’t bore you with the beginning parts (their website has fantastic documentation), but here are a couple of idiosyncrasies, as well as a few cool tricks I use often when dealing with web-UI automation.

Problem: Variables generally don’t persist
I discovered this almost immediately. Often, I’ll need to retrieve complex datasets from the page in a way that would typically necessitate a for() loop (think tables full of data). The problem is, you can only return content from the page as a result of an evaluate()’d function, and variables defined in evaluate() scope don’t inherently persist on the page. With that being the case, how can I get complex datasets out of the page, without writing dozens of evaluate() statements?

Solution: $.map() and JSON.stringify()
Imagine that we want to retrieve a set of key:value pairs from some HTML-formatted data on the page. For the sake of argument, we’ll say it’s names and phone numbers. We inspect the relevant HTML, and find this:

   <tr>
      <td class="name">John Smith</td>
      <td class="number">310-555-2811</td>
   </tr>
   <tr>
      <td class="name">Jason Rogers</td>
      <td class="number">404-555-8437</td>
   </tr>

Well, this is pretty handy. There’s one name:number pair within each <tr>, and the values are classed with ‘name’ and ‘number’ respectively. Check this out:

var casper = require('casper').create();
casper.start('whatever_url.html', function() {
   console.log(this.evaluate(function() {
      // Runs inside the page: build one plain object per table row
      return JSON.stringify($.map($('tr'), function(e) {
         var name = $(e).find('td.name').html();
         var number = $(e).find('td.number').html();
         return { name: name, number: number };
      }));
   }));
});
casper.run();

This might look complex, but it’s mostly just the wrapping necessary in order for CasperJS to pass it into the page (which, for this to work, needs to have jQuery loaded already). The important part is the $.map() call: it loops over each element matched by $('tr'), passing it as ‘e’ into the supplied function, and collects the returned objects into an array. Boom. With a single, self-regulating command, we have just pulled the following result:

[{"name": "John Smith", "number": "310-555-2811"},
 {"name": "Jason Rogers", "number": "404-555-8437"}]

Damn. Doesn’t get much easier than that, does it? This resultset is now free from the confines of its original HTML and is universally transportable as JSON. Glorious.
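
Once that string is back on the CasperJS side of the fence, turning it into something you can query takes only a few lines. Here’s a rough sketch in plain JavaScript (it needs neither CasperJS nor jQuery to run). The json variable is just a stand-in for whatever this.evaluate() handed back, and toPhoneBook is a name I made up for illustration:

```javascript
// Stand-in for the string returned by this.evaluate() in the script above.
var json = '[{"name":"John Smith","number":"310-555-2811"},' +
           '{"name":"Jason Rogers","number":"404-555-8437"}]';

// Hypothetical helper: turn the array of {name, number} rows into a
// lookup object keyed by name.
function toPhoneBook(jsonText) {
   var book = {};
   JSON.parse(jsonText).forEach(function(row) {
      book[row.name] = row.number;
   });
   return book;
}

var book = toPhoneBook(json);
console.log(book['Jason Rogers']);   // prints 404-555-8437
```

Keying by name is only one option; the same parsed array could just as easily be written to disk or piped to another tool.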


Problem: Logging in, even with --cookies-file defined, doesn’t seem to persist across multiple scripts!
Yeah, this is a bummer. However, I’ve found a workaround.

Solution: ‘fs’
Yup. Those two letters hold the key to your salvation. First off, let’s create a cookie file (casperjs_cookies.txt) containing nothing but an empty JSON array, [], so the script has something valid to parse on its first run.

Now, we’ll drop the following code segment into the beginning of our script:

var fs = require('fs');
Array.prototype.forEach.call(JSON.parse(fs.read('casperjs_cookies.txt')), function(x){ phantom.addCookie(x); });

This will read the file into PhantomJS’s cookies, making them available to the page. However, since we created the file empty, we need to make sure that the cookies get written at an appropriate time. Inject this into your script after verifying a login:

fs.write('casperjs_cookies.txt', JSON.stringify(phantom.cookies), "w");

Once again: Boom. Now when you run this script in the future, it will reuse the cookies from the last script and retain any session states you created in the browser.
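
If you’d rather not create the cookie file by hand, the load and save steps can be wrapped up so that a missing file is simply treated as “no cookies yet”. What follows is a minimal sketch, not the one true way: it assumes PhantomJS’s fs module (fs.exists, fs.read, fs.write) and the phantom object (phantom.addCookie, phantom.cookies). The names loadCookies and saveCookies are hypothetical, and fs and phantom are passed in as arguments so the functions can be tried out with fakes:

```javascript
// Hypothetical helper: feed previously saved cookies back into PhantomJS.
// Returns the number of cookies found in the file (0 if there is no file).
function loadCookies(fs, phantom, path) {
   if (!fs.exists(path)) {
      return 0;
   }
   var saved = JSON.parse(fs.read(path) || '[]');
   if (!Array.isArray(saved)) {
      saved = [];   // tolerate a file holding some other JSON value
   }
   saved.forEach(function(cookie) {
      phantom.addCookie(cookie);
   });
   return saved.length;
}

// Hypothetical helper: dump the current cookie jar to disk as JSON.
function saveCookies(fs, phantom, path) {
   fs.write(path, JSON.stringify(phantom.cookies), 'w');
}
```

In a CasperJS script you would call loadCookies(require('fs'), phantom, 'casperjs_cookies.txt') near the top, and saveCookies() with the same arguments once the login has been verified.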

Injecting variables into anonymous functions in PHP

I was coding up some tools yesterday, and decided to write a function in PHP that sorted an array numerically by a provided subkey name.

Assume an array with the following structure:

$sale_items = array(
   "first" => array(
      "value" => "100",
      "color" => "blue"
   ),
   "second" => array(
      "value" => "105",
      "color" => "red"
   )
);

uasort can easily handle this if you already know which subkey you intend to sort by:

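// uasort() sorts $sale_items in place and returns a bool, not the sorted array
uasort($sale_items, function($a, $b) {
   return $a['value'] <=> $b['value'];   // <0, 0 or >0; <=> needs PHP 7+
});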

In this case though, I wanted to make this function reusable. Unfortunately the uasort function passes the parameters to the provided function on its own, so how do we inject a variable into that anonymous function? Turns out that it’s insanely easy: a PHP closure can pull variables in from the enclosing scope with the use keyword. Check it:

uasort($sale_items, function($a, $b) use ($field) {
   return $a[$field] <=> $b[$field];
});