Thursday, December 16, 2010

Apache + Passenger + Sinatra

OK, this seems like a pretty normal situation, and like most normal situations it apparently never occurs in nature (judging by the available docs).

An existing Apache webserver is to have a new webapp added to it, in a subdirectory of the DocumentRoot. Sinatra is the framework to be used, and Passenger is going to route requests from Apache to Sinatra without going through any mod_proxy or mod_rewrite business.

Assume the following: Apache2, an Ubuntu system, and all of the relevant gems have been apt-get installed. The web server root directory is /var/www, and the webapp will be in /var/www/timon.

First, update the Apache config file (/etc/apache2/apache2.conf):

<VirtualHost *:80>
  ServerName Apemantus
  DocumentRoot /var/www

  <Directory />
    Options -Indexes
    AllowOverride None
  </Directory>

  <Directory /var/www/>
    AllowOverride AuthConfig
    Order allow,deny
    Allow from all
  </Directory>

  SetEnv RUBYLIB '/var/www/timon'
  PassengerEnabled on
  PassengerAppRoot /var/www/timon
  RackBaseURI /timon
</VirtualHost>


The last four directives (SetEnv through RackBaseURI) are what must be added.
  • SetEnv : This just allows the code in subdirectories of the webapp to be easily required.
  • PassengerEnabled : Seems fairly obvious.
  • PassengerAppRoot : The full filesystem path to the webapp directory.
  • RackBaseURI : The relative (to DocumentRoot) path to the webapp directory.
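
To make the RackBaseURI bullet concrete: a Rack application mounted under a base URI normally sees that prefix in SCRIPT_NAME and the remainder of the path in PATH_INFO, which is why the Sinatra route later in this post is written as get '/' rather than get '/timon'. What follows is a minimal sketch of that split in plain Ruby. split_request_path is a hypothetical helper written for illustration; it is not part of Passenger, Rack or Sinatra, and it reports an empty remainder as '/' on the assumption that the framework treats the two alike.

```ruby
# Hypothetical helper (not a Passenger or Rack API): separates a base URI
# such as /timon from the rest of the request path.
def split_request_path(base_uri, request_path)
  base = base_uri.chomp('/')
  if request_path == base || request_path.start_with?(base + '/')
    rest = request_path[base.length..-1]
    # Assumption: an empty remainder is handled like '/'
    [base, rest.empty? ? '/' : rest]
  else
    # Not under the base URI: the webapp never sees this request.
    nil
  end
end

p split_request_path('/timon', '/timon')
p split_request_path('/timon', '/timon/status')
p split_request_path('/timon', '/index.html')
```

The first element of each pair plays the role of SCRIPT_NAME and the second the role of PATH_INFO; a request outside /timon yields nil because Apache serves it directly.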

Next, verify that the Passenger options (/etc/apache2/mods-enabled/passenger.conf) are correct:

<IfModule mod_passenger.c>

  PassengerRoot /usr
  PassengerRuby /usr/bin/ruby
  PassengerMaxPoolSize 10
  PassengerDefaultUser www-data

</IfModule>

These options are pretty straightforward. The only thing to note is that PassengerRuby can be set to a specific version of Ruby, e.g. jruby or ruby1.9.

Now, create an empty Rack-friendly directory structure for the webapp:

bash# cd /var/www
bash# mkdir timon
bash# mkdir timon/public
bash# mkdir timon/tmp

The use of public for static pages and tmp for restart.txt is well-documented.
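
For completeness, the restart.txt convention can be driven from Ruby as well as from the shell: Passenger reloads the app when the timestamp on tmp/restart.txt changes. The sketch below assumes only the directory layout created above; request_restart is a name made up for this example.

```ruby
require 'fileutils'

# Ask Passenger to reload the app by updating tmp/restart.txt.
# request_restart is a hypothetical helper; app_root would be
# /var/www/timon for the layout used in this post.
def request_restart(app_root)
  tmp_dir = File.join(app_root, 'tmp')
  FileUtils.mkdir_p(tmp_dir)                 # no-op if tmp already exists
  restart_file = File.join(tmp_dir, 'restart.txt')
  FileUtils.touch(restart_file)              # create, or bump the mtime
  restart_file
end
```

Calling request_restart('/var/www/timon') after deploying new code has the same effect as touching the file by hand.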

Next, a Rack config file must be provided. This will be named (/var/www/timon/ and will have the following contents:

#!/usr/bin/env ruby
require 'rubygems'
require 'sinatra'

# Disable Sinatra's default Webrick instance
set :run, false

# Include timon.rb, the main webapp script
require 'timon'
run Sinatra::Application

Finally, the app itself must be provided. This will be named timon.rb (/var/www/timon/timon.rb) and will have the following contents:

#!/usr/bin/env ruby

require 'rubygems'
require 'sinatra'

get '/' do
end

# Local 404 handler
not_found do
  "Timon NotFound exception"
end

# Local error handler
error do
  "Timon Error: " + env['sinatra.error'].message
end

Debugging can be made a bit more straightforward by adding some basic logging code to

#!/usr/bin/env ruby
require 'rubygems'
require 'sinatra'

# Disable Sinatra's default Webrick instance
set :run, false

# Local logging: send the app's output to log/sinatra.log
require 'fileutils'
FileUtils.mkdir_p 'log'
log = File.new('log/sinatra.log', 'a')
$stdout.reopen(log)
$stderr.reopen(log)

# Include timon.rb, the main webapp script
require 'timon'
run Sinatra::Application

That's it.

Nothing much to it, really, but the lack of RackBaseURI in the relevant examples really makes debugging this kind of thing difficult.

Tuesday, November 30, 2010

Ruby CGI and Javascript

After a brief perusal of the Ruby CGI module documentation, it doesn't seem like there is a good way to generate Javascript from within it.

Due to the method_missing way that the CGI module handles HTML tags, however, it turns out to be quite simple: invoke cgi.script, passing all tag parameters as a hash, and pass the Javascript code as a string in the block:

  cgi.out {
    cgi.html {
      cgi.head {
        cgi.title { '' } +
        # Javascript library to include
        cgi.script( 'src' => 'dygraph-combined.js',
                    'type' => 'text/javascript') { '' }
      } +
      cgi.body {
        cgi.div( 'id' => 'graphdiv' ) +
        cgi.script( 'type' => 'text/javascript') {
          # Javascript to execute
          'g = new Dygraph(
                  document.getElementById("graphdiv"),
                  "Date,Temperature\n" +
                  "2008-05-07,75\n" +
                  "2008-05-08,70\n"
               );'
        }
      }
    }
  }

Monday, October 18, 2010

internationalization regex

Just a quick vim regex that's useful when adding internationalization to a file:


Wraps all quoted strings in a gettext() call, preserving inner (and escaped) quotes. Doing this in vim instead of sed allows a cursory review of the changes in case any of them *shouldn't* be changed (e.g. log strings).
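
The same idea can be sketched in Ruby for anyone who would rather script the change. This is not the vim regex itself, only an analogue under the assumption that the strings of interest are double-quoted and may contain backslash escapes; QUOTED and wrap_in_gettext are names invented for the example.

```ruby
# Matches a double-quoted string, allowing escaped characters
# (such as \" or \n) inside it.
QUOTED = /"(?:[^"\\]|\\.)*"/

# Hypothetical helper: wrap every double-quoted string on a line
# in a gettext() call, leaving the rest of the line untouched.
def wrap_in_gettext(line)
  line.gsub(QUOTED) { |str| "gettext(#{str})" }
end

puts wrap_in_gettext('printf("Hello, \"world\"\n");')
```

Unlike the interactive vim approach, this rewrites every match blindly (including strings that are already wrapped), so the output still deserves a review.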

Monday, August 16, 2010


I was astonished to learn how accomplished both of my children are in programming. It is a skill that an entire generation of adolescents is learning underground, much the way we used to pick up dirty words.

-- S. Milgram, 1982

Sunday, August 15, 2010

Controlling a command-line interpreter in Ruby

There seem to be a lot of posts to mailing lists, forums, etc in regards to controlling a child process in Ruby via STDIN and STDOUT. Most of the discussion (with this notable exception) ends with "use open3!" or "use open4!".

Anyone who has ever tried to control (i.e., tried to send more than one command to) an interpreter using either of these will recognise their shortcomings immediately: the child process STDOUT cannot be read until STDIN has been closed.

Fortunately, the pty module (helpfully mentioned in the notable exception) allows the interpreter to be controlled properly as long as the OS supports pseudo-terminals -- that is, as long as it is a UNIX-like OS (i.e. Linux, OS X, *BSD... basically every desktop/server OS but Windows).

It is a bit tricky to get working well, due to the problem of not knowing for sure whether the child process is preparing more data to write to STDOUT, or is waiting for another command on STDIN.

The following implementation wraps the praat speech analysis software. It uses the praat prompt ('Praat > ') to determine when the child process is ready for more input (i.e., when it can stop reading from STDOUT).

#!/usr/bin/env ruby

require 'pty'

module Praat

  class Interpreter
    attr_reader :stdout, :stdin, :pid

    PROMPT='Praat > '

    def initialize(program='praat')
      @stdout, @stdin, @pid = PTY.spawn( program, '-' )

      # Read initial prompt from pipe
      read_until_prompt

      if block_given?
        yield self
        @stdin = @stdout = @pid = nil
      end
    end

    def read_until_prompt
      # Two separate buffers: read_nonblock fills buf in place, so
      # outbuf and buf must not be the same String object.
      outbuf = String.new
      buf = String.new

      # Read from child STDOUT until > 0 bytes have been read
      # (i.e. wait for child process to finish reading input)
      while buf.length == 0
        begin
          IO.select([@stdout])  # block until child process is ready
          @stdout.read_nonblock( 1024, buf )
        rescue StandardError => e
          buf = String.new      # READ failure! Try again.
        end
      end

      # Read from child STDOUT until 0 bytes are read or a line ending in a
      # prompt (i.e. next input prompt) was encountered.
      while buf.length > 0
        outbuf << buf

        # complete read if next input prompt is encountered
        break if outbuf =~ /#{PROMPT}$/

        begin
          buf = String.new
          @stdout.read_nonblock( 1024, buf )
        rescue Errno::EAGAIN => e
          IO.select([@stdout])  # block until child process is ready
          retry                 # then read again
        rescue StandardError => e
          buf = String.new      # READ failure. Exit loop.
        end
      end

      # Return output of interpreter as an array of lines
      return outbuf.split("\n").each { |x| x.chomp! }
    end

    # Send a command to the interpreter. Returns an array of the output.
    # If include_prompts is true, lines beginning with a prompt will NOT be
    # stripped from the output.
    def send( command, include_prompts=false )
      @stdin.write(command + "\n")

      outbuf = read_until_prompt

      # Ignore all ECHOed lines before the first (input) prompt
      first_prompt = outbuf.find_index { |x| x =~ /^#{PROMPT}/ } || 0
      result = outbuf.slice(first_prompt, outbuf.length-first_prompt)

      # Return full results, or the results with prompt lines removed
      result = result.select {|x| x !~ /^#{PROMPT}/} if not include_prompts

      yield result if block_given?

      return result
    end

  end

end

if __FILE__ == $0

  puts 'Testing block implementation'
  Praat::Interpreter.new { |p| puts p.send('echo BLOCK TEST') }
  Praat::Interpreter.new do |p|
    p.send('echo Full Output', true).each { |line| puts "\t" + line }
  end

  puts 'Testing object implementation'
  praat = Praat::Interpreter.new

  lines = []
  lines.concat praat.send('echo OBJ TEST 1' )
  lines.concat praat.send('echo OBJ TEST 2' )
  lines.concat praat.send('echo OBJ TEST 3' )

  puts lines.inspect
end
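
The prompt test at the heart of this approach can be tried without praat installed. The sketch below feeds a list of made-up chunks (standing in for what successive read_nonblock calls might return) through the same "stop once the buffer ends in the prompt" rule. collect_until_prompt and the sample chunks are hypothetical and exist only for this illustration.

```ruby
PROMPT = 'Praat > '

# Accumulate chunks until the buffer ends with the prompt.
# Returns [output lines without prompt lines, number of chunks consumed].
def collect_until_prompt(chunks, prompt = PROMPT)
  buffer = String.new
  consumed = 0
  chunks.each do |chunk|
    buffer << chunk
    consumed += 1
    # The child is waiting for input again once the prompt is the last thing seen
    break if buffer.end_with?(prompt)
  end
  lines = buffer.split("\n").map(&:chomp)   # chomp strips the pty's "\r"
  [lines.reject { |l| l.start_with?(prompt) }, consumed]
end

chunks = ["echo hi\r\n", "hi\r", "\nPraat > ", "never read"]
p collect_until_prompt(chunks)
```

Note that the prompt may arrive split across chunks, which is why the test is made against the accumulated buffer rather than against each chunk.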

Examining .gem files

The gem(1) utility has a lot of useful commands for managing gem repositories. It only provides two commands for manipulating .gem files, however: build and install.

When building your own gems, or examining a gem before install, it is useful to be able to take a look at the contents without extracting the gem to a temporary directory.

The structure of the gem is simple: a tar file containing a tarball of the gem contents, and a compressed file containing the metadata:

bash# tar -tf /tmp/test.gem
tar: Record size = 19 blocks

Both tar and gzip support input on STDIN and output on STDOUT, so it is easy enough to write shell functions to access the contents and metadata:

gem_contents () {
  tar -xOf "$1" data.tar.gz | tar -ztf -
  return $?
}

gem_metadata () {
  tar -xOf "$1" metadata.gz | gzip -d
  return $?
}

This provides quick access to the contents and metadata of a gem file:

gem_contents /tmp/test.gem
gem_metadata /tmp/test.gem
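
The same listing can be produced from Ruby with the tar reader that ships with RubyGems, which may be handy inside a Rakefile. This is only a sketch: it assumes the gem layout described above (an outer tar holding data.tar.gz), and gem_contents here is a Ruby method invented for the example, not the shell function of the same name.

```ruby
require 'rubygems/package'
require 'zlib'
require 'stringio'

# List the files packed inside data.tar.gz of a .gem archive.
# gem_io is any IO positioned at the start of the .gem file.
def gem_contents(gem_io)
  names = []
  Gem::Package::TarReader.new(gem_io) do |outer|
    outer.each do |entry|
      next unless entry.full_name == 'data.tar.gz'
      # Decompress the inner tarball in memory, then walk it
      data = Zlib::GzipReader.new(StringIO.new(entry.read)).read
      Gem::Package::TarReader.new(StringIO.new(data)) do |inner|
        inner.each { |file| names << file.full_name }
      end
    end
  end
  names
end
```

Typical use would be File.open('/tmp/test.gem', 'rb') { |io| puts gem_contents(io) }.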

Unit Testing in C

There are a number of unit test frameworks for C, but they all seem to lack something.

Sure, each can be used to perform automated testing, and some will even generate test stubs for you. But all of them are limited to API testing; that is, the only functions that can be unit-tested are the public API exported in the header files.

C differs from object-oriented languages in that most of the work -- the units to be tested, as it were -- is done in static functions that are not exported. This makes unit-testing, as it stands, all but useless from a C programmer's perspective; there is no compelling reason to use unit tests over the usual "test programs running on test data and returning 0 or 1 to the makefile" approach.

With some small work, however, it is possible to apply a unit test framework to a C project, and to perform actual unit tests (i.e. on static functions). It is a bit ugly, it is a bit invasive (naturally each .c file must contain its own unit tests), but it is scalable and maintainable.

Unit Testing with CHECK

check is a unit testing framework for C (manual). It is primarily designed to be used with the GNU autotools (autoconf, automake, etc), and the documentation is not clear on integrating check with a standard Makefile-based project. The unit tests developed here will serve as an example.

To begin, assume a project with the following directory structure:


The main program, the shared library, has its API in stuff.h, its code in stuff.c, and is built by the Makefile. The unit tests run by check will be in the tests directory.


#ifndef STUFF_H
#define STUFF_H

double do_stuff( double i );

#endif /* STUFF_H */



#include <math.h>
#include "stuff.h"

static double step1( double i ) { return i * i; }
static double step2( double i ) { return i + i; }
static double step3( double i ) { return pow( i, i ); }

double do_stuff( double i ) { return step3( step2( step1( i ) ) ); }


# Library Makefile
# -------------------------------------------------------------------
NAME        =    stuff
LIBNAME     =    lib$(NAME).so
ARCHIVE     =    lib$(NAME).a

DEBUG       =    -DDEBUG -ggdb
OPT         =    -O2
ERR         =    -Wall
INC_PATH    =    -I.
LIB_PATH    =   

CC          =     gcc
LD          =    ld
AR          =    ar rc
RANLIB      =    ranlib
RM          =    rm -f

LIBS        =    -lm
CC_FLAGS    =     $(INC_PATH) $(DEBUG) $(OPT) $(ERR) -fPIC
LD_FLAGS    =    $(LIB_PATH) $(LIBS) -shared -soname=$(LIBNAME)

SRC         =     stuff.c

OBJ         =     stuff.o

#---------------------------------------------------------- Targets
all: $(LIBNAME)
.PHONY: all clean check

$(ARCHIVE): $(OBJ)
    $(AR) $(ARCHIVE) $^
    $(RANLIB) $(ARCHIVE)

$(LIBNAME): $(ARCHIVE)
    $(LD) $(LD_FLAGS) --whole-archive $< --no-whole-archive -o $@

.c.o:
    $(CC) $(CC_FLAGS) -o $@ -c $<

clean:
    [ -f $(LIBNAME) ] && $(RM) $(LIBNAME)|| [ 1 ]
    [ -f $(ARCHIVE) ] && $(RM) $(ARCHIVE)|| [ 1 ]
    [ -f $(OBJ) ] && $(RM) $(OBJ) || [ 1 ]
    cd tests && make clean

check: $(LIBNAME)
    cd tests && make && make check

So far, this is all pretty straightforward stuff. The Makefile in the tests directory will contain all of the check-specific settings.


# Unit-test Makefile
#--------------------------------------------------------- Definitions
TGT_NAME    =    stuff
TGT_SRC     =    ../stuff.c

OPT         =    -O2 -fprofile-arcs -ftest-coverage
ERR         =    -Wall
INC_PATH    =    -I. -I..
LIB_PATH    =    -L..
LD_PATH     =     ..

CC          =     gcc
RM          =    rm -f

# NOTE: check libs must be enclosed by --whole-archive directives
CHECK_LIBS  =    -Wl,--whole-archive -lcheck -Wl,--no-whole-archive
LIBS        =    -lm $(CHECK_LIBS)  

# NOTE: UNIT_TEST enables the static-function test case in stuff.c
CC_FLAGS    =    $(INC_PATH) $(OPT) $(ERR) -DUNIT_TEST
LD_FLAGS    =    $(LIB_PATH)

# Test Definitions (to be added later)
TESTS       = 

#---------------------------------------------------------- Targets
all: $(TESTS)
.PHONY: all clean check

clean:
    $(RM) $(TESTS) *.gcno *.gcda

check:
        @for t in $(TESTS); do                          \
                LD_LIBRARY_PATH='$(LD_PATH)' ./$$t;     \
        done

There are a few things to take note of here.

First, the compiler options -fprofile-arcs and -ftest-coverage enable test coverage profiling (they produce the .gcno and .gcda files removed by the clean target). See the manual for details.

Next, libcheck.a (there is no .so) is added to the linker options, enclosed in --whole-archive directives so that the linker will not discard what it thinks are unused object files. This last point is important; when linking a static library into a dynamic library, --whole-archive must be used.

Finally, the preprocessor definition UNIT_TEST is added to the compiler options. This will be used to ensure that unit tests do not get built into a distribution version of

Simple Unit Test

The first unit test will be a simple one that ensures the library links with no errors. This is a common problem when developing a shared library; unresolved symbols will not be reported until an executable is linked to the library.


#include <stuff.h>

int main(void) { return (int) do_stuff( 32.0 ); }

Add a make target for this first test.


# Test 1 : Simple test to ensure that linking against the library succeeds
TEST1        =    test_link
TEST1_SRC    =    test_link.c
TEST1_FLAGS  =    $(CC_FLAGS) $(LD_FLAGS)
TEST1_LIBS   =    $(LIBS) -l$(TGT_NAME)

TESTS        =    $(TEST1)

# ...

$(TEST1): $(TEST1_SRC)
    $(CC) $(TEST1_FLAGS) -o $@ $^ $(TEST1_LIBS)

Note that the lines before # ... go in the definitions (top) part of the Makefile, while the lines after it go in the targets (bottom) part of it.

This first test can now be run:

bash# make check
gcc -I.  -DDEBUG -ggdb -O2  -Wall -fPIC -o stuff.o -c stuff.c
ar rc libstuff.a stuff.o
ranlib libstuff.a
ld  -lm  -shared --whole-archive libstuff.a --no-whole-archive -o
cd tests && make && make check
make[1]: Entering directory `stuff/tests'
gcc -I. -I.. -O2 -fprofile-arcs -ftest-coverage -Wall -DUNIT_TEST -L.. -Wl,--whole-archive -lstuff -lcheck -Wl,--no-whole-archive  -o test_link test_link.c
make[1]: Leaving directory `stuff/tests'
make[1]: Entering directory `stuff/tests'
make[1]: Leaving directory `stuff/tests'

Success! Note that the link test does not involve check; make will error out if the linking fails. Pedantic unit testers may want to take the route of making the test fail first (by invoking, say, do_nothing() in test_link.c) in order to convince themselves that it works.

Testing Static Functions

Testing a static function requires embedding the test code in the file that contains the function. The UNIT_TEST preprocessor directive will prevent the unit test from being compiled in non-unit-test binaries.

First, append the test code to the end of the library code.


/* ================================================================= */
#ifdef UNIT_TEST
#include <check.h>
#include <stdlib.h>    /* for rand() */

START_TEST (test_step1)
{
    double d = (double) rand();
    fail_unless( step1(d) == (d * d), "Step 1 does not square" );
}
END_TEST

START_TEST (test_step2)
{
    double d = (double) rand();
    fail_unless( step2(d) == (d + d), "Step 2 does not double" );
}
END_TEST

START_TEST (test_step3)
{
    double d = (double) rand();
    fail_unless( step3(d) == pow(d, d), "Step 3 does not exponentiate" );
}
END_TEST

/* The only symbol exported to the unit-test binaries */
TCase * create_static_testcase(void) {
    TCase * tc = tcase_create("Static Functions");
    tcase_add_test(tc, test_step1);
    tcase_add_test(tc, test_step2);
    tcase_add_test(tc, test_step3);
    return tc;
}

#endif /* UNIT_TEST */

The START_TEST and END_TEST macros are provided by check, as are the tcase_ routines. The strategy here is to define a single exported function, create_static_testcase, which the unit-test binaries will link to.

The next step is to add a unit test suite to the test directory.


#include <check.h>
#include <math.h>
#include <stdlib.h>

#include <stuff.h>

#define SUITE_NAME "Stuff"

/* ----------------------------------------------------------------- */
/* TESTS */

START_TEST (test_stuff_diff)
{
    double d = (double) rand();
    fail_unless ( do_stuff(d) != d, "do_stuff doesn't!" );
}
END_TEST

START_TEST (test_stuff_rand)
{
    double d = (double) rand();
    double dd = d * d;
    double sumdd = dd + dd;
    fail_unless ( do_stuff(d) == pow(sumdd, sumdd), "Incorrect result" );
}
END_TEST

/* ----------------------------------------------------------------- */
/* SUITE */

extern TCase * create_static_testcase(void);

Suite * create_suite(void) {
    Suite *s = suite_create( SUITE_NAME );

    /* Create test cases against library API */
    TCase *tc_core = tcase_create ("Core");
    tcase_add_test(tc_core, test_stuff_diff);
    tcase_add_test(tc_core, test_stuff_rand);
    suite_add_tcase(s, tc_core);

    /* Create test cases against static functions */
    suite_add_tcase( s, create_static_testcase() );

    return s;
}

int main( void ) {
    int num_fail;
    Suite *s = create_suite();
    SRunner *sr = srunner_create(s);
    srunner_run_all (sr, CK_NORMAL);
    num_fail = srunner_ntests_failed (sr);
    srunner_free (sr);
    return (num_fail == 0) ? EXIT_SUCCESS : EXIT_FAILURE;
}

This file creates a test suite that contains two tests of the library API (defined in test_stuff.c) and three tests of the static functions (defined in stuff.c). The entire suite will be run when the unit-test binary is executed.

Finally, the new test must be added to tests/Makefile, as discussed earlier.


# Test 2 : Actual unit tests against source code
# NOTE: this cannot link against the library as it incorporates the source code
TEST2        =    test_$(TGT_NAME)
TEST2_SRC    =     $(TEST2).c \
                   $(TGT_SRC)
TEST2_FLAGS  =    $(CC_FLAGS) $(LD_FLAGS)
TEST2_LIBS   =    $(LIBS) 

TESTS        =    $(TEST1) $(TEST2)

# ...

$(TEST2): $(TEST2_SRC)
    $(CC) $(TEST2_FLAGS) -o $@ $^ $(TEST2_LIBS)

Now all of the tests will be run when the make target is invoked:

gcc -I.  -DDEBUG -ggdb -O2  -Wall -fPIC -o stuff.o -c stuff.c
ar rc libstuff.a stuff.o
ranlib libstuff.a
ld  -lm  -shared --whole-archive libstuff.a --no-whole-archive -o
cd tests && make && make check
make[1]: Entering directory `stuff/tests'
gcc -I. -I.. -O2 -fprofile-arcs -ftest-coverage -Wall -DUNIT_TEST -L.. -Wl,--whole-archive -lstuff -lcheck -Wl,--no-whole-archive  -o test_link test_link.c
gcc -I. -I.. -O2 -fprofile-arcs -ftest-coverage -Wall -DUNIT_TEST -L.. -Wl,--whole-archive -lstuff -lcheck -Wl,--no-whole-archive  -o test_stuff test_stuff.c ../stuff.c
make[1]: Leaving directory `stuff/tests'
make[1]: Entering directory `stuff/tests'
Running suite(s): Stuff
100%: Checks: 5, Failures: 0, Errors: 0
make[1]: Leaving directory `stuff/tests'

Wednesday, March 17, 2010

Up and running with Ruport

The need to revamp an aging report generator at work led to a bit of research and thus to Ruport, an impressive Ruby framework for report generation.

Ruport has suffered the same fate as many a young open source project: fast and frenzied development ("release early and often" and all that) has resulted in enough API changes that many of the examples and much of the documentation are useless. The best info to date is in the Ruport Book, but even this book does not get one up-and-running with anything more than the most trivial projects (e.g. printing reports for a CSV table).

Let's define "up and running" as "generating reports from data in an existing database" -- the problem that most people are likely trying to solve, but which somehow doesn't appear as a ready example in most of the Ruport documentation. What follows is a quick demonstration of getting Ruport "up and running".

First, install the requisite gems. Note: PostgreSQL is being used as the database in this example instead of MySQL, both because it is a better RDBMS and because the world has enough MySQL examples already.

gem install ruport      
# DBI support                                                      
gem install ruport-util
gem install pg
gem install dbi
gem install dbd-pg
# ActiveRecord and AAR support
gem install activerecord
gem install acts_as_reportable
# Extras - not covered in this example
gem install gruff
gem install documatic

The database is assumed to have the following configuration:

* host : db
* port : 15434
* database: stuff
* user: dba
* password : secret
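
These settings end up in a DBI data source name further down. As a small sketch of how the pieces fit together, the hash and helper below assemble that string; DB and pg_dsn are names made up for this example and are not part of DBI or Ruport.

```ruby
# Connection settings from the list above, gathered in one place.
DB = { :host => 'db', :port => 15434, :database => 'stuff',
       :user => 'dba', :password => 'secret' }

# Hypothetical helper: build a DBI data source name for the Pg driver.
def pg_dsn(cfg)
  "dbi:Pg:database=#{cfg[:database]};host=#{cfg[:host]};port=#{cfg[:port]}"
end

puts pg_dsn(DB)
```

Keeping the settings in a single hash means the DBI and ActiveRecord examples can share them instead of repeating the host and port.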

The 'stuff' database has the table 'complaint' which the reports will pull data from. Note that the name is not 'complaints', as ActiveRecord would expect (and create) -- the point here is to connect to an existing database, most likely managed by someone else (who, even more likely, is neither a Rails developer nor a fan of the 'database is subservient to the code' approach).

Generating a Report Using the Ruby DBI

Connecting to a database table via the Ruby DBI requires the Ruport::Query class, which has been moved to the ruport-util gem. Ruport::Query maintains a list of named database sources, with the default source being named, surprisingly, :default.

Generating a report therefore becomes a matter of adding a database connection via Ruport::Query#add_source, then instantiating a Ruport::Query and printing its result:

require 'rubygems'
require 'ruport'
require 'ruport/util'

Ruport::Query.add_source( :default, 
                          :user => 'dba', 
                          :password => 'secret',
                          :dsn => 'dbi:Pg:database=stuff;host=db;port=15434' )

puts Ruport::Query.new( [['select * from complaint']] ).result

All in all, nice and straightforward, except for that argument to Ruport::Query#new. According to the docs, this should be as simple as Ruport::Query.new( 'select * from complaint' )

...but this fails, and a look at the source shows that Ruport::Query invokes the each() method of the sql argument (which, being a String, has no each() method). To make matters worse, if sql is an array, then the each() method of sql.first() is invoked, so the nested array is needed. This is very likely a bug, and may be fixed in future versions, rendering this example and caveat obsolete.
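
The Ruby side of this caveat can be seen without Ruport at all. The snippet below does not reproduce Ruport's internals; it only illustrates, in plain Ruby 1.9 or later, why a bare String trips over each() while the doubly nested array survives a call to first().

```ruby
sql = 'select * from complaint'

# A String is not enumerable in Ruby 1.9 and later, so each() is missing.
puts sql.respond_to?(:each)

# The doubly nested array still yields the statement after first().
[[sql]].first.each { |statement| puts statement }
```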

Generating a Report Using ActiveRecord

The ActiveRecord support in Ruport is a bit more complicated. In particular, the acts_as_reportable (AAR) mixin must be added to every ActiveRecord table class.

Note that with ActiveRecord, a table must be wrapped by a class derived from ActiveRecord::Base. For existing databases, this class can be empty -- although it may be necessary to override the table_name method if the database schema does not conform to ActiveRecord's expectations.

require 'rubygems'
require 'ruport'
require 'ruport/acts_as_reportable'

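ActiveRecord::Base.establish_connection( :adapter => 'postgresql',
                                         :host => 'db',
                                         :port => 15434,
                                         :database => 'stuff',
                                         :username => 'dba',
                                         :password => 'secret' )

class Complaint < ActiveRecord::Base
   acts_as_reportable

   # The table is 'complaint', not the 'complaints' ActiveRecord expects
   def self.table_name()
       return 'complaint'
   end
end

puts Complaint.report_table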

The report_table method is what AAR exposes to generate a Ruport report for the table.

These quick examples should suffice to get Ruport connected to an existing database without spelunking through the docs and gem source. Once connected, it's simply a matter of playing with Ruport and following the (rest of the) docs.