The Sorry Scheme of Things Entire

Monday, December 21, 2009

The importance of documentation

Went back to Python after three or so years of Ruby, and found immediately that loading modules from . is now broken.

Checked the docs, which are in a horrible state (it's in the Language Ref, no wait it's in the Library Ref even though it's a language feature, no wait it's in a PEP, no wait that is out of date and the real documentation has been posted to a mailing list), and found stuff like this:

From Python Docs: The Module Search Path:

"When a module named spam is imported, the interpreter searches for a file named spam.py in the current directory, and then in the list of directories specified by the environment variable PYTHONPATH. This has the same syntax as the shell variable PATH, that is, a list of directory names. When PYTHONPATH is not set, or when the file is not found there, the search continues in an installation-dependent default path; on Unix, this is usually .:/usr/local/lib/python.

"Actually, modules are searched in the list of directories given by the variable sys.path which is initialized from the directory containing the input script (or the current directory), PYTHONPATH and the installation-dependent default."

Well, that's the theory at any rate. Let's test it with actual code:

# mkdir /tmp/py-test
# cd /tmp/py-test
# mkdir -p snark/snob
# touch snark/__init__.py snark/snob/__init__.py snark/snob/stuff.py
# echo -e '#!/usr/bin/env python2.5\nimport os\nprint(os.getcwd())\nimport snark.snob.stuff\n'> a.py
# chmod +x a.py
# mkdir bin
# cp a.py bin/b.py

What would you expect to happen when this is run? Surely having the module in the current directory means that running either a.py or b.py from . will work, right?

# ./a.py
/tmp/py-test
# ./bin/b.py
/tmp/py-test
Traceback (most recent call last):
File "./bin/b.py", line 4, in <module>
import snark.snob.stuff
ImportError: No module named snark.snob.stuff

Nope! And look at that -- according to Python's own os.getcwd(), the current working directory (the one mentioned in the docs) is the same for both scripts!

Explicitly setting PYTHONPATH fixes this:

# PYTHONPATH=. bin/b.py
/tmp/py-test
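The transcripts above can be reproduced without touching /tmp by hand. What follows is a rough sketch, assuming a Python 3 interpreter (the post itself used python2.5) and using made-up helper names (build_layout, run_script, demo). Each script prints sys.path[0] instead of os.getcwd(), since that first entry is the directory the interpreter actually searches: the one containing the script.

```python
import os
import subprocess
import sys
import tempfile

# Same idea as a.py/b.py in the post, but printing sys.path[0] rather than the cwd.
SCRIPT = "import sys\nprint(sys.path[0])\nimport snark.snob.stuff\n"


def build_layout(root):
    """Create snark/snob/stuff.py plus a.py and bin/b.py under root."""
    pkg = os.path.join(root, "snark", "snob")
    os.makedirs(pkg)
    os.makedirs(os.path.join(root, "bin"))
    for path in (
        os.path.join(root, "snark", "__init__.py"),
        os.path.join(pkg, "__init__.py"),
        os.path.join(pkg, "stuff.py"),
    ):
        open(path, "w").close()
    for script in ("a.py", os.path.join("bin", "b.py")):
        with open(os.path.join(root, script), "w") as handle:
            handle.write(SCRIPT)


def run_script(root, script, pythonpath=None):
    """Run script with root as the working directory.

    Returns (exit code, first line of stdout).
    """
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # start from a clean search path
    if pythonpath is not None:
        env["PYTHONPATH"] = pythonpath
    proc = subprocess.run(
        [sys.executable, script],
        cwd=root,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    lines = proc.stdout.splitlines()
    return proc.returncode, (lines[0] if lines else "")


def demo():
    """Return results for ./a.py, ./bin/b.py and PYTHONPATH=. bin/b.py."""
    with tempfile.TemporaryDirectory() as root:
        build_layout(root)
        b_script = os.path.join("bin", "b.py")
        return (
            run_script(root, "a.py"),
            run_script(root, b_script),
            run_script(root, b_script, pythonpath="."),
        )


if __name__ == "__main__":
    for result in demo():
        print(result)
```

The working directory is identical for all three runs; only the first sys.path entry differs, and for bin/b.py it ends in bin, where there is no snark package to find.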

..but really, shouldn't the docs be a bit less misleading?

And, for that matter, why the hell is the location of the script used instead of the working directory anyways? Either be sane and use the current working directory, or be slightly less sane and check both.
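For what it's worth, the "check both" behaviour can be bolted on from inside a script. This is only a sketch, and add_cwd_to_path is a made-up helper name, not anything Python provides. It slots the working directory in right after the script's own directory, so the script's directory still wins when both contain a module of the same name.

```python
import os
import sys


def add_cwd_to_path(path_list=None, cwd=None):
    """Insert the working directory as the second search-path entry.

    Both arguments are optional and exist mainly to make the helper
    easy to try out: by default it edits sys.path using os.getcwd().
    """
    path_list = sys.path if path_list is None else path_list
    cwd = os.getcwd() if cwd is None else cwd
    if cwd not in path_list:
        # Index 1 keeps the script's directory (index 0) in front.
        path_list.insert(1, cwd)
    return path_list


if __name__ == "__main__":
    add_cwd_to_path()
    print(sys.path[:2])
```

Calling add_cwd_to_path() at the top of b.py, before the snark import, should have much the same effect as PYTHONPATH=. in the transcript above, at the cost of one more line of boilerplate per script.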

The whole reason for using Python on this project in the first place was to revert to a language and interpreter that is more stable and more professional (in terms of management style) than Ruby, but this kind of crap certainly gives one pause. If thirty minutes with the language turns up something as obviously broken as the module loader, one shudders to think what is going to come up over the course of the project.

Sad days for the Python boys. The interest of Fowler and his crew must have really given Ruby a leg up in terms of quality.

Sunday, December 6, 2009

'standalone' rattle

Rattle is a decent data-mining utility built on top of GNU R, with one small problem: it is impossible to run outside of the R terminal (e.g. from the shell or a desktop icon/menu).

What this means, on Linux, is that to run Rattle one must start R in a terminal and enter the following commands:

> library(rattle)
Loading required package: pmml
Loading required package: XML
Rattle: Graphical interface for data mining using R.
Version 2.5.3 Copyright (C) 2006-2009 Togaware Pty Ltd.
Type 'rattle()' to shake, rattle, and roll your data.
> rattle()

A Ctrl-D to log out of R is also required.

The big problem (ignoring R's inconsistent handling of stdin/out, which can mostly be solved by using littler) is that the R Gtk module starts a separate GUI thread, and quitting R does not do a wait on child threads -- it just kills them.

A workaround to launch rattle from outside of R is to provide a simple wrapper script:

#!/usr/bin/env ruby

require 'open3'

r_prog = `which R`.chomp

Open3.popen3( "#{r_prog} --interactive --no-save --no-restore --slave" ) do | stdin, stdout, stderr |
  stdin.puts "library(rattle)"
  stdin.puts "rattle()"
  puts stdout.read
end

This runs R and launches rattle(), but leaves an R process open in the background once rattle() has exited.

Adding a q() call to the script merely terminates the Rattle GUI child thread. In addition, rattle provides no indication that it is running:

> ls()
character(0)
> library(rattle)
Loading required package: pmml
Loading required package: XML
Rattle: Graphical interface for data mining using R.
Version 2.5.3 Copyright (C) 2006-2009 Togaware Pty Ltd.
Type 'rattle()' to shake, rattle, and roll your data.
> ls()
[1] "Global_rattleGUI" "crs"
[3] "crv" "on_aboutdialog_response"
[5] "rattleGUI" "suppressRattleWelcome"
[7] "viewdataGUI"
> rattle()
> ls()
[1] "Global_rattleGUI" "crs"
[3] "crv" "on_aboutdialog_response"
[5] "rattleGUI" "suppressRattleWelcome"
[7] "viewdataGUI"
# Use Ctrl-Q to exit the Rattle GUI
> ls()
[1] "Global_rattleGUI" "crs"
[3] "crv" "on_aboutdialog_response"
[5] "rattleGUI" "suppressRattleWelcome"
[7] "viewdataGUI"

Once the Rattle library has loaded, there is no way to tell from within R if the Rattle GUI is running or not. Modifying Rattle to create a variable when the GUI loads successfully, and to remove it when the GUI quits, would undoubtedly fix this.

Wednesday, November 25, 2009

Too clever by half

It's always frustrating to come across a build error in allegedly-portable code, then to discover that the root of it is someone complicating things in a misguided attempt to improve portability -- usually by introducing complexity instead of removing it.

The macports project in particular is rife with these problems, undoubtedly because most of the "cross-platform" open source projects it ports never considered OS X as a build target.

KDE4 has refused to build on a particular OS X box (Quad G5, 10.5.8, XCode 3.1.4) via macports for quite some time. Currently, the build fails in kdebase4-runtime due to an error in the fishProtocol ctor in fish.cpp:

/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_kde_kdebase4-runtime/work/kdebase-runtime-4.3.3/kioslave/fish/fish.cpp: In constructor 'fishProtocol::fishProtocol(const QByteArray&, const QByteArray&)':
/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_kde_kdebase4-runtime/work/kdebase-runtime-4.3.3/kioslave/fish/fish.cpp:275: error: cannot convert 'const char* (*)()' to 'const char*' for argument '1' to 'size_t strlen(const char*)'
/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_kde_kdebase4-runtime/work/kdebase-runtime-4.3.3/kioslave/fish/fish.cpp: In member function 'void fishProtocol::manageConnection(const QString&)':
/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_kde_kdebase4-runtime/work/kdebase-runtime-4.3.3/kioslave/fish/fish.cpp:1079: error: no matching function for call to 'fishProtocol::writeChild(const char* (&)(), int&)'
/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_kde_kdebase4-runtime/work/kdebase-runtime-4.3.3/kioslave/fish/fish.cpp:561: note: candidates are: void fishProtocol::writeChild(const char*, KIO::fileoffset_t)
make[2]: *** [kioslave/fish/CMakeFiles/kio_fish.dir/fish.o] Error 1
make[1]: *** [kioslave/fish/CMakeFiles/kio_fish.dir/all] Error 2
make: *** [all] Error 2

Little did I know at the time that the actual cause for the error was further up:

[ 57%] Generating fishcode.h
cd /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_kde_kdebase4-runtime/work/kdebase-runtime-4.3.3/kioslave/fish && /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_kde_kdebase4-runtime/work/kdebase-runtime-4.3.3/kioslave/fish/generate_fishcode.sh /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_kde_kdebase4-runtime/work/kdebase-runtime-4.3.3/kioslave/fish/fish.pl /opt/local/bin/md5 /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_kde_kdebase4-runtime/work/build/kioslave/fish/fishcode.h -f\ 4
sed: 1: "s/\\/\\\\/g;s/"/\\"/g;s ...": unescaped newline inside substitute pattern

The line in question (fish.cpp:275) does a strlen on fishCode, which is not defined anywhere, though there is an #include for fishcode.h.

I searched around, found fishcode.h, and it was close to empty:

#define CHECKSUM "'fb18e850532f42b49f1034b4e17a4cdc /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_kde_kdebase4-runtime/work/kdebase-runtime-4.3.3/kioslave/fish/fish.pl'"
static const char
*fishCode('
');

Something had gone horribly wrong, somewhere -- fishCode is empty, though that shouldn't cause the error being displayed. Still, builds get wacky, and it's a lead.

Googling around for a copy of fishcode.h turns up nothing, which is not a surprise as it is obviously autogenerated by the build process. What does turn up, however, are the commands used to generate it:

SUM=`/usr/bin/md5sum
/usr/src/packages/BUILD/kdebase-683581/kioslave/fish/fish.pl | cut -d ' ' -f 1`;
echo '#define CHECKSUM "'$SUM'"' > fishcode.h; echo 'static const char
*fishCode(' >> fishcode.h; sed -e 's/\\/\\\\/g;s/"/\\"/g;s/^[ ]*/"/;/^"# /d;s/[
]*$/\\n"/;/^"\\n"$/d;s/{CHECKSUM}/'$SUM'/;'
/usr/src/packages/BUILD/kdebase-683581/kioslave/fish/fish.pl >> fishcode.h; echo
');' >> fishcode.h;

Time to set SUM in the shell and start playing with the sed command, which is obviously what's failing:

bash-3.2$ SUM='fb18e850532f42b49f1034b4e17a4cdc /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_kde_kdebase4-runtime/work/kdebase-runtime-4.3.3/kioslave/fish/fish.pl'

bash-3.2$ sed -e 's/\\/\\\\/g;s/"/\\"/g;s/^[ ]*/"/;/^"# /d;s/[
> ]*$/\\n"/;/^"\\n"$/d;s/{CHECKSUM}/'$SUM'/;'

sed: 1: "s/\\/\\\\/g;s/"/\\"/g;s ...": unbalanced brackets ([])

Something is going wrong; set SUM to a saner value:

bash-3.2$ SUM='fb18e850532f42b49f1034b4e17a4cdc'

bash-3.2$ sed -e 's/\\/\\\\/g;s/"/\\"/g;s/^[ ]*/"/;/^"# /d;s/[
> ]*$/\\n"/;/^"\\n"$/d;s/{CHECKSUM}/'$SUM'/;'

sed: 1: "s/\\/\\\\/g;s/"/\\"/g;s ...": unbalanced brackets ([])

bash-3.2$ SUM='/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_kde_kdebase4-runtime/work/kdebase-runtime-4.3.3/kioslave/fish/fish.pl'

bash-3.2$ sed -e 's/\\/\\\\/g;s/"/\\"/g;s/^[ ]*/"/;/^"# /d;s/[
> ]*$/\\n"/;/^"\\n"$/d;s/{CHECKSUM}/'$SUM'/;'

sed: 1: "s/\\/\\\\/g;s/"/\\"/g;s ...": unbalanced brackets ([])

Try replacing the embedded newline with a good old '\n':

bash-3.2$ SUM='/opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_kde_kdebase4-runtime/work/kdebase-runtime-4.3.3/kioslave/fish/fish.pl'

bash-3.2$ sed -e 's/\\/\\\\/g;s/"/\\"/g;s/^[ ]*/"/;/^"# /d;s/[\n]*$/\\n"/;/^"\\n"$/d;s/{CHECKSUM}/'$SUM'/;'

sed: 1: "s/\\/\\\\/g;s/"/\\"/g;s ...": bad flag in substitute command: 'o'

Sed doesn't like the embedded newline; not a surprise. Now it seems like SUM is being interpreted as part of the expression. Let's see what's really going on:

bash-3.2$ SUM='fb18e850532f42b49f1034b4e17a4cdc /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_kde_kdebase4-runtime/work/kdebase-runtime-4.3.3/kioslave/fish/fish.pl'

bash-3.2$ echo sed -e 's/\\/\\\\/g;s/"/\\"/g;s/^[ ]*/"/;/^"# /d;s/[
> ]*$/\\n"/;/^"\\n"$/d;s/{CHECKSUM}/'$SUM'/;'
sed -e s/\\/\\\\/g;s/"/\\"/g;s/^[ ]*/"/;/^"# /d;s/[
]*$/\\n"/;/^"\\n"$/d;s/{CHECKSUM}/fb18e850532f42b49f1034b4e17a4cdc /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_ports_kde_kdebase4-runtime/work/kdebase-runtime-4.3.3/kioslave/fish/fish.pl/;

Argh! Someone's single quotes aren't properly escaped! Right there at '$SUM'!
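To see why that matters, here is a small sketch that uses Python's shlex module as a stand-in for the shell's word splitting. It is an approximation only: shlex performs no variable expansion, so the hypothetical helper words_after_expansion pastes the value in textually, which matches the shell closely enough for values without quotes or glob characters.

```python
import shlex


def words_after_expansion(sum_value, quoted):
    """Split the tail of the sed command the way a POSIX shell roughly would.

    quoted=False mimics '...'$SUM'...' (expansion left outside any quotes);
    quoted=True mimics the same command with the value properly quoted.
    """
    if quoted:
        middle = shlex.quote(sum_value)
    else:
        middle = sum_value
    command = "sed -e 's/{CHECKSUM}/'" + middle + "'/;'"
    return shlex.split(command)


if __name__ == "__main__":
    value = "fb18e850532f42b49f1034b4e17a4cdc /opt/fish.pl"  # shortened path
    print(words_after_expansion(value, quoted=False))
    print(words_after_expansion(value, quoted=True))
```

Unquoted, the space in SUM chops the sed expression in two, so sed receives a truncated script plus a stray argument. Quoting keeps it in one piece, although the slashes in the path would still collide with the s/// delimiters, which is presumably where the "bad flag in substitute command: 'o'" message came from. The checksum should never have had a path attached in the first place.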

Simple enough to fix, though. Fortunately, using sed to generate fishcode.h manually allows the kdebase4-runtime build to continue, which saves the time of tracking down where in the build process that sed command is.

It does raise the question, though: was any of this autogeneration really necessary? And if so, wouldn't a perl script (or even a one-liner) be more reliable?
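As a rough illustration of how little is needed, here is the same transformation sketched in Python rather than perl. The function name generate_fishcode is invented, and the sketch assumes the bracket expressions in the sed script were meant to match spaces and tabs (the pasted command makes that hard to tell). It follows the sed steps in order: escape backslashes and quotes, swap leading whitespace for an opening quote, drop comment lines, close each line with \n", drop empty lines, then fill in {CHECKSUM}.

```python
import hashlib


def generate_fishcode(perl_source):
    """Return the text of a fishcode.h-style header for the given script text."""
    checksum = hashlib.md5(perl_source.encode("utf-8")).hexdigest()
    out = [
        '#define CHECKSUM "%s"' % checksum,
        "static const char",
        "*fishCode(",
    ]
    for line in perl_source.splitlines():
        # s/\\/\\\\/g and s/"/\\"/g
        line = line.replace("\\", "\\\\").replace('"', '\\"')
        # s/^[ \t]*/"/
        line = '"' + line.lstrip(" \t")
        # /^"# /d
        if line.startswith('"# '):
            continue
        # s/[ \t]*$/\\n"/
        line = line.rstrip(" \t") + '\\n"'
        # /^"\\n"$/d
        if line == '"\\n"':
            continue
        # s/{CHECKSUM}/$SUM/ (first occurrence only, as in sed without the g flag)
        out.append(line.replace("{CHECKSUM}", checksum, 1))
    out.append(");")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(generate_fishcode('# comment\nprint "{CHECKSUM}";\n'))
```

No quoting layers, no dependence on which md5 binary the platform ships, and the checksum is computed in the same place it is used.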

One can argue that perl does not ship on as many OSes as sed (debatable), but the fact that this build worked on a maintainer's machine and not on an end-user's machine (running an OS highly regulated by the manufacturer, no less) pretty much blows the portability argument out of the water: if that's what the maintainer is truly after, well, their methods aren't working either.

OK, end angry-at-overly-convoluted-open-source-code-again rant. The moral: verify your goddamn shell commands!

Saturday, September 12, 2009

Ubuntu, bluetooth, and the E70

A few quick notes on connecting a phone (in this case the venerable Nokia E70, the last with the excellent gull-wing keyboard design) to Ubuntu via Bluetooth.

Assuming the kernel modules and all necessary utilities (bluetooth, bluez, obexftp, obexfs, obexpushd, obex-data-server, irda-utils, cobex, gammu, kbluetooth) are installed, the first step is to create an rfcomm device:

mknod /dev/rfcomm0 c 216 0

Next, bind the device to the phone's bluetooth MAC address:

sudo rfcomm bind 0 00:12:D1:AD:FD:5E

The MAC address can be obtained using

hcitool scan

Note that the link and the bind have to be performed at boot; the file /etc/bluetooth/rfcomm.conf can be modified to perform this (it contains a sufficiently clear example).

Once these steps are performed, utilities like cobex and kbluetooth should just work.

With the E70, though, they don't, so it's necessary to use Obex to get data to and from the device.

The standard, longhand way to do this is with obexftp:

obexftp -b $MAC_ADDR -B 11 -l

The -b option is the bluetooth MAC address; the -B option is the channel (usually 10 or 11), and the -l option is the command to execute (ls). The man page lists commands.

Better than obexftp is obexfs, which mounts the phone as a filesystem:

mkdir ~/e70
obexfs -b $MAC_ADDR -B 11 ~/e70

This mounts the phone at the mount point ~/e70, where its phone and MMC memory can be accessed directly.

Tuesday, September 8, 2009

Doxygen gotchas

Finally took an afternoon to learn doxygen and have been commenting up the latest project for the past 24 or so hours.

Doxygen is easy to get up to speed with, but there are a couple of gotchas that aren't made clear in the documentation. These are listed below in no particular order.

Standalone pages

This is the first thing one notices when mucking about with doxygen: how can the index.html (aka "Main Page") be modified?

The answer is just as quick to find: create a document with a \mainpage tag. But what document? Surely adding a header file just to create doxygen tags is a bit silly?

Indeed. Instead, create a directory (e.g. doc) and use it for standalone doxygen files. A standalone file is basically a text file consisting of only the C++ comment, e.g.:

/*!
 \mainpage The Main Page
 \section sec_1 Section 1
 ...section 1 stuff...
 \section sec_2 Section 2
 ... section 2 stuff...

<HR>
<b>\ref todo</b>
*/

Name this something like 'main.dox', and add the .dox extension to the FILE_PATTERNS definition in the project Doxyfile:

FILE_PATTERNS += *.dox

These standalone doxygen files are incredibly useful for stuff like installation instructions, HowTos, and FAQs.

They are also useful for defining groups. A file groups.dox can contain group definitions like the following:

/*!
\defgroup outer_group "Outer Group"
This module contains outer-group stuff.

\defgroup inner_group "Inner Group"
\ingroup outer_group
These things are in a group in a group.
*/

This provides a single, easy-to-maintain place for group definitions, so that header files and such just have to use "\ingroup". It seems like pretty good practice to put most files and classes in groups -- it makes for more expedient browsing.

Global namespace

Ok, there's a nice 'Namespaces' folder in the tree pane, and what does it contain? The project namespace. What about all those singletons ill-advisably implemented as globals (purely for narrative effect)?

Turns out there is no way to list the global namespace. This seems like a huge oversight -- if there's one thing you want to know about a project, it's how many lame globals the developers used.

Adding a 'globals' page seems like a good way to circumvent it, except for one slight problem -- if you create a standalone doc with "\page globals Global Namespace" in it, doxygen creates a page in the tree called "Global Namespace" ... with its own internal(?) version of the global namespace in it. This means, basically, that globals defined in your project are not there -- it only contains stuff like (for example) qRegisterMetaType invocations. It looks like 'globals' is an undocumented, and not particularly working, doxygen 'special command'.

A workaround is to use xrefitems. Add a line like the following to the Doxyfile (remember, 'globals' is not an allowed name):

ALIASES                +=  "globalfn=\xrefitem global \"Functions\" \"Globals\""

Now, use code like the following to document your global function:

/*! \fn void GlobalSingletonFactoryMess( bool suck_less )
       \globalfn void GlobalSingletonFactoryMess( bool suck_less )
       \param suck_less : Make singletons suck less
*/
void GlobalSingletonFactoryMess( bool );

A page called 'Globals' will appear under 'Related Pages', and will contain the function prototype, with a link to its documentation in the appropriate .h file. Of course, it's called a 'Member', but one can't have everything.

Xrefitems

Speaking of xrefitems, see that second argument in the ALIASES line above? The one that's set to "Functions", and is supposedly the header for the section that contains the xref items?

Yeh, that argument does nothing. Doxygen ignores it. Go on, try it. Set it to "Doxygen, please format my hard drive". Or have it make fun of your boss or users. It makes no difference. That text will never appear.

Multiple namespaces in header files

This one is just plain wacky. Or rather, it's just plain lazy. Of the doxygen parser.

Let's say you have a header file where you declare an interface and an abstract base class implementing that interface (never mind why, it's an example):


namespace Mine {
class MyInterface {
public:
virtual ~MyInterface() {}
virtual void doStuff() = 0;
};
} /* close namespace */

/* interface must be registered in global namespace */
Q_DECLARE_INTERFACE( Mine::MyInterface, "com.me.Mine.MyInterface/1.0");

namespace Mine {
class MyClass : public QObject, public MyInterface {
Q_OBJECT
Q_INTERFACES(Mine::MyInterface)
public:
MyClass( QObject *parent=0 );
virtual void doStuff();
};
}

Guess what happens? MyClass never appears in the documentation. The second namespace block leaves the doxygen parser as befuddled as ... your favorite befuddlement simile.

The solution of course is to put the interface class in its own header file, which is no big deal ... but really, you shouldn't *have* to.

Random \example tag links

OK, there's an example in docs/examples/, that dir is safely added to the Doxyfile EXAMPLE_PATH, and the file shows up where it's supposed to in the doc tree under Examples.

Looking at the class for which it is an example, though, you see nothing -- or maybe it gets linked to a few rather arbitrary methods, or to methods outside of the class altogether.

What's going on?

Well, it turns out that doxygen decides to out-clever you, and only puts a link to the example file in elements that appear in the example. Thus, if your class never directly appears as a type in the file (e.g. 'PluginManager::plugin("MyPlugin").status()' appears instead of 'Plugin p(PluginManager::plugin("MyPlugin")); p.status()'), then the documentation for your class will not be linked to the example, no matter how close to the \class tag you put the \example tag. Doxy knows best, eh? Surely you have no idea where you want your examples linked from.

The fix is to rewrite the example so that the class/function/whatever appears clearly and distinctly.

Tuesday, August 11, 2009

Using qmake without a build target

One of the greatest weaknesses of Qt's qmake utility has been its over-reliance on templates. The available templates are standard stuff for Qt projects: app, lib, and subdirs (new with Qt4, I believe). These are all well and good when one is just slapping together a bunch of Qt code, but what about when interfacing with other build systems? A command template would be quite useful.

The following project file is a workaround. The lib template is used (although app could be used as well) with empty HEADERS and SOURCES variables. The qmake variables that specify which build tools to use are set to a NOP (no-operation) tool such as echo or true. An "extra" target is created for the command to be executed, and set to be a pre-dependency of the main target.

Background: There is a config.py in . that has sip generate all C++ code, as well as the Makefile, in the directory 'sip'. The python module (sip_test.so) is output to the same directory. This project file is invoked, along with many others, from a subdirs project file in the parent directory.



-----------------------------------------------------
# sip_test.pro
TEMPLATE    = lib
TARGET      = sip_test

# Check that sip exists
unix {
    !system(sip -V > /dev/null) {
        error( "SIP does not exist!" )
    }
}

# sip build command and makefile targets
sip.commands          =    python ./config.py && cd sip && make
QMAKE_EXTRA_TARGETS   +=   sip
PRE_TARGETDEPS        =    sip

# compiler toolchain override
unix {
    # 'true' should exist on Unix and OS X. Win32 folk can
    # use TYPE.EXE or something.
    NOP_TOOL    = true
}

QMAKE_CC        =    $$NOP_TOOL
QMAKE_CXX       =    $$NOP_TOOL
QMAKE_LINK      =    $$NOP_TOOL
QMAKE_LN_SHLIB  =    $$NOP_TOOL

# Copy the .so created by sip.commands to the current directory
QMAKE_POST_LINK        =    cp sip/$${TARGET}.so .
-----------------------------------------------------

With this project file, the standard "qmake && make" command works as expected.

Saturday, May 30, 2009

Toshiba R500 Backlight Button

Keep forgetting to upload this script for toggling the R500 backlight using toshset:

#!/bin/sh

TOSHSET=/usr/local/bin/toshset

STATE=`sudo $TOSHSET -trmode 2>/dev/null | grep -a transreflective | \
cut -f 3 -d ' '`

if [ "$STATE" = "on" ]
then
NEW_STATE="off"
else
NEW_STATE="on"
fi

sudo $TOSHSET -trmode $NEW_STATE

Xev shows the keycode for the backlight (and the 'i' internet/information/idiot key, so they'll both do the same thing) to be 180, which is already taken by XF86HomePage in Ubuntu. If for some reason this key is undefined (i.e. nothing shows up on `xmodmap -pke | grep '^keycode 180'`), add the following line to ~/.Xmodmap (which might need to be symlinked to ~/.xmodmaprc on non-Ubuntu Linuxes):

keycode 180 = XF86HomePage

In E17, edit the keyboard bindings using Settings-> Settings Panel -> Input -> Key Bindings. For XF86HomePage, select the Launch : Defined Command option, and point it to the above shell script (e.g. in ~/bin).

Note that the script uses sudo to launch toshset, so you will either need to change this to invoke a GUI sudo program like gksu, or add a line like the following to /etc/sudoers:

$USER ALL=(root) NOPASSWD: /usr/local/bin/toshset

where $USER is your username. Note that this allows any toshset command to be run as root without a password from this user's session, which could be exploited if someone were sufficiently motivated.