PhysioNet has a number of databases that consist of .dat signal-capture files in their "format212" format.
The signal(5) manpage provides the following less-than-lucid explanation:
Each sample is represented by a 12-bit two's complement amplitude. The first sample is obtained from the 12 least significant bits of the first byte pair (stored least significant byte first). The second sample is formed from the 4 remaining bits of the first byte pair (which are the 4 high bits of the 12-bit sample) and the next byte (which contains the remaining 8 bits of the second sample). The process is repeated for each successive pair of samples.
Why write a line of code when 100 words of plaintext will suffice?
Basically, 3 bytes (A, B, C) encode 2 data points (x, y) as follows:
x = ((B & 0x0F) << 8) | A
y = ((B & 0xF0) << 4) | C
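The formulas above yield raw 12-bit values in the range 0..4095. Since the manpage describes each sample as a two's complement amplitude, a raw value with bit 11 set presumably stands for a negative number. A minimal sketch of both steps on hand-built bytes follows; the helper names sign_extend12 and decode_triple are made up for illustration and come from no library:

```ruby
# Hypothetical helpers for illustration only.

# Interpret a raw 12-bit value (0..4095) as two's complement (-2048..2047).
def sign_extend12(v)
  v >= 0x800 ? v - 0x1000 : v
end

# Decode one 3-byte group (a, b, c) into two signed samples.
def decode_triple(a, b, c)
  x = ((b & 0x0F) << 8) | a   # low nibble of b supplies the high 4 bits of x
  y = ((b & 0xF0) << 4) | c   # high nibble of b supplies the high 4 bits of y
  [sign_extend12(x), sign_extend12(y)]
end

p decode_triple(0x34, 0x12, 0x56)   # => [564, 342]  (0x234, 0x156)
p decode_triple(0xFF, 0xFF, 0xFF)   # => [-1, -1]    (0xFFF is -1 in 12 bits)
```

If only the raw unsigned values are wanted, drop the sign_extend12 calls.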
A throwaway Ruby script to convert a "format212" file to an array of data-pairs:
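#!/usr/bin/env ruby

# Convert a raw 12-bit value (0..4095) to a signed two's complement amplitude.
def signed12(v)
  v >= 0x800 ? v - 0x1000 : v
end

# Decode a format212 buffer into an array of [a, b] sample pairs.
def fmt212_to_a(buf)
  # Drop an incomplete trailing group rather than crash on a nil byte.
  buf.bytes.each_slice(3).select { |d| d.size == 3 }.collect do |data|
    a = ((data[1] & 0x0F) << 8) | data[0]
    b = ((data[1] & 0xF0) << 4) | data[2]
    [signed12(a), signed12(b)]
  end
end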

if __FILE__ == $0
  ARGV.each do |path|
    File.open(path, 'rb') do |f|
      fmt212_to_a(f.read).each_with_index do |(a, b), x|
        # generate gnuplot-friendly output: index, lead 1, lead 2
        puts [x, a, b].join("\t")
      end
    end
  end
end
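Since these files can be huge, slurping the whole thing with f.read may not always be desirable. Here is a rough sketch of a streaming variant that reads a fixed number of 3-byte groups at a time. The method name each_fmt212_pair and the chunk_triples parameter are invented for this example, and it sign-extends each sample on the assumption that the manpage's "two's complement" is meant literally:

```ruby
require 'stringio'

# Hypothetical streaming decoder: yields one [a, b] pair per 3-byte group
# without loading the whole file into memory.
def each_fmt212_pair(io, chunk_triples = 4096)
  return enum_for(:each_fmt212_pair, io, chunk_triples) unless block_given?

  # Reading a multiple of 3 bytes keeps every chunk aligned to a group.
  while (chunk = io.read(3 * chunk_triples))
    chunk.bytes.each_slice(3) do |lo, mid, hi|
      break if hi.nil? # incomplete trailing group at end of file
      a = ((mid & 0x0F) << 8) | lo
      b = ((mid & 0xF0) << 4) | hi
      # Assumption: treat values with bit 11 set as negative.
      a -= 0x1000 if a >= 0x800
      b -= 0x1000 if b >= 0x800
      yield [a, b]
    end
  end
end

# Usage on an in-memory buffer; a File opened with 'rb' works the same way.
io = StringIO.new([0x34, 0x12, 0x56, 0xFF, 0xFF, 0xFF].pack('C*'))
each_fmt212_pair(io).each_with_index do |(a, b), x|
  puts [x, a, b].join("\t")
end
```

The output format matches the script above, so the gnuplot commands should work unchanged on it.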
Some equally throwaway gnuplot code to plot an output file signal.dat:
# plot entire file
plot 'signal.dat' using 1:2 title 'lead 1' with lines, 'signal.dat' using 1:3 title 'lead 2' with lines
# plot first 1024 data points of potentially huge files
plot '< head -1024 signal.dat' using 1:2 title 'lead 1' with lines, '< head -1024 signal.dat' using 1:3 title 'lead 2' with lines
Monday, July 23, 2012