by Travis Goodspeed <travis at radiantmachines.com>
in collaboration with Fabienne Serriere and Arjan Scherpenisse,
at Mediamatic's RFID Devcamp 2010, Amsterdam,
for Multithreaded Banjo Dinosaur Knitting Adventure 2D Extreme,
with kind thanks to David Carne.
Thanks to some extra-neighborly blackmail by Fabienne, I spent last week hacking up a storm in Amsterdam. By extending the work of Steven Conklin, Limor Fried, and Becky Stern, we were able to hack a Brother KH930 knitting machine to print high score panes from a custom video game in yarn. This game was then displayed at Mediamatic for SensorFest, leading all sorts of neighbors to learn that RFID, socializing, and beer can lead to a neighborly time. Never in my life did I expect to get such an adrenaline rush from knitting, much less from this newfangled social networking nonsense.
Conklin's technique for loading new patterns into the machine involves using an FTDI chip to simulate a Tandy PDD1 floppy disk drive, then typing "CE 5 5 1 STEP 1 STEP CE STEP STEP CE 9 0 3 STEP STEP" to load a new pattern set from disk and print pattern 3 of the set. (551 is the command to read from disk, and patterns held in RAM begin at 900.) For our exhibit, this sequence was far too complicated to type for every half meter of output, so we hacked the device's keypad for scripting.
This article describes how a matrix keypad works, how we reverse engineered the specific keypad of the Brother KH930, and how to use an Arduino with a handful of transistors to automate the typing of commands necessary for loading a new pattern from an emulated floppy disk. It should be applicable to all sorts of keyboards, but for those with serial protocols there might be a simpler method.
Unfortunately, there won't be room here to describe the first emergency knitting machine purchase in history, our A-Team trip to the far side of Holland in a borrowed van, or the case of Club Mate that we begged from a squatter bar in order to finish the project in the five days allotted.
A keypad matrix is most often built with switches that connect a row wire to a column wire. By keeping every column in a high-impedance state with pull-up resistors, then dropping each row low in sequence, the column inputs can be sensed to determine when a button is pressed. Dave's Hacks' first article on IM ME Hacking describes in detail how the keypad of the GirlTech IMME works, and all other implementations seem to function similarly.
The connections between rows and columns are just switches.
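The scanning scheme described above can be simulated in a few lines of Python. This is only a sketch of the general technique, not the KH930's firmware: the function name and the 4x4 size are made up for illustration, and electrical details such as ghosting and debouncing are ignored.

```python
def scan_matrix(pressed, n_rows, n_cols):
    """Return the (row, col) pairs that a scanning controller would detect.

    `pressed` is a set of (row, col) positions whose switches are closed.
    Columns idle high through pull-ups; one row at a time is driven low.
    """
    detected = []
    for row in range(n_rows):
        # Drive this row low. Every other row is left undriven.
        for col in range(n_cols):
            # A closed switch connects the column to the low row wire.
            column_level = 0 if (row, col) in pressed else 1
            if column_level == 0:
                detected.append((row, col))
    return detected


# One key held down at row 1, column 2 of a hypothetical 4x4 keypad.
print(scan_matrix({(1, 2)}, n_rows=4, n_cols=4))
```

Pressing a key from the outside therefore only requires closing the right row-to-column connection for long enough that a scan notices it.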
While in a perfect world, rows and columns would be arranged exactly as they appear visually, this is rarely efficient to route in copper for more than twelve buttons. We produced the diagram below by scanning the keyboard membrane, then pushing a button while using the continuity tester to see which row and column wire were connected. The table at the bottom converts from wire signal names to the Arduino signal names used within the client library.
In order to connect rows and columns to a center rail, I used BC547 NPN transistors prototyped on an Arduino shield. Each has a 1K resistor on the base. Row transistors have the collector coming from the row signal and the emitter going to the common rail; column transistors are the same, except that the collector and emitter are swapped in order to fit the direction of current.
As each transistor conducts only when its base is high, the Arduino can press a key by first dropping all outputs low, then raising a row control line (RCx) and a column control line (CCx) high. Because of debouncing and a limited scan rate, this must be held for a short amount of time before releasing the key.
Such a design fits perfectly well on a prototyping shield, although the transistors are a tight fit. I recommend first prototyping a single row and column to ensure that the transistors are properly aligned, as current direction might be different on your keypad.
For those who prefer to have a board fabbed, here's a proper schematic and layout. Gerbers and Eagle CAD source files are available in the project's subversion repository.
A more sophisticated software implementation would involve using the Arduino's serial port to communicate with the host, accepting strings in ASCII and translating them to keypresses. We decided against such a complication because we feared relying on a second long USB cable. Further, we thought it handy when necessary to be able to run the device stand-alone with a single (reset) button press from the operator signaling that the next pattern should be loaded.
Full keyboard emulator code is available either by pastebin or from SVN. You'll find it in banjo/code/arduino.
svn checkout svn://svn.mediamatic.nl/devcamps/camp10/banjo
The code works by using simple methods to press a row and column by raising those pin voltages. Higher level methods allow for such functions as loading a pattern from disk or printing a pattern, each by performing multiple key presses and occasionally inserting a delay. By placing this within the Arduino code's setup() function, the pattern is loaded whenever the device is reset.
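The layering described above, with key-level presses underneath command-level helpers, might look like the following host-side Python model. It is a sketch under stated assumptions rather than the Arduino code from the repository: the `KEYMAP` assignments, the hold and gap times, and the `write_pins` callback are all hypothetical stand-ins, and the real row and column assignments come from the membrane diagram.

```python
import time

# HYPOTHETICAL key map: (row control index, column control index) per key.
# The real assignments must be read off the keypad membrane.
KEYMAP = {
    "CE": (0, 0), "STEP": (0, 1),
    "0": (1, 0), "1": (1, 1), "3": (1, 2), "5": (2, 0), "9": (2, 1),
}


def press_sequence(command, write_pins, hold_s=0.2, gap_s=0.2, sleep=time.sleep):
    """Press each key named in `command`, a whitespace-separated string.

    `write_pins(row, col)` raises one row and one column control line,
    or drops every line low when called with (None, None).
    """
    for key in command.split():
        row, col = KEYMAP[key]
        write_pins(None, None)   # all control lines low: nothing pressed
        write_pins(row, col)     # raise one RCx and one CCx: key pressed
        sleep(hold_s)            # hold long enough to survive debouncing
        write_pins(None, None)   # release the key
        sleep(gap_s)


# Record the pin writes instead of touching hardware, and skip the delays.
log = []
press_sequence("CE 5 5 1 STEP", lambda r, c: log.append((r, c)),
               sleep=lambda seconds: None)
print(log)
```

Passing the pin writer and the sleep function as parameters keeps the sequencing logic testable without a knitting machine attached.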
New patterns are loaded from the Banjo Dinosaur Knitting Adventure 2D Extreme game by first starting a floppy disk emulator--as per Steve Conklin's technique--then pulsing the DTR line of the Arduino in order to cause a reset. The Perl for that is just a single line,
Device::SerialPort->new("/dev/ttyACM0")->pulse_dtr_on(100)
The end result, with a machine typing by itself, looks a little like this,
That's all there is to it. With fourteen transistors and just as many resistors, you can script your knitting machine--or any other keypad device--from a microcontroller without modifying the underlying firmware.
As a final note, I will give a cookie to the first neighbor who uses this technique to dump all of the power codes from a universal infrared remote control, or to program something long and sophisticated into a graphing calculator that lacks a link port.
Monday, December 6, 2010
Monday, November 1, 2010
Bitmapped Sprites on the GirlTech IMME
by Travis Goodspeed <travis at radiantmachines.com>
with sprites by Eli Skipp,
extending Dave's LCD reversing,
and with thanks to Mike Ossmann.
The GirlTech IMME is a fine platform for radio hacking and embedded programming, but the LCD of the device is by no means designed for lightning fast graphics. In this brief article, I demonstrate the method by which a sprite library can be constructed which abstracts away the less neighborly minutiae of the LCD in favor of a row-wise framebuffer that gets flushed to the LCD as appropriate. This isn't fast enough for a port of Sonic the Hedgehog, but it's certainly sufficient for ZombieGotcha, a network disease simulation game that Eli Skipp and I are prototyping.
To quickly recap, the GirlTech IMME is a children's toy powered by the CC1110F32, which combines an 8051 microcontroller with a versatile sub-GHz radio. The toy also includes a text LCD controller and keyboard, making it a delightful platform for embedded systems hacking and prototyping. Thanks to Dave's article on attaching a debugger and reversing the LCD, it is possible to wire a GoodFET into the IMME, allowing for debugging and reprogramming of the device.
Since the days of the first raster displays, computers have been drawing images as rows, because that's the way that a television's electron stream is traced along the display. The IMME draws rows internally, but because it is intended to draw fonts, its rows are eight pixels tall. For example, the bytes {0x7f, 0x08, 0x08, 0x08, 0x7f, 0x00} are pushed to the LCD in order to draw an uppercase letter H in Dave's LCD Demo for the IMME. Drawing this on graph paper or rendering it with Python, it becomes clear that 0x7F represents a side of the letter, 0x08 represents the bar, and 0x00 is the space following the letter.
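A rendering of those six bytes takes only a few lines of Python. The sketch below assumes that each byte is one column of the stripe and that bit 0 is the top pixel. The bit order is my assumption, and for this glyph it only decides whether the blank row lands at the bottom or the top.

```python
def render_stripe(columns, height=8):
    """Render the column bytes of one 8-pixel-tall stripe as text rows.

    Assumes bit 0 of each byte is the top pixel of its column.
    """
    rows = []
    for y in range(height):
        rows.append("".join("#" if (byte >> y) & 1 else "." for byte in columns))
    return rows


H = [0x7f, 0x08, 0x08, 0x08, 0x7f, 0x00]
for line in render_stripe(H):
    print(line)
```

The two 0x7f columns come out as the vertical strokes, the three 0x08 columns as the crossbar, and the trailing 0x00 as the inter-letter gap.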
Rather than attempting to natively store a framebuffer in the LCD's format, I chose to internally index by row, then to translate during the export of the framebuffer to the LCD. My primary reason for doing this is to maintain portability of the ZombieGotcha code to custom hardware. It also simplifies the transcription of sprites from original bitmaps to C structures.
Having an internal framebuffer is not without cost, however. The buffer is 132 pixels wide and 64 pixels tall. Even with bit-packing, this is more than a kilobyte in size, consuming more than a quarter of the CC1110's 4kB of XDATA RAM. (Programs and sprites are stored in CODE Flash memory, of which there is 32kB, so this limit is not quite so severe as it sounds.)
My sprite toolchain begins with bitmap images submitted by the artist. I cannot stress enough how important it is that your pixel artist understand pixels. Every sprite must be drawn at native resolution, with proper offsets of each frame and an understanding that the LCD is what it is, regardless of what Photoshop might render.
From the original bitmap sprites, PNM files are produced that are more easily parsed by Perl. This format consists of whitespace-delimited words for the format, width, and height of the sprite. In this case, the sprite consists of three frames, each of them being 24 pixels wide and 32 pixels tall. Other sizes are possible, of course, but it is convenient for bit packing that the width is a multiple of 8, so that the old pixels needn't be read back in.
It is then necessary to use an ugly Perl script to convert the sprite from a PNM to a C array of bytes. Pretty Perl won't work, of course, because such a thing doesn't exist. My script, sprite2c.pl, takes a 24-bit color PNM file and converts it to a 1-bit map that is byte-packed. Each of these is included by the preprocessor as a __code unsigned char[], with the __code keyword implying that the object should be stored in Flash rather than RAM.
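For readers without the Perl script at hand, the same conversion can be sketched in Python. This is not sprite2c.pl, only an illustration of the idea, and it makes several assumptions that the original may not share: the input is an ASCII (P3) PNM, pixels darker than mid-grey count as set, and the leftmost pixel of each group of eight goes into the most significant bit.

```python
def pnm_to_c(pnm_text, name):
    """Convert an ASCII (P3) PNM image to a bit-packed C array declaration."""
    # Drop '#' comments, then split everything on whitespace.
    words = []
    for line in pnm_text.splitlines():
        words.extend(line.split("#", 1)[0].split())
    magic, width, height, maxval = words[0], int(words[1]), int(words[2]), int(words[3])
    if magic != "P3":
        raise ValueError("only ASCII P3 files are handled in this sketch")
    if width % 8:
        raise ValueError("width must be a multiple of 8")
    samples = [int(w) for w in words[4:4 + 3 * width * height]]
    packed = []
    for i in range(0, width * height, 8):
        byte = 0
        for bit in range(8):
            r, g, b = samples[3 * (i + bit):3 * (i + bit) + 3]
            # Assumption: dark pixels are "on", leftmost pixel is the MSB.
            if r + g + b < 3 * maxval / 2:
                byte |= 0x80 >> bit
        packed.append(byte)
    body = ", ".join("0x%02x" % b for b in packed)
    return "__code unsigned char %s[] = { %s };" % (name, body)


# An 8x2 test image: one black pixel then seven white, then a black row.
img = "P3\n# tiny test\n8 2\n255\n" + "0 0 0 " + "255 255 255 " * 7 + "0 0 0 " * 8
print(pnm_to_c(img, "sprite"))
```

Multiple animation frames are not treated specially here. If the frames are stacked in one image, they simply become consecutive runs of bytes in the array.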
Sprite animations are performed by storing the frame count along with the width and height in the C structure. Considering the resource constraints of this system, frame counts change very rarely and are stored within the C code.
Dumping the frame-buffer to the LCD is as simple as writing every stripe of data to the screen. Potentially, as an optimization, you could keep track of which stripes have been invalidated across what horizontal range, updating only where necessary.
Keeping in mind that the LCD is updated as stripes of 8 pixels in height, code such as the following will refresh the entire LCD from the frame-buffer. Be sure not to erase the LCD between writes, as that would cause unnecessary flickering.
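The translation from a row-wise buffer to 8-pixel stripes can be modeled in Python as below. This is a sketch of the conversion only, not the C code from the library: the bit order inside the buffer and within each stripe byte (bit 0 on top) are assumptions, and writing the bytes to the LCD controller is left out entirely.

```python
WIDTH, HEIGHT = 132, 64


def get_pixel(fb, x, y):
    """Read one pixel from a row-major, bit-packed framebuffer."""
    index = y * WIDTH + x
    return (fb[index // 8] >> (index % 8)) & 1


def framebuffer_to_stripes(fb):
    """Yield (stripe_number, column_bytes) for each 8-pixel-tall stripe."""
    for stripe in range(HEIGHT // 8):
        columns = bytearray(WIDTH)
        for x in range(WIDTH):
            byte = 0
            for bit in range(8):
                if get_pixel(fb, x, stripe * 8 + bit):
                    byte |= 1 << bit   # assumption: bit 0 is the top pixel
            columns[x] = byte
        yield stripe, bytes(columns)


# Light a single pixel at x=5, y=9 and see where it lands.
fb = bytearray(WIDTH * HEIGHT // 8)
fb[(9 * WIDTH + 5) // 8] |= 1 << ((9 * WIDTH + 5) % 8)
stripes = dict(framebuffer_to_stripes(fb))
print(stripes[1][5])
```

Row 9 falls in the second stripe, one pixel below its top edge, so only bit 1 of column 5 in stripe 1 is set.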
The framebuffer elements themselves are grabbed by selecting the pixel index divided by 8, then masking off the selected bit. The following functions work admirably for this purpose.
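As an illustration of that indexing, here is a minimal Python model of the accessors. It is a sketch rather than the library's C functions: the names are mine, and I assume the pixel index is y*132+x with the low three bits selecting the bit within the byte.

```python
WIDTH, HEIGHT = 132, 64
FB_BYTES = WIDTH * HEIGHT // 8   # 8448 pixels packed into 1056 bytes


def set_pixel(fb, x, y, on=True):
    """Set or clear one pixel in a row-major, bit-packed framebuffer."""
    index = y * WIDTH + x
    if on:
        fb[index // 8] |= 1 << (index % 8)
    else:
        fb[index // 8] &= ~(1 << (index % 8)) & 0xFF


def get_pixel(fb, x, y):
    """Return 1 if the pixel is set, else 0."""
    index = y * WIDTH + x
    return (fb[index // 8] >> (index % 8)) & 1


fb = bytearray(FB_BYTES)
set_pixel(fb, 131, 63)            # bottom-right corner
print(FB_BYTES, get_pixel(fb, 131, 63))
```

The 1056-byte total is the "more than a kilobyte" figure quoted earlier for the 4kB of XDATA.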
If greater performance is required, a game should certainly be designed with a custom graphics library that optimizes the few things it does most. Performance can be gained by storing sprites and the frame-buffer natively in the row/stripe format that the LCD expects, simplifying conversion. Mike Ossmann used the technique of basing most graphics around vertical lines in his spectrum analyzer firmware, allowing for channels to be redrawn as they are scanned rather than flushed from a frame-buffer.
As the ZombieGotcha game is largely turn-based and the frame-rate is not a dire concern, I don't expect to optimize the sprite library much beyond what is presented here.
One major addition will be that of screenshots, as I'd like to produce animated GIFs of game action for the website. Screenshots can be produced by the method that I outline in CC1110 Instrumentation with Python, dumping the frame-buffer from XDATA with a GoodFET and writing that to disk. Alternatively, full-speed screenshots could be dumped by sniffing the LCD's SPI bus using a Total Phase Beagle or other SPI protocol analyzer.
For those of you with an IMME, Eli and I will be releasing the ZombieGotcha game at the Twenty-Seventh Chaos Communication Congress in Berlin this winter. We'll bring a bed of nails for reflashing IMME units on the spot, as well as GoodFET kits for modifying your IMME to be a development kit. (The ZombieGotcha will also run on full-custom hardware, but we are maintaining IMME support in parallel.)
As a final note, a teaser of the ZombieGotcha opening screen can be found here in the Intel-Hex format. I'll give a GoodFET40 kit to the first neighbor who sends me an animated GIF of it and a script to generate the same, either by a debugger or an emulator.
Tuesday, October 5, 2010
CC1110 Instrumentation with Python
by Travis Goodspeed <travis at radiantmachines.com>
concerning the GirlTech IMME,
utilizing the GoodFET's Chipcon Debugger
to instrument Michael Ossmann's $16 Pocket Spectrum Analyzer.
Often a neighbor, such as myself, finds himself with a damned useful tool that's missing logging. Further, when this is a black-box application, it is undeniably inconvenient to patch in logging where no communications method exists. In this brief article, I will demonstrate how to instrument a Chipcon CC1110 application using Python and a GoodFET with zero bytes of modification to the original firmware image. My target's source code is available, but the technique applies just as well to black-box firmware.
Specifically, I want to dump the raw data from Michael Ossmann's $16 Pocket Spectrum Analyzer in order to demo it on stage at our Toorcon 12 talk, Real Men Carry Pink Pagers. I'll assume the reader to be familiar with Mike's article, as well as my article on wiring an IMME for debugging with a GoodFET. Readers wishing to play along can order a GoodFET31 for little or no money by following these instructions.
From the datasheet or Mike's Makefile it can be seen that the external RAM (XDATA, which is called external for historic reasons) section begins at 0xF000. The only structure at this location is xdata channel_info chan_table[NUM_CHANNELS], where NUM_CHANNELS is defined to be 132 and the channel_info struct is eight bytes, defined as
typedef struct {
  /* frequency setting */
  u8 freq2;
  u8 freq1;
  u8 freq0;
  /* frequency calibration */
  u8 fscal3;
  u8 fscal2;
  u8 fscal1;
  /* signal strength */
  u8 ss;
  u8 max;
} channel_info;
The first several entries in this struct might be as follows. The first of these lines defines the frequency to be 0x22af4b, the frequency calibration to be 0xef2c27, and the signal strength to be 0x41. The final byte defines the maximum strength, which is zero unless max-highlighting mode is turned on.
22 af 4b ef 2c 27 41 00
22 b1 43 ef 2c 27 3e 00
22 b3 3b ef 2c 27 46 00
22 b5 33 ef 2c 27 3c 00
22 b7 2b ef 2c 27 44 00
22 b9 23 ef 2c 28 3c 00
22 bb 1b ef 2c 28 42 00
22 bd 13 ef 2c 28 3f 00
22 bf 0b ef 2c 28 3d 00
22 c1 03 ef 2c 28 40 00
22 c2 fb ef 2c 28 3d 00
22 c4 f3 ef 2c 28 3e 00
...
This information can be extracted by halting and resuming the CPU, just as I did in my article on the PRNG Vulnerability of Z-Stack's ZigBee SEP ECC implementation. That is, GoodFETCC.CCpeekdatabyte() is used to grab these bytes from the CC1110's RAM. The following code dumped the fragment shown above.
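#!/usr/bin/env python
# Dump the spectrum analyzer's channel table from XDATA once per second.
import sys
import time

# Path to the GoodFET client checkout; adjust for your own machine.
sys.path.append('/Users/travis/svn/goodfet/trunk/client/')
from GoodFETCC import GoodFETCC
from intelhex import IntelHex16bit, IntelHex

client = GoodFETCC()
client.serInit()
client.setup()
client.start()

bytescount = 8 * 132   # eight bytes for each of 132 channels
bytestart = 0xf000     # start of XDATA, where chan_table lives

while 1:
    time.sleep(1)
    client.CChaltcpu()             # halt the CPU while reading its RAM
    dump = ""
    for foo in range(0, bytescount):
        dump = "%s %02x" % (dump, client.CCpeekdatabyte(bytestart + foo))
        if foo % 8 == 7:
            dump = dump + "\n"     # one table entry per line
    print(dump)
    sys.stdout.flush()
    client.CCreleasecpu()          # let the firmware run again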
To get proper data, it is necessary to add a wake-up delay of a few seconds, so that the table is populated by the first sampling. Additionally, it is handy to convert the frequency to MHz from its native unit (Hz/396.728515625). Finally, successive pages should be separated by a time column in order to facilitate graphing. The result is this script, which can be found in the GoodFET project as "contrib/ccspecantap/specantap.py".
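The unit conversion is easy to check by hand against the first row of the dump. The helper below is a small sketch, not an excerpt from specantap.py: the function and field names are mine, and it only decodes one eight-byte row of the table using the Hz-per-unit figure given above.

```python
HZ_PER_UNIT = 396.728515625   # native frequency unit, as given in the text


def decode_entry(line):
    """Decode one dumped channel_info row of eight hex bytes."""
    b = [int(word, 16) for word in line.split()]
    if len(b) != 8:
        raise ValueError("expected eight bytes")
    freq_word = (b[0] << 16) | (b[1] << 8) | b[2]
    return {
        "mhz": freq_word * HZ_PER_UNIT / 1e6,
        "fscal": (b[3] << 16) | (b[4] << 8) | b[5],
        "ss": b[6],
        "max": b[7],
    }


entry = decode_entry("22 af 4b ef 2c 27 41 00")
print("%.3f MHz, ss=0x%02x" % (entry["mhz"], entry["ss"]))
```

The frequency word 0x22af4b works out to roughly 901.8MHz, and each successive row of the dump is 0x1f8 units higher, or about 200kHz per channel.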
An example recording can be seen below, as a spectrum recording of a friend's Kenwood FreeTalk UBZ-AL14 FM Transceiver unit. This is compatible with other family FM radios, and I wanted to know which center frequencies were in use. In the first image, I've highlighted the noise floor as blue and the peaks as green. In the second image, the time domain is used to show broadcasts on several channels in sequence, all of which fall into one of the two center frequencies: 925MHz and 935MHz. The raw data is here, best viewed with NASA's excellent Viewpoints tool.
Future additions to this project will include integration into the GoodFET radio framework, as well as a 2.4GHz version built around the CC2430 and CC2530. The performance of the tap can be greatly improved by using block transactions rather than peeking individual bytes, and the view can be re-tuned by poking the configuration IDATA variables, positions of which are described in specan.rst as produced by the SDCC compiler.
It's also handy to know that a Chipcon radio will halt if the 0xA5 opcode is executed. In this manner, patching a single byte of a while() loop can allow the state at every iteration to be dumped.
This technique of scripted debugging through a programmer can be handy for more than instrumentation. Unit tests can be run on an assembly line without additional wiring, and--by single-stepping--execution traces of unknown firmware can be generated for assisting in reverse engineering. Those with enough patience can use this to fuzz test embedded systems for security vulnerabilities.
For help instrumenting your own Chipcon or MSP430 project, join us in #goodfet on irc.freenode.net or send me an email. The project has advanced to the point where interesting things can be done from Python alone, and I'd quite like to see what could be done with it.
Sunday, July 4, 2010
Reversing an RF Clicker
by Travis Goodspeed <travis at radiantmachines.com>
concerning the Turning Point ResponseCard RF,
having FCC ID R4WRCRF01,
patented as USA 7,330,716.
In this article, I describe in detail the methods by which I have reverse engineered the TurningPoint ResponseCard RF, casually known among students as a "Clicker". This 2.4GHz radio transceiver is used in undergraduate university classrooms for automated roll-call and in-class quizzing or voting. By dumping and analyzing its firmware, one can determine the radio protocol necessary to intercept and forge packets, as well as to build a custom base station. The radio hardware that I have used is a reprogrammed Next HOPE Badge running the GoodFET firmware.
A follow-up article will likely describe the writing of replacement firmware, but that can be easily enough discovered by an enterprising reader. My purpose instead is to provide the information necessary to build compatible products, as well as to teach the technique of reverse engineering these products to find such information when none is available.
The Clicker's keypad is attached only with adhesive, and it can be pulled off after lifting an edge with a knife blade. Beneath the keypad, there are four screws holding the board in place, plus a fifth from the rear of the device. If you are lucky, these will be small Phillips screws, but the unlucky will find tri-wing "Nintendo" screws. I was lucky to have one of each type, but those with neither a Phillips-screwed Clicker nor a tri-wing screwdriver can buy one or try one of these tricks.
In either case, it isn't strictly necessary to open your clicker, as test-points for dumping and replacing its firmware are accessible from the battery compartment. Further, the radio communications are accessible with no hardware access whatsoever.
The Clicker is built upon a Nordic nRF24E1 chip, which combines an 8051 microcontroller with an nRF2401 radio transceiver. Although the two cores have been combined into a single package, the 8051 core speaks to the radio through a few bit-field registers and an internal SPI bus, which is shared with the external SPI bus.
As the nRF24E1 lacks internal non-volatile storage, a CAT25C32 SPI EEPROM is used for program and configuration storage. Within the microcontroller, there is a masked ROM bootloader from 8000h to 81FFh that loads executable code from the EEPROM into executable RAM from 0000h to 0FFFh.
At the base of the circuit board's primary side, there are test points for the SPI EEPROM. As the default firmware only uses the SPI bus when buttons are pressed, this EEPROM may be dumped at any point after the device has booted. The test points are as follows, which should be matched to those of equivalent names in the GoodFET SPI Table. They were determined by use of a continuity tester.
In order to dump the firmware, I quickly wrote a GoodFET client for the 25C32 using its datasheet. A read is performed by sending {0x03, AH, AL, 0} as a SPI transaction, that is, the READ opcode followed by the high and low address bytes, with the result coming back as the fourth byte. Doing it this way with the GoodFET's SPI driver is slower than having C code within the GoodFET dump the whole ROM, but it's fast enough for a dump and takes very little code.
From this point, I dumped the firmware with 'goodfet.spi25c dump image.hex', converted the Intel Hex file to binary, and popped it open in Emacs/hexl. The result looks something like the following, whose format is described in the nRF24E1 datasheet. The opening passage is {u8 config, u8 entry offset, u8 blockcount}. Here {0x0B, 0x07, 0x0B} means that executable code begins at byte 0x07, and that the total image length is 0x0B*256==2,816 bytes. (Additional space within the SPI ROM is unused and left as 0xFF.)
To produce an image suitable for a disassembler, I cut the bytes before 0x07 to make an image beginning with {0x02, 0x0A, 0xB7, ...}. The extra bytes in this region are the serial number and default frequency, but we'll get back to that later.
As the firmware is only three kilobytes, it doesn't take terribly long to reverse engineer. First, the Special Function Registers (SFR) which are defined on pages 79 and 81 of the nRF24E1 datasheet are fed to the disassembler.
(I'm using IDA Pro here, but any 8051 disassembler with a decent text editor could suffice. All of the following function labels are from my imagination, while Special Function Registers (SFRs) come from the nRF24E1 datasheet.)
For example, "MOV 0xA0, #0x80" is rather opaque, but "MOV RADIO, #0x80" makes it clear that the immediate value 0x80 is being placed into the RADIO register. Page 89 of the datasheet will then explain that the high bit of the radio register is power control, so this instruction is powering up the radio for use. Similarly, "SETB RADIO.3" is setting the fourth bit of the RADIO register, which the datasheet describes as raising the CS signal.
Once the SFR addresses are known, it becomes useful to search for them in order to identify the I/O routines. In the nRF24E1, the radio is accessed across a SPI bus, so a good first step is to identify the SPI routine. The function containing this code will always include a MOV involving the SPI_DATA register.
Having this, a list of cross-references quickly shows that while few functions call the SPIRXTX function, each calls it many times. This is because the author has chosen to repeatedly call that function with immediate values, rather than to dump an array of bytes with a for(){} loop.
While the disassembler can automatically identify the function entry points in the table above, it is not capable of giving them English names or descriptions. To understand how this is done, it is necessary to read the datasheets of the SPI devices.
The SPI EEPROM chip, a CAT25C32, is used by dropping the !CS line then writing an opcode byte followed by its parameters or results. Opcodes include WREN/WRDI for write protection, RDSR/WRSR for accessing a status register, and READ/WRITE for reading and writing bytes. A WRITE may only be performed when the external !WP pin is low and the software write protect has been disabled by opcode. A transaction begins when !CS drops low and ends when it drops high.
To identify the function which reads a byte from the 25C32, a few things can be safely assumed: (1) The function will begin by dropping some I/O pin (!CS). (2) The function will then broadcast the READ opcode, 0x03. (3) It will then broadcast a sixteen bit parameter; that is, DPL followed by DPH. (4) Finally, it will return the result of a fourth SPIRXTX call. In pseudocode, that would be something like
Sure enough, one of the few functions calling SPIRXTX does exactly this. The constant pushing and popping of the parameters is a quirk of the compiler, which might possibly allow it to be identified. From the code below, it is clear that P0.0 is the !CS line of the CAT25C32.
The SPIROMPOKE function looks similar, except that two transactions are performed. First the WREN (0x06) opcode is sent to enable writing, then WRITE (0x02) is used to perform the actual write.
The other SPI operations concern the nRF2401 radio core, which behaves differently from the EEPROM. Rather than transactions being an opcode followed by parameters, there is only a single SPI register that must be completely written during a transaction. A second register, selected by the CE line, contains the packets.
The configuration is set by one big register, sent MSBit first. If fewer than the needed bytes are sent, the value is right-aligned into the lower bytes of the register. That is, the last byte sent is always (CHAN<<1)|RXMODE and the second to last always describes the radio configuration.
Searching around a bit yields the RADIOWRCONFIG function, the tail of which is below. It can be seen from the code that the 0x1A IRAM byte holds the channel number. That is, if 0x20 is stored at 0x1A, the radio will be configured to 2,432 MHz. The other configuration bytes reveal that the MAC addresses are 24 bits, the checksum is 16 bits, and the device broadcasts at maximum power sourced from a 16MHz crystal. (That the configured crystal is identical to the one on the board is very important. Some enterprising coders will lie to a chip about its crystal in order to access an unsupported radio frequency.)
At this point, it still remains to sniff traffic is to find the target address to which packets are broadcast as well as the frequency. We'll start with the address, because that's a bit easier.
The TXPACKET function involves a lot of PUSH and POP instructions, but it otherwise looks very similar to the RADIOWRCONFIG function, in that a series of bytes are written in order with repeated function calls to SPIRXTX. In pseudocode, this function becomes the following. From the radio documentation and configuration, it is clear that the first three bytes will be the target MAC address. From the RADIOWRCONFIG() function, it is equally clear that the three bytes at 0x1B are the receiving MAC address of the unit. (The parameter of the function happens to be the button press, as can be determined by tracking the keyboard I/O routines or viewing a few packets.)
The radio itself will append a 16-bit CRC; therefore, the full packet then becomes {u24 tmac, u24 smac, u8 button}.
To determine the value of the target MAC address, just grep the disassembly for "mov" and one of 0x1E, 0x1F, 0x20. The relevant instructions are as follows, setting the target MAC address to 0x123456. (In 8051 notation, the first instruction moves the immediate constant #0x12 into byte 0x1E of IRAM.)
As this point, it would be possible to scan each channel for a few seconds, listening for packets sent to that address, but it's classier to find the value by static analysis. Acting on the hunch that the configuration is held in EEPROM and looking for references to the SPIROMPEEK() function, the READIDFREQ() function can be found. As can be seen in the fragment below, EEPROM[6] holds the channel number while the MAC address is at EEPROM[3,4,5].
As the EEPROM begins with "0b 07 0b 15 79 1b 29", it's clear that the MAC address of the unit from which it came is 0x15791B and that it is broadcasting on 2400+0x29=2441MHz. This can be double-checked by the serial number "15791B" being printed on the label.
Knowing the modulation scheme, target address, and packet contents, it becomes possible to sniff traffic from a Clicker. This is performed by use of the GoodFET firmware on a Next Hope badge, my prior tutorial for which describes the process of packet sniffing.
The NHBadge board contains an nRF24L01+ radio, which differs dramatically from the nRF2401 in terms of how it is configured. Still, the radios are sufficiently compatible. The following hack of the goodfet.nrf client allows packets to be sniffed from the air with proper checksumming.
Sure enough, here are some packets of the 5 button being pressed on unit 1F8760. The keypress is the final byte in ASCII.
Now that it is clear how to receive and recognize button presses, it becomes necessary to reverse engineer the response codes which might be sent from the access point. Without hearing a reply of at least an ACK, the Clicker will continue to broadcast each message more than three hundred times. This takes more than ten seconds, during which all other key presses are ignored.
The broadcast loop within the MAIN() function would look a little like this in C.
This region is easy enough to find, but there's another command mode. An easier target is the channel hopping routine, which constantly broadcasts 0x3F while incrementing the channel, sticking with the last one on which a reply of 0x18 was received. Channels 1 through 83 are attempted; that is, 2,401 MHz to 2,483MHz at 1MHz steps.
Checking this code within the MAIN() function reveals that its effect is to blink the green LED (P1.1) six times, exiting the broadcast loop. Other commands include 0x04 (LED Off), 0x06 (LED Green), 0x15 (LED Red), 0x11 (Blink Green), 0x14 (Blink Red), and 0x18 (Blink Green, Channel Lock). All undefined opcodes set the red LED.
By sniffing traffic within a classroom, it is possible to watch votes as they are being cast by students. Similarly, packets could be broadcast by a reprogrammed Clicker or NHBadge to make a student in virtual attendance, automatically voting with the majority so as to gain perfect attendance and a solid C quiz average. Where instant feedback is available, this might even allow for a solid A quiz average. Without taking advantage of the masked-ROM option of the nRF24E1, the code cannot be even slightly protected from extraction and reverse engineering.
Less adventurous users can jam the network by running 'goodfet.nrf carrier 2441000000' to hold a carrier wave on the channel. The only attempt at a frequency change is made when pressing the GO button, at which point the new channel can be discovered and similarly jammed.
Since performing this work, it has come to my attention that a USRP plugin for doing this to the competing 900MHz iClicker product is available as http://gr-clicker.sourceforge.net/. Additionally, the infrared Clicker units were broken with a little tool called Survey Says. I have ordered more sophisticated Clicker models from CPS and Turning Point, and proper descriptions of them will soon follow.
concerning the Turning Point ResponseCard RF,
having FCC ID R4WRCRF01,
patented as USA 7,330,716.
In this article, I describe in detail the methods by which I have reverse engineered the TurningPoint ResponseCard RF, casually known among students as a "Clicker". This 2.4GHz radio transceiver is used in undergraduate university classrooms for automated roll-call and in-class quizzing or voting. By dumping and analyzing its firmware, one can determine the radio protocol necessary to intercept and forge packets, as well as to build a custom base station. The radio hardware that I have used is a reprogrammed Next HOPE Badge running the GoodFET firmware.
A follow-up article will likely describe the writing of replacement firmware, but that can be easily enough discovered by an enterprising reader. My purpose instead is to provide the information necessary to build compatible products, as well as to teach the technique of reverse engineering these products to find such information when none is available.
Disassembly
The Clicker's keypad is attached only with adhesive, and it can be pulled off after lifting an edge with a knife blade. Beneath the keypad, there are four screws holding the board in place, plus a fifth from the rear of the device. If you are lucky, these will be small Phillips screws, but the unlucky will find tri-wing "Nintendo" screws. I was lucky to have one of each type, but those with neither a Phillips-screwed Clicker nor a tri-wing screwdriver can buy one or try one of these tricks.
In either case, it isn't strictly necessary to open your clicker, as test-points for dumping and replacing its firmware are accessible from the battery compartment. Further, the radio communications are accessible with no hardware access whatsoever.
Hardware
The Clicker is built upon a Nordic nRF24E1 chip, which combines an 8051 microcontroller with an nRF2401 radio transceiver. Although the two cores have been combined into a single package, the 8051 core speaks to the radio through a few bit-field registers and an internal SPI bus, which is shared with the external SPI bus.
As the nRF24E1 lacks internal non-volatile storage, a CAT25C32 (pdf) SPI EEPROM is used for program and configuration storage. Within the microcontroller, there is a masked ROM bootloader from 8000h to 81FFh that loads executable code from the EEPROM into executable RAM from 0000h to 0FFFh.
Dumping Firmware
At the base of the circuit board's primary side, there are test points for the SPI EEPROM. As the default firmware only uses the SPI bus when buttons are pressed, this EEPROM may be dumped at any point after the device has booted. The test points are as follows, which should be matched to those of equivalent names in the GoodFET SPI Table. They were determined by use of a continuity tester.
T4 | MISO |
T5 | SCK |
T6 | MOSI |
T3 | !CS |
T1 | VCC |
GND | GND |
In order to dump the firmware, I quickly wrote a GoodFET client for the 25C32 using its datasheet. A read is performed by sending {0x03, AH, AL, 0} as a SPI transaction, with the result coming back as the fourth byte. Doing it this way with the GoodFET's SPI driver is slower than having C code within the GoodFET dump the whole ROM, but it's fast enough for a dump and takes very little code.
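As a rough sketch of that byte-at-a-time dump, the following Python builds one four-byte READ transaction per address. The `transfer` callable is a hypothetical stand-in for whatever SPI exchange the client provides (it is not the real GoodFET API), and `make_fake_eeprom` exists only so that the sketch runs without hardware. It uses the READ opcode 0x03 and assumes the high address byte goes first, as is usual for 25xx-series parts.

```python
READ = 0x03  # CAT25C32 READ opcode


def read_request(addr):
    """Build the four bytes clocked out for a one-byte read."""
    return bytes([READ, (addr >> 8) & 0xFF, addr & 0xFF, 0x00])


def peek(transfer, addr):
    """Run one transaction; the data arrives as the fourth byte."""
    reply = transfer(read_request(addr))
    return reply[3]


def dump(transfer, length):
    """Slow but simple: one SPI transaction per byte."""
    return bytes(peek(transfer, addr) for addr in range(length))


def make_fake_eeprom(contents):
    """Hypothetical stand-in for real hardware, for testing only."""
    def transfer(request):
        opcode, high, low, _ = request
        if opcode != READ:
            raise ValueError("only READ is simulated")
        # Three junk bytes come back while the request is clocked out.
        return bytes([0xFF, 0xFF, 0xFF, contents[(high << 8) | low]])
    return transfer


rom = bytes.fromhex("0b070b15791b29")
print(dump(make_fake_eeprom(rom), len(rom)).hex())
```

Swapping the fake for a real SPI exchange should be the only change needed, provided the adapter holds !CS low across all four bytes.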
From this point, I dumped the firmware with 'goodfet.spi25c dump image.hex', converted the Intel Hex file to binary, and popped it open in Emacs/hexl. The result looks something like the following, whose format is described in the nRF24E1 datasheet. The opening passage is {u8 config, u8 entry offset, u8 blockcount}. Here {0x0B, 0x07, 0x0B} means that executable code begins at byte 0x07, and that the total image length is 0x0B*256==2,816 bytes. (Additional space within the SPI ROM is unused and left as 0xFF.)
To produce an image suitable for a disassembler, I cut the bytes before 0x07 to make an image beginning with {0x02, 0x0A, 0xB7, ...}. The extra bytes in this region are the serial number and default frequency, but we'll get back to that later.
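The header arithmetic and the cut can be sketched in a few lines of Python. The function name is my own invention, and the sketch simply keeps everything from the entry offset onward, padding included, rather than guessing where the code really ends.

```python
def parse_header(image):
    """Split an EEPROM dump into (config, entry offset, length, code)."""
    config, entry, blocks = image[0], image[1], image[2]
    length = blocks * 256      # block count is in 256-byte units
    code = image[entry:]       # what gets handed to the disassembler
    return config, entry, length, code


# The first seven bytes of the dump, followed by the first instruction.
image = bytes.fromhex("0b070b15791b29") + bytes([0x02, 0x0A, 0xB7])
config, entry, length, code = parse_header(image)
print(hex(config), entry, length, code.hex())
```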
Firmware Analysis
As the firmware is only three kilobytes, it doesn't take terribly long to reverse engineer. First, the Special Function Registers (SFR) which are defined on pages 79 and 81 of the nRF24E1 datasheet are fed to the disassembler.
(I'm using IDA Pro here, but any 8051 disassembler with a decent text editor could suffice. All of the following function labels are from my imagination, while Special Function Registers (SFRs) come from the nRF24E1 datasheet.)
For example, "MOV 0xA0, #0x80" is rather opaque, but "MOV RADIO, #0x80" makes it clear that the immediate value 0x80 is being placed into the RADIO register. Page 89 of the datasheet will then explain that the high bit of the radio register is power control, so this instruction is powering up the radio for use. Similarly, "SETB RADIO.3" is setting the fourth bit of the RADIO register, which the datasheet describes as raising the CS signal.
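For those without a disassembler that takes SFR definitions, the same renaming can be approximated by a text substitution. This is only a sketch: the table holds just the one register named above, and the bit decoder covers only the two RADIO bits that this article describes.

```python
import re

SFR_NAMES = {0xA0: "RADIO"}  # only the register named in the text


def name_sfrs(line, names=SFR_NAMES):
    """Replace a direct-address operand with its SFR name, if known."""
    def swap(match):
        addr = int(match.group(0), 16)
        return names.get(addr, match.group(0))
    # The lookbehind leaves immediates such as #0x80 alone.
    return re.sub(r"(?<!#)0x[0-9A-Fa-f]+", swap, line)


def radio_bits(value):
    """Decode the two RADIO bits described in the text."""
    return {"power_up": bool(value & 0x80), "cs": bool(value & 0x08)}


print(name_sfrs("MOV 0xA0, #0x80"))
print(radio_bits(0x80))
```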
Once the SFR addresses are known, it becomes useful to search for them in order to identify the I/O routines. In the nRF24E1, the radio is accessed across a SPI bus, so a good first step is to identify the SPI routine. The function containing this code will always include a MOV involving the SPI_DATA register.
Having this, a list of cross-references quickly shows that while few functions call the SPIRXTX function, each calls it many times. This is because the author has chosen to repeatedly call that function with immediate values, rather than to dump an array of bytes with a for(){} loop.
While the disassembler can automatically identify the function entry points in the table above, it is not capable of giving them English names or descriptions. To understand how this is done, it is necessary to read the datasheets of the SPI devices.
The SPI EEPROM chip, a CAT25C32, is used by dropping the !CS line, then writing an opcode byte followed by its parameters or results. Opcodes include WREN/WRDI for write protection, RDSR/WRSR for accessing a status register, and READ/WRITE for reading and writing bytes. A WRITE may only be performed when the external !WP pin is high and the software write protect has been disabled by opcode. A transaction begins when !CS drops low and ends when it rises high.
To identify the function which reads a byte from the 25C32, a few things can be safely assumed: (1) The function will begin by dropping some I/O pin (!CS). (2) The function will then broadcast the READ opcode, 0x03. (3) It will then broadcast a sixteen bit parameter, high byte first; that is, DPH followed by DPL. (4) Finally, it will return the result of a fourth SPIRXTX call. In pseudocode, that would be something like
u8 SPIROMPEEK(u16 ADR){
  SPIRXTX(0x03); //READ opcode
  SPIRXTX(ADRH); //address, high byte first
  SPIRXTX(ADRL);
  return SPIRXTX(0); //dummy byte clocks out the result
}
Sure enough, one of the few functions calling SPIRXTX does exactly this. The constant pushing and popping of the parameters is a quirk of the compiler, which might possibly allow it to be identified. From the code below, it is clear that P0.0 is the !CS line of the CAT25C32.
The SPIROMPOKE function looks similar, except that two transactions are performed. First the WREN (0x06) opcode is sent to enable writing, then WRITE (0x02) is used to perform the actual write.
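The two transactions of a poke can be written out as follows. This is a sketch rather than the firmware's own code; the function name is hypothetical, and the high-byte-first address order is assumed from the usual 25xx-series convention.

```python
WREN, WRITE = 0x06, 0x02  # CAT25C32 opcodes named in the text


def poke_transactions(addr, value):
    """Return the two SPI transactions for a one-byte write.

    Each element is sent within its own !CS low period.
    """
    return [
        bytes([WREN]),
        bytes([WRITE, (addr >> 8) & 0xFF, addr & 0xFF, value & 0xFF]),
    ]


for transaction in poke_transactions(0x0006, 0x29):
    print(transaction.hex())
```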
The other SPI operations concern the nRF2401 radio core, which behaves differently from the EEPROM. Rather than transactions being an opcode followed by parameters, there is only a single SPI register that must be completely written during a transaction. A second register, selected by the CE line, contains the packets.
The configuration is set by one big register, sent MSBit first. If fewer than the needed bytes are sent, the value is right-aligned into the lower bytes of the register. That is, the last byte sent is always (CHAN<<1)|RXMODE and the second to last always describes the radio configuration.
Searching around a bit yields the RADIOWRCONFIG function, the tail of which is below. It can be seen from the code that the 0x1A IRAM byte holds the channel number. That is, if 0x20 is stored at 0x1A, the radio will be configured to 2,432 MHz. The other configuration bytes reveal that the MAC addresses are 24 bits, the checksum is 16 bits, and the device broadcasts at maximum power sourced from a 16MHz crystal. (That the configured crystal is identical to the one on the board is very important. Some enterprising coders will lie to a chip about its crystal in order to access an unsupported radio frequency.)
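The channel arithmetic is simple enough to check by hand, but a small sketch makes the packing of that final configuration byte explicit. The function names are mine, not the firmware's.

```python
def channel_byte(chan, rx_mode):
    """Last configuration byte: (CHAN << 1) | RXMODE."""
    return ((chan & 0x7F) << 1) | (1 if rx_mode else 0)


def decode_channel_byte(byte):
    """Recover (channel, rx_mode) from the last configuration byte."""
    return byte >> 1, bool(byte & 1)


def channel_mhz(chan):
    """Channels count up from 2,400 MHz in 1 MHz steps."""
    return 2400 + chan


print(channel_mhz(0x20), hex(channel_byte(0x20, True)))
```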
At this point, all that remains before sniffing traffic is to find the target address to which packets are broadcast, as well as the frequency. We'll start with the address, because that's a bit easier.
The TXPACKET function involves a lot of PUSH and POP instructions, but it otherwise looks very similar to the RADIOWRCONFIG function, in that a series of bytes are written in order with repeated function calls to SPIRXTX. In pseudocode, this function becomes the following. From the radio documentation and configuration, it is clear that the first three bytes will be the target MAC address. From the RADIOWRCONFIG() function, it is equally clear that the three bytes at 0x1B are the receiving MAC address of the unit. (The parameter of the function happens to be the button press, as can be determined by tracking the keyboard I/O routines or viewing a few packets.)
void TXPACKET(u8 button){
  RADIOHOP(); //set channel
  //Target MAC address
  SPIRXTX(&0x1E);
  SPIRXTX(&0x1F);
  SPIRXTX(&0x20);
  //Source MAC address
  SPIRXTX(&0x1B);
  SPIRXTX(&0x1C);
  SPIRXTX(&0x1D);
  //Data value
  SPIRXTX(button);
}
The radio itself will append a 16-bit CRC; therefore, the full packet then becomes {u24 tmac, u24 smac, u8 button, u16 crc}.
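The seven bytes that the firmware hands to the radio can be modeled as below. This sketch covers only the payload written over SPI; the CRC is left to the radio and is not computed here. The function names are hypothetical, and the MAC values in the example are the ones quoted elsewhere in this article.

```python
def build_payload(tmac, smac, button):
    """{u24 tmac, u24 smac, u8 button}, most significant byte first."""
    return tmac.to_bytes(3, "big") + smac.to_bytes(3, "big") + bytes([button])


def parse_payload(payload):
    """Split a seven-byte payload back into its three fields."""
    if len(payload) != 7:
        raise ValueError("expected 7 bytes, got %d" % len(payload))
    return {
        "tmac": int.from_bytes(payload[0:3], "big"),
        "smac": int.from_bytes(payload[3:6], "big"),
        "button": chr(payload[6]),  # the keypress is sent as ASCII
    }


packet = build_payload(0x123456, 0x15791B, ord("5"))
print(packet.hex(), parse_payload(packet))
```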
To determine the value of the target MAC address, just grep the disassembly for "mov" and one of 0x1E, 0x1F, 0x20. The relevant instructions are as follows, setting the target MAC address to 0x123456. (In 8051 notation, the first instruction moves the immediate constant #0x12 into byte 0x1E of IRAM.)
mov 0x1E, #0x12
mov 0x1F, #0x34
mov 0x20, #0x56
At this point, it would be possible to scan each channel for a few seconds, listening for packets sent to that address, but it's classier to find the value by static analysis. Acting on the hunch that the configuration is held in EEPROM and looking for references to the SPIROMPEEK() function, the READIDFREQ() function can be found. As can be seen in the fragment below, EEPROM[6] holds the channel number while the MAC address is at EEPROM[3,4,5].
As the EEPROM begins with "0b 07 0b 15 79 1b 29", it's clear that the MAC address of the unit from which it came is 0x15791B and that it is broadcasting on 2400+0x29=2441MHz. This can be double-checked by the serial number "15791B" being printed on the label.
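The same lookup can be done on a dump without opening the disassembler. A minimal sketch, with a function name of my own choosing:

```python
def read_id_freq(eeprom):
    """Return (MAC as printed on the label, frequency in MHz)."""
    mac = eeprom[3:6].hex().upper()  # EEPROM[3,4,5]
    channel = eeprom[6]              # EEPROM[6]
    return mac, 2400 + channel


print(read_id_freq(bytes.fromhex("0b070b15791b29")))
```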
Implementation
Knowing the modulation scheme, target address, and packet contents, it becomes possible to sniff traffic from a Clicker. This is performed by use of the GoodFET firmware on a Next Hope badge, my prior tutorial for which describes the process of packet sniffing.
The NHBadge board contains an nRF24L01+ radio, which differs dramatically from the nRF2401 in terms of how it is configured. Still, the radios are sufficiently compatible. The following hack of the goodfet.nrf client allows packets to be sniffed from the air with proper checksumming.
Sure enough, here are some packets of the 5 button being pressed on unit 1F8760. The keypress is the final byte in ASCII.
Response Codes
Now that it is clear how to receive and recognize button presses, it becomes necessary to reverse engineer the response codes which might be sent from the access point. Without hearing a reply of at least an ACK, the Clicker will continue to broadcast each message more than three hundred times. This takes more than ten seconds, during which all other key presses are ignored.
The broadcast loop within the MAIN() function would look a little like this in C.
for(count=0; count<MAXCOUNT && !reply; count++){
  TXPACKET(button);
  reply=RADIORX();
}
switch(reply){...}
This region is easy enough to find, but there's another command mode. An easier target is the channel hopping routine, which constantly broadcasts 0x3F while incrementing the channel, sticking with the last one on which a reply of 0x18 was received. Channels 1 through 83 are attempted; that is, 2,401 MHz to 2,483MHz at 1MHz steps.
Checking this code within the MAIN() function reveals that its effect is to blink the green LED (P1.1) six times, exiting the broadcast loop. Other commands include 0x04 (LED Off), 0x06 (LED Green), 0x15 (LED Red), 0x11 (Blink Green), 0x14 (Blink Red), and 0x18 (Blink Green, Channel Lock). All undefined opcodes set the red LED.
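When building a base station, it helps to have the response codes and the retry loop in one place. The following is a sketch under a few assumptions: the `tx` and `rx` callables are hypothetical stand-ins for the radio routines, a reply of zero is taken to mean that nothing was heard, and the retry limit of 300 is only a guess at the firmware's MAXCOUNT.

```python
RESPONSES = {
    0x04: "LED off",
    0x06: "LED green",
    0x15: "LED red",
    0x11: "blink green",
    0x14: "blink red",
    0x18: "blink green, channel lock",
}


def describe_response(code):
    """All undefined opcodes set the red LED."""
    return RESPONSES.get(code, "LED red (undefined opcode)")


def broadcast(button, tx, rx, max_count=300):
    """Retry until a nonzero reply arrives or the budget runs out."""
    for count in range(max_count):
        tx(button)
        reply = rx()
        if reply:
            return reply, count + 1
    return 0, max_count


# Pretend access point that answers on the third attempt.
replies = iter([0, 0, 0x06])
reply, tries = broadcast(ord("5"), lambda b: None, lambda: next(replies))
print(tries, describe_response(reply))
```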
Conclusions
By sniffing traffic within a classroom, it is possible to watch votes as they are being cast by students. Similarly, packets could be broadcast by a reprogrammed Clicker or NHBadge to make a student in virtual attendance, automatically voting with the majority so as to gain perfect attendance and a solid C quiz average. Where instant feedback is available, this might even allow for a solid A quiz average. Without taking advantage of the masked-ROM option of the nRF24E1, the code cannot be even slightly protected from extraction and reverse engineering.
Less adventurous users can jam the network by running 'goodfet.nrf carrier 2441000000' to hold a carrier wave on the channel. The only attempt at a frequency change is made when pressing the GO button, at which point the new channel can be discovered and similarly jammed.
Since performing this work, it has come to my attention that a USRP plugin for doing this to the competing 900MHz iClicker product is available as http://gr-clicker.sourceforge.net/. Additionally, the infrared Clicker units were broken with a little tool called Survey Says. I have ordered more sophisticated Clicker models from CPS and Turning Point, and proper descriptions of them will soon follow.
Thursday, June 10, 2010
Hacking the Next Hope Badge
by Travis Goodspeed <travis at radiantmachines.com>
In just less than a month, the Next Hope conference will bring a few thousand neighbors to Manhattan's Hotel Pennsylvania to share all sorts of neighborly ideas. The following are some notes that will help enterprising neighbors to hack these badges, which will be running an MSP430 port of the OpenBeacon firmware. These badges are active RFID tags which beacon the position of each attendee a few times a second, so that the god damned devil army of lies--by which I mean the Next Hope badge committee--can track each attendee around the Hotel Pennsylvania. A second part will continue just before the conference begins, but I hope that this will provide sufficient food for thought.
See http://amd.hope.net/ for a nice little video explaining the purpose of the project as a whole. Those who do not wish to broadcast their positions can remove batteries or reprogram them, but to be thorough, they should turn off their cellular phones as well.
A public HTTP API for querying the badge database is defined in the OpenAMD API Manual, and a server should be available for beta testing before the conference begins. For example, to find the location of user 31337, the client will fetch /api/location?user=31337 then look at the X, Y, and Location fields to determine the user's position. As for this article, I will stick to the badge hardware, its design, and all sorts of neighborly and malicious things that may be done with it or to it. Little mention will be made of the higher levels of the stack, as those are not my specialty.
Also, in order to keep things fun, I reserve the right to lie about any and all technical details of the badge, its operation, or its security mechanisms. This document is by no means complete, and there are still plenty of secrets to find.
The badges themselves are built from an MSP430F2618 or MSP430F2418 microcontroller, as well as an NRF24L01+ 2.4GHz radio. The MSP430 chips were kindly donated by Texas Instruments, and replace the energy guzzling PIC chips of yesteryear. Further, they've got a great C compiler and bootloader, so you cannot brick them no matter how bone-headed your replacement firmware might become.
The back of the badge consists of just a battery clip, but it can optionally be populated with an SSOP28 FT232RL USB to Serial adapter and a mini-USB plug. This USB port then allows for replacement firmware to be loaded, as well as a high-speed serial link to that firmware. Kits will be available containing the parts necessary for this, though you should expect them to sell out quickly.
For the schematic, grab the full-res version of this image.
The layout, for the moment, is private, but you can expect it to be similar to the following cropped image of the prototype. Badges will come in Blue for attendees, Green for speakers, and Red for goons, with prototypes having a white silkscreen.
When the badges have been flashed with the GoodFET firmware, a prebuilt radio client is available in the form of goodfet.nrf. For example, running 'goodfet.nrf sniffob' will sniff the OpenBeacon protocol, while 'goodfet.nrf carrier 2479000000' will place a carrier wave at 2.479 GHz, jamming any carrier-sensing radios on that frequency.
The full Python scripting environment of the GoodFET is available to the NRF port, so it is easy to script this usage. If you'd like to broadcast Morse code by turning the carrier on and off, you can do so without ever touching a C compiler. The same goes for frequency hopping or packet sniffing, although some architectural limitations of the NRF24L01+ make sniffing difficult without knowing the first three bytes of the destination MAC address to be sniffed.
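As a sketch of the Morse idea, the following turns a message into a carrier on/off schedule using the standard timing of one unit per dot, three per dash, one between marks, and three between letters. The `carrier_on` and `carrier_off` callables are hypothetical placeholders, not the GoodFET client's actual methods, and the Morse table is deliberately partial.

```python
import time

# Partial table, just enough for a demonstration.
MORSE = {"E": ".", "H": "....", "O": "---", "P": ".--.", "S": "..."}


def keying(text):
    """Return a list of (carrier_on, units) pairs for one word."""
    events = []
    for i, letter in enumerate(text.upper()):
        if i:
            events.append((False, 3))      # gap between letters
        for j, mark in enumerate(MORSE[letter]):
            if j:
                events.append((False, 1))  # gap inside a letter
            events.append((True, 1 if mark == "." else 3))
    return events


def transmit(events, carrier_on, carrier_off, unit=0.1, sleep=time.sleep):
    """Walk the schedule, leaving the carrier off at the end."""
    for on, units in events:
        (carrier_on if on else carrier_off)()
        sleep(units * unit)
    carrier_off()


print(keying("HOPE"))
```

Spaces between words are not handled; a seven-unit gap would be the conventional choice.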
This code is already committed to the mainstream GoodFET repository. Here it is running on Windows XP, sniffing encrypted traffic from a Last Hope badge.
The exact radio channel and addresses will be a surprise for the conference, but there's little harm in my showing you how to extract them from a running firmware image.
Start with a virgin badge; that is, one which has not been reflashed. Solder on a USB chip and connector, then perform the following without disconnecting it from power. The idea here is to replace the existing MSP430 firmware, then extract the contents of the radio's RAM, which is not damaged by a reflashing of the MSP430. It might help to have already tested an installation of the GoodFET client, as rebooting the host PC might cause the radio to lose its settings.
First, reflash the badge with the goodfet firmware by running 'goodfet.bsl --fromweb' in Unix or 'gfbsl.exe -e -p goodfet.hex' in Windows. Once this is complete, you can dump the radio settings by 'goodfet.nrf info'. For a Last Hope badge which was dumped in a similar manner, the results were as follows.
You can get more detail by dumping all registers,
Sniffing traffic with the GoodFET firmware requires only that these same registers be loaded back into the NRF chip, and also that the MAC addresses be swapped. That is, to sniff Last Hope badge traffic, you will want RX_ADDR_P0 to be 0x0102030201 rather than 0x424541434f.
To perform this on a badge as prepared above, simply run the following.
Concerning cryptography, the badges will have it, but that oughtn't stop you from having some fun. Badges are XXTEA encrypted, with keys being published after each conference. A list of old encryption keys can be found at http://wiki.openbeacon.org/wiki/EncryptionKeys, with the key for the above packets being {0x9c43725e,0xad8ec2ab,0x6ebad8db,0xf29c3638}. Example decryption routines are plentiful within the OpenBeacon source repository.
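For reference, here is a sketch of the textbook XXTEA block cipher in Python, operating on lists of 32-bit words. It is not taken from the OpenBeacon repository, and it makes no claim about how the firmware packs bytes into words or orders them; check the OpenBeacon sources before expecting it to decrypt real packets. The key is the one quoted above.

```python
DELTA = 0x9E3779B9
MASK = 0xFFFFFFFF

# Key for the Last Hope packets, as quoted in the text.
KEY = (0x9C43725E, 0xAD8EC2AB, 0x6EBAD8DB, 0xF29C3638)


def _mx(y, z, total, key, p, e):
    """The XXTEA mixing function, truncated to 32 bits."""
    return ((((z >> 5) ^ (y << 2)) + ((y >> 3) ^ (z << 4)))
            ^ ((total ^ y) + (key[(p & 3) ^ e] ^ z))) & MASK


def xxtea_encrypt(words, key=KEY):
    """Encrypt a block of two or more 32-bit words."""
    v = list(words)
    n = len(v)
    total = 0
    z = v[n - 1]
    for _ in range(6 + 52 // n):
        total = (total + DELTA) & MASK
        e = (total >> 2) & 3
        for p in range(n):
            y = v[(p + 1) % n]
            v[p] = (v[p] + _mx(y, z, total, key, p, e)) & MASK
            z = v[p]
    return v


def xxtea_decrypt(words, key=KEY):
    """Invert xxtea_encrypt by running the rounds backwards."""
    v = list(words)
    n = len(v)
    rounds = 6 + 52 // n
    total = (rounds * DELTA) & MASK
    y = v[0]
    for _ in range(rounds):
        e = (total >> 2) & 3
        for p in range(n - 1, -1, -1):
            z = v[(p - 1) % n]
            v[p] = (v[p] - _mx(y, z, total, key, p, e)) & MASK
            y = v[p]
        total = (total - DELTA) & MASK
    return v


block = [1, 2, 3, 4]  # a 16-byte packet is four words
assert xxtea_decrypt(xxtea_encrypt(block)) == block
```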
If the badges used more sophisticated radio chips, such as the SPI 802.15.4 chips, an AES128 key would be present within the radio to be read out. Since this isn't the case, the key stays within Flash or RAM of the MSP430 microcontroller.
Flash extraction requires that an attacker gain access to memory through the serial bootstrap loader (BSL) or through JTAG. The BSL is protected by a password, one which might or might not be vulnerable to my timing attack. The timing attack is impossible to perform through the FTDI chip, so a second microcontroller must be wired up to the 0.1" serial port header.
Additionally, it might be possible to extract the contents of Flash through the JTAG port through which these chips are initially programmed. Unlike the bootloader, there is no password protection, but rather a security fuse which is blown to disable future access.
RAM is not protected by the serial bootstrap loader, and you can extract it by first erasing the chip by 'goodfet.bsl -e', then dumping the contents of RAM to disk for later perusal. Somewhere in this mess will be a copy of the XXTEA key, unless I took the time to ensure that it is not copied into RAM at startup.
Once decrypted, packets look like the following.
Fields in order are Size, Protocol, Flags, Strength, Sequence, Serial Number, Reserved, and a CRC16 checksum. The 8-bit Flags field indicates the bits of a capacitive multi-touch sensor, while the Reserved field has been co-opted for a secret project of mine. The Serial Number field is the badge's source ID, which can be found on a sticker on the badge.
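A decrypted packet can be unpacked with the struct module. Only the field order and the 8-bit Flags width come from the description above; the other widths (chosen so that the packet fills one 16-byte block) and the big-endian byte order are my assumptions and should be checked against the OpenBeacon sources. The example values are made up.

```python
import struct
from collections import namedtuple

# Assumed layout: four u8, two u32, two u16, big-endian, 16 bytes total.
LAYOUT = struct.Struct(">BBBBIIHH")

Packet = namedtuple(
    "Packet",
    "size protocol flags strength sequence serial reserved crc16",
)


def parse_packet(raw):
    """Unpack one decrypted 16-byte packet into named fields."""
    if len(raw) != LAYOUT.size:
        raise ValueError("expected %d bytes" % LAYOUT.size)
    return Packet(*LAYOUT.unpack(raw))


# Made-up example values, not a captured packet.
raw = LAYOUT.pack(16, 23, 0x00, 0xFF, 1234, 31337, 0, 0xBEEF)
print(parse_packet(raw))
```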
The sequence number is incremented for each packet while the last few bits of it determine the broadcast strength. In this example, the badge is rather far from the reader, so all 00 and 55 strength packets are lost, while some AA and FF packets get through. FF being stronger, more of its packets go through. Comparing packet loss rates allows the aggregation server to determine badge positions.
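The loss-rate comparison can be sketched as a tally of received packets per strength level. This is not the aggregation server's actual algorithm, which I won't describe here; it only shows the raw statistic from which a distance estimate would be made, and it assumes the number of packets sent at each level is known.

```python
from collections import Counter

LEVELS = (0x00, 0x55, 0xAA, 0xFF)  # weakest to strongest


def reception_rates(received_strengths, sent_per_level):
    """Fraction of packets heard at each broadcast strength."""
    counts = Counter(received_strengths)
    return {level: counts[level] / sent_per_level for level in LEVELS}


def weakest_heard(rates):
    """Lowest strength that got through at all, or None."""
    heard = [level for level in LEVELS if rates[level] > 0]
    return heard[0] if heard else None


# A distant badge: nothing at 00 or 55, a little at AA, more at FF.
rates = reception_rates([0xAA, 0xFF, 0xFF, 0xFF], sent_per_level=4)
print(rates, hex(weakest_heard(rates)))
```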
The received signal strength, RSSI, of each packet is not used in this calculation because it is rather primitive in the NRF24L01+, being only a single bit wide. Competing chips, such as the CC2420, would allow this but only at a significant increase in unit cost.
To facilitate badge hacking, there are four breakout headers, all of which have standard 0.1" spacing.
The first is a 14-pin MSP430 JTAG connector, the same used by the MSP430 FET UIF and the GoodFET.
The second is a 6-pin BSL header, in the style of FTDI breakout boards from Sparkfun. This has not been tested, and you had damned well better only use it at 3.3 volts.
The third is a 7-pin breakout connector for the NRF24L01+ radio and SPI bus. You might use it to add an additional SPI chip, such as a second radio or an LCD. The pins are (1) GND, (2) CE, (3) !CS, (4) SCK, (5) MOSI, (6) MISO, and (7) !IRQ. The !IRQ signal is only asserted by the NRF when configured to do so, so it might be coopted to act as a !CS pin for a second SPI device.
The fourth and final header is an 8-pin breakout of Port 3 of the MSP430 microcontroller. By default, it is used as a capacitive touch sensor, but it might also be used for any sort of I/O expansion. In addition to GPIO, some hardware accelerated ports are available on this pins. I'll leave you to the datasheet to figure them out.
The ANT wireless protocol can be implemented with the NRF24L01+ of this badge, though technical details on exactly how to do it are rather difficult to find. I'd be much obliged if some neighbors brought ANT equipment to the conference for the rest of us to play with.
Sparkfun offers a number of NRF24L01 modules, my favorite of which is a Key Fob Remote.
These details should help you to spend more time hacking, and less time researching, during the conference. A special prize will be given for the most original badge modification, with heavy credit going toward those of high technical caliber. There are also some secrets to be found within the badges, so best to bring b
I will be presenting this and a several more tricks as a lecture during the conference, entitled "Building and Breaking the Next Hope Badge" at 22h00 on Saturday, July 17th in the Tesla room. There will also be a panel presentation entitled "The OpenAMD Project" at 18h00 on Friday, July 16th in the Lovelace room. Both rooms are on the 18th floor.
In just less than a month, the Next Hope conference will bring a few thousand neighbors to Manhattan's Hotel Pennsylvania to share all sorts of neighborly ideas. The following are some notes that will help enterprising neighbors to hack these badges, which will be running an MSP430 port of the OpenBeacon firmware. These badges are active RFID tags which beacon the position of each attendee a few times a second, so that the god damned devil army of lies--by which I mean the Next Hope badge committee--can track each attendee around the Hotel Pennsylvania. A second part will continue just before the conference begins, but I hope that this will provide sufficient food for thought.
See http://amd.hope.net/ for a nice little video explaining the purpose of the project as a whole. Those who do not wish to broadcast their positions can remove batteries or reprogram them, but to be thorough, they should turn off their cellular phones as well.
A public HTTP API for querying the badge database is defined in the OpenAMD API Manual, and a server should be available for beta testing before the conference begins. For example, to find the location of user 31337, the client will fetch /api/location?user=31337 then look at the X, Y, and Location fields to determine the user's position. As for this article, I will stick to the badge hardware, its design, and all sorts of neighborly and malicious things that may be done with it or to it. Little mention will be made of the higher levels of the stack, as those are not my specialty.
Also, in order to keep things fun, I reserve the right to lie about any and all technical details of the badge, its operation, or its security mechanisms. This document is by no means complete, and there are still plenty of secrets to find.
Badge Hardware, Usage
The badges themselves are built from an MSP430F2618 or MSP430F2418 microcontroller, as well as an NRF24L01+ 2.4GHz radio. The MSP430 chips were kindly donated by Texas Instruments, and replace the energy guzzling PIC chips of yesteryear. Further, they've got a great C compiler and bootloader, so you cannot brick them no matter how bone-headed your replacement firmware might become.
The back of the badge consists of just a battery clip, but it can optionally be populated with an SSOP28 FT232RL USB to Serial adapter and a mini-USB plug. This USB port then allows for replacement firmware to be loaded, as well as a high-speed serial link to that firmware. Kits will be available containing the parts necessary for this, though you should expect them to sell out quickly.
For the schematic, grab the full-res version of this image.
The layout, for the moment, is private, but you can expect it to be similar to the following cropped image of the prototype. Badges will come in Blue for attendees, Green for speakers, and Red for goons, with prototypes having a white silkscreen.
GoodFET Firmware
When the badges have been flashed with the GoodFET firmware, a prebuilt radio client is available in the form of goodfet.nrf. For example, running 'goodfet.nrf sniffob' will sniff the OpenBeacon protocol, while 'goodfet.nrf carrier 2479000000' will place a carrier wave at 2.479 GHz, jamming any carrier-sensing radios on that frequency.
The full Python scripting environment of the GoodFET is available to the NRF port, so it is easy to script this usage. If you'd like to broadcast Morse code by turning the carrier on and off, you can do so without ever touching a C compiler. The same goes for frequency hopping or packet sniffing, although some architectural limitations of the NRF24L01+ make sniffing difficult without knowing the first three bytes of the destination MAC address to be sniffed.
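As a rough sketch of the Morse idea, the following Python computes the on/off keying schedule for a message and walks through it. The carrier_on and carrier_off arguments are hypothetical callbacks, not part of the GoodFET client; you would fill them in with whatever your own script uses to start and stop the carrier. Timing follows the usual convention of one unit for a dot, three for a dash, and gaps of one, three, and seven units.

```python
import time

# International Morse code for letters and digits.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
}


def morse_schedule(text):
    """Return a list of (carrier_on, units) steps for keying `text`.

    Characters outside A-Z and 0-9 raise KeyError.
    """
    steps = []
    for word_index, word in enumerate(text.upper().split()):
        if word_index:
            steps.append((False, 7))          # gap between words
        for letter_index, letter in enumerate(word):
            if letter_index:
                steps.append((False, 3))      # gap between letters
            for symbol_index, symbol in enumerate(MORSE[letter]):
                if symbol_index:
                    steps.append((False, 1))  # gap within a letter
                steps.append((True, 1 if symbol == "." else 3))
    return steps


def key_carrier(text, carrier_on, carrier_off, unit=0.1, sleep=time.sleep):
    """Key a carrier with `text`; both callbacks are supplied by the caller."""
    for on, units in morse_schedule(text):
        (carrier_on if on else carrier_off)()
        sleep(units * unit)
    carrier_off()  # always finish with the carrier off
```

Keeping the schedule separate from the radio calls means the timing can be checked on a desk without a badge attached.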
This code is already committed to the mainstream GoodFET repository. Here it is running on Windows XP, sniffing encrypted traffic from a Last Hope badge.
Radio Configuration Extraction
The exact radio channel and addresses will be a surprise for the conference, but there's little harm in my showing you how to extract them from a running firmware image.
Start with a virgin badge; that is, one which has not been reflashed. Solder on a USB chip and connector, then perform the following without disconnecting it from power. The idea here is to replace the existing MSP430 firmware, then extract the contents of the radio's RAM, which is not damaged by a reflashing of the MSP430. It might help to have already tested an installation of the GoodFET client, as rebooting the host PC might cause the radio to lose its settings.
First, reflash the badge with the goodfet firmware by running 'goodfet.bsl --fromweb' in Unix or 'gfbsl.exe -e -p goodfet.hex' in Windows. Once this is complete, you can dump the radio settings by 'goodfet.nrf info'. For a Last Hope badge which was dumped in a similar manner, the results were as follows.
air-2% goodfet.nrf info
Encoding GFSK
Freq 2481 MHz
Rate 2000 kbps
PacketLen 16 bytes
MacLen 5 bytes
SMAC 0x424541434f
TMAC 0x0102030201
air-2%
You can get more detail by dumping all registers,
air-2% goodfet.nrf regs
r[0x00]=0x000000000a // CONFIG
r[0x01]=0x0000000000 // EN_AA
r[0x02]=0x0000000001 // EN_RXADDR
r[0x03]=0x0000000003 // SETUP_AW
r[0x04]=0x0000000003 // SETUP_RET
r[0x05]=0x0000000051 // RF_CH
r[0x06]=0x000000000f // RF_SETUP
r[0x07]=0x000000002e // STATUS
r[0x08]=0x0000000000 // OBSERVE_TX
r[0x09]=0x0000000000 // RPD
r[0x0a]=0x424541434f // RX_ADDR_P0
r[0x0b]=0xc2c2c2c2c2 // RX_ADDR_P1
r[0x0c]=0x00000000c3 // RX_ADDR_P2
r[0x0d]=0x00000000c4 // RX_ADDR_P3
r[0x0e]=0x00000000c5 // RX_ADDR_P4
r[0x0f]=0x00000000c6 // RX_ADDR_P5
r[0x10]=0x0102030201 // TX_ADDR
r[0x11]=0x0000000010 // RX_PW_P0
r[0x12]=0x0000000000 // RX_PW_P1
r[0x13]=0x0000000000 // RX_PW_P2
r[0x14]=0x0000000000 // RX_PW_P3
r[0x15]=0x0000000000 // RX_PW_P4
r[0x16]=0x0000000000 // RX_PW_P5
r[0x17]=0x0000000011 // FIFO_STATUS
r[0x18]=0x0000000000 // ?
r[0x19]=0x0000000000 // ?
r[0x1a]=0x0000000000 // ?
r[0x1b]=0x0000000000 // DYNPD
r[0x1c]=0x0000000000 // ?
r[0x1d]=0x0000000000 // ?
r[0x1e]=0x0000000000 // ?
r[0x1f]=0x0000000000 // ?
air-2%
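The summary printed by 'goodfet.nrf info' can be recovered from the raw register dump by hand. The sketch below is not the GoodFET client's own code; it parses a few lines of the dump above and applies the NRF24L01+ register layout: the channel is 2400 MHz plus RF_CH, the address width is SETUP_AW plus two bytes, and the data rate comes from the RF_DR_LOW and RF_DR_HIGH bits of RF_SETUP.

```python
import re

# A few lines copied from the register dump above.
SAMPLE = """
r[0x03]=0x0000000003 // SETUP_AW
r[0x05]=0x0000000051 // RF_CH
r[0x06]=0x000000000f // RF_SETUP
r[0x0a]=0x424541434f // RX_ADDR_P0
r[0x10]=0x0102030201 // TX_ADDR
r[0x11]=0x0000000010 // RX_PW_P0
"""

REG_LINE = re.compile(r"r\[0x([0-9a-f]{2})\]=0x([0-9a-f]+)", re.IGNORECASE)


def parse_regs(dump):
    """Map register number -> integer value from a register dump."""
    return {int(reg, 16): int(val, 16) for reg, val in REG_LINE.findall(dump)}


def summarize(regs):
    """Derive the human-readable radio settings from raw register values."""
    rf_setup = regs[0x06]
    if rf_setup & 0x20:      # RF_DR_LOW set: 250 kbps
        rate = 250
    elif rf_setup & 0x08:    # RF_DR_HIGH set: 2 Mbps
        rate = 2000
    else:                    # neither set: 1 Mbps
        rate = 1000
    return {
        "freq_mhz": 2400 + (regs[0x05] & 0x7F),  # RF_CH is 7 bits wide
        "rate_kbps": rate,
        "mac_len": (regs[0x03] & 0x03) + 2,      # 1, 2, 3 -> 3, 4, 5 bytes
        "packet_len": regs[0x11] & 0x3F,         # payload width of pipe 0
        "rx_addr": regs[0x0A],
        "tx_addr": regs[0x10],
    }


if __name__ == "__main__":
    for name, value in summarize(parse_regs(SAMPLE)).items():
        print(name, hex(value) if name.endswith("addr") else value)
```

For the Last Hope dump this works out to 2481 MHz, 2000 kbps, 5-byte addresses, and 16-byte packets, agreeing with the 'info' output.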
Sniffing traffic with the GoodFET firmware requires only that these same registers be loaded back into the NRF chip, and also that the MAC addresses be swapped. That is, to sniff Last Hope badge traffic, you will want RX_ADDR_P0 to be 0x0102030201 rather than 0x424541434f.
To perform this on a badge as prepared above, simply run the following.
air-2% goodfet.nrf poke 0x0a 0x0102030201
Poking 0a to become 0102030201.
Poked to 102030201
air-2% goodfet.nrf sniff
Listening as 0102030201 on 2481 MHz
dd 8f 4a 5b ff 7c bb 76 09 42 a6 ec 61 6f 9a db
90 bb 2f cd 06 81 e9 36 20 9c 4c 23 b3 10 6c c7
37 7a 37 c5 93 57 2b 24 6a 9d 9a 8b 3c 52 1c 23
56 a8 04 f5 a7 ed 26 0b 24 ec 39 9d 10 fb da 76
ba b5 d0 5c 89 4d 1c 63 19 28 a1 9d 35 e6 7f a5
ec 63 5f 60 b8 0f 1c bf 4c e6 af 93 c2 fe 93 ee
ad fc a1 25 42 81 7a a1 28 a8 f5 21 4a 7a 55 af
79 42 5c 6d 38 ca 46 ab 1b 8c ab 90 ad 47 90 d1
f6 9a 22 0d e4 37 19 b7 75 34 8d 4f f9 9c fd 2a
^C
air-2%
Key Extraction
Concerning cryptography, the badges will have it, but that oughtn't stop you from having some fun. Badges are XXTEA encrypted, with keys being published after each conference. A list of old encryption keys can be found at http://wiki.openbeacon.org/wiki/EncryptionKeys, with the key for the above packets being {0x9c43725e,0xad8ec2ab,0x6ebad8db,0xf29c3638}. Example decryption routines are plentiful within the OpenBeacon source repository.
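For those who would rather not dig through the repository, here is a minimal Python sketch of the textbook XXTEA block cipher as published by Wheeler and Needham, using the key quoted above. It is not taken from the OpenBeacon source. The big-endian word order in decrypt_packet is an assumption of mine, and OpenBeacon's own routine may order bytes differently, so treat the repository as the authority if the output looks like noise.

```python
import struct

DELTA = 0x9E3779B9
MASK = 0xFFFFFFFF

# Key quoted in the text for the Last Hope packets.
LAST_HOPE_KEY = (0x9C43725E, 0xAD8EC2AB, 0x6EBAD8DB, 0xF29C3638)


def _mx(y, z, total, key, p, e):
    """The XXTEA mixing function, truncated to 32 bits."""
    return ((((z >> 5) ^ (y << 2)) + ((y >> 3) ^ (z << 4)))
            ^ ((total ^ y) + (key[(p & 3) ^ e] ^ z))) & MASK


def xxtea_encrypt(words, key):
    """Encrypt a sequence of two or more 32-bit words; returns a new list."""
    v = list(words)
    n = len(v)
    total = 0
    z = v[n - 1]
    for _ in range(6 + 52 // n):
        total = (total + DELTA) & MASK
        e = (total >> 2) & 3
        for p in range(n - 1):
            y = v[p + 1]
            z = v[p] = (v[p] + _mx(y, z, total, key, p, e)) & MASK
        y = v[0]
        z = v[n - 1] = (v[n - 1] + _mx(y, z, total, key, n - 1, e)) & MASK
    return v


def xxtea_decrypt(words, key):
    """Invert xxtea_encrypt for a sequence of two or more 32-bit words."""
    v = list(words)
    n = len(v)
    rounds = 6 + 52 // n
    total = (rounds * DELTA) & MASK
    y = v[0]
    for _ in range(rounds):
        e = (total >> 2) & 3
        for p in range(n - 1, 0, -1):
            z = v[p - 1]
            y = v[p] = (v[p] - _mx(y, z, total, key, p, e)) & MASK
        z = v[n - 1]
        y = v[0] = (v[0] - _mx(y, z, total, key, 0, e)) & MASK
        total = (total - DELTA) & MASK
    return v


def decrypt_packet(packet, key=LAST_HOPE_KEY, fmt=">4I"):
    """Decrypt one 16-byte packet. The word byte order is an assumption."""
    words = struct.unpack(fmt, packet)
    return struct.pack(fmt, *xxtea_decrypt(words, key))
```

A 16-byte packet is four words, so the cipher runs 6 + 52/4 = 19 rounds over the whole packet as a single block.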
If the badges used more sophisticated radio chips, such as the SPI 802.15.4 chips, an AES128 key would be present within the radio to be read out. Since this isn't the case, the key stays within Flash or RAM of the MSP430 microcontroller.
Flash extraction requires that an attacker gain access to memory through the serial bootstrap loader (BSL) or through JTAG. The BSL is protected by a password, one which might or might not be vulnerable to my timing attack. The timing attack is impossible to perform through the FTDI chip, so a second microcontroller must be wired up to the 0.1" serial port header.
Additionally, it might be possible to extract the contents of Flash through the JTAG port through which these chips are initially programmed. Unlike the bootloader, there is no password protection, but rather a security fuse which is blown to disable future access.
RAM is not protected by the serial bootstrap loader, and you can extract it by first erasing the chip by 'goodfet.bsl -e', then dumping the contents of RAM to disk for later perusal. Somewhere in this mess will be a copy of the XXTEA key, unless I took the time to ensure that it is not copied into RAM at startup.
Packet Format
Once decrypted, packets look like the following.
SZ PR FL ST SEQUENCENUM SOURCEIDXXX RESVD CRC16
10 17 00 ff 00 0a 6f a7 ff ff ff ff 00 00 e7 53
10 17 00 ff 00 0a 6f ab ff ff ff ff 00 00 b5 38
10 17 00 aa 00 0a 6f ae ff ff ff ff 00 00 54 79
10 17 00 ff 00 0a 6f b3 ff ff ff ff 00 00 11 ee
10 17 00 ff 00 0a 6f b7 ff ff ff ff 00 00 d0 28
10 17 00 aa 00 0a 6f ba ff ff ff ff 00 00 a2 c4
Fields in order are Size, Protocol, Flags, Strength, Sequence, Serial Number, Reserved, and a CRC16 checksum. The 8-bit Flags field indicates the bits of a capacitive multi-touch sensor, while the Reserved field has been co-opted for a secret project of mine. The Source ID is the badge's serial number, which can be found on a sticker that is present on the badge.
The sequence number is incremented for each packet while the last few bits of it determine the broadcast strength. In this example, the badge is rather far from the reader, so all 00 and 55 strength packets are lost, while some AA and FF packets get through. FF being stronger, more of its packets go through. Comparing packet loss rates allows the aggregation server to determine badge positions.
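The layout above can be unpacked in a few lines of Python. This is only a sketch based on the six sample packets: the multi-byte fields are read as big-endian because the sequence numbers in the samples count upward that way, and the rule that the strength byte equals the low two bits of the sequence number times 0x55 is my own inference from those samples rather than a documented formula. The CRC16 is carried along but not verified.

```python
import struct
from collections import Counter, namedtuple

Packet = namedtuple(
    "Packet", "size protocol flags strength sequence source_id reserved crc16")

# The six decrypted sample packets from the listing above.
SAMPLES = [
    "10 17 00 ff 00 0a 6f a7 ff ff ff ff 00 00 e7 53",
    "10 17 00 ff 00 0a 6f ab ff ff ff ff 00 00 b5 38",
    "10 17 00 aa 00 0a 6f ae ff ff ff ff 00 00 54 79",
    "10 17 00 ff 00 0a 6f b3 ff ff ff ff 00 00 11 ee",
    "10 17 00 ff 00 0a 6f b7 ff ff ff ff 00 00 d0 28",
    "10 17 00 aa 00 0a 6f ba ff ff ff ff 00 00 a2 c4",
]


def parse_packet(hex_text):
    """Split one decrypted 16-byte packet into its named fields."""
    raw = bytes.fromhex(hex_text)
    # Four 1-byte fields, two 32-bit fields, two 16-bit fields, big-endian.
    return Packet(*struct.unpack(">BBBBIIHH", raw))


def expected_strength(sequence):
    """Strength level inferred from the low two bits of the sequence number."""
    return (sequence & 3) * 0x55


if __name__ == "__main__":
    packets = [parse_packet(line) for line in SAMPLES]
    for p in packets:
        print(hex(p.sequence), hex(p.strength), hex(expected_strength(p.sequence)))
    # Counting received packets per strength level is the raw material for
    # the loss-rate comparison described above.
    print(Counter(hex(p.strength) for p in packets))
```

In these samples four packets arrived at strength FF and two at AA, with none at 55 or 00, which is the pattern the paragraph above describes for a distant badge.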
The received signal strength, RSSI, of each packet is not used in this calculation because it is rather primitive in the NRF24L01+, being only a single bit wide. Competing chips, such as the CC2420, would allow this but only at a significant increase in unit cost.
Breakouts
To facilitate badge hacking, there are four breakout headers, all of which have standard 0.1" spacing.
The first is a 14-pin MSP430 JTAG connector, the same used by the MSP430 FET UIF and the GoodFET.
The second is a 6-pin BSL header, in the style of FTDI breakout boards from Sparkfun. This has not been tested, and you had damned well better only use it at 3.3 volts.
The third is a 7-pin breakout connector for the NRF24L01+ radio and SPI bus. You might use it to add an additional SPI chip, such as a second radio or an LCD. The pins are (1) GND, (2) CE, (3) !CS, (4) SCK, (5) MOSI, (6) MISO, and (7) !IRQ. The !IRQ signal is only asserted by the NRF when configured to do so, so it might be coopted to act as a !CS pin for a second SPI device.
The fourth and final header is an 8-pin breakout of Port 3 of the MSP430 microcontroller. By default, it is used as a capacitive touch sensor, but it might also be used for any sort of I/O expansion. In addition to GPIO, some hardware-accelerated ports are available on these pins. I'll leave you to the datasheet to figure them out.
Compatible Hardware
The ANT wireless protocol can be implemented with the NRF24L01+ of this badge, though technical details on exactly how to do it are rather difficult to find. I'd be much obliged if some neighbors brought ANT equipment to the conference for the rest of us to play with.
Sparkfun offers a number of NRF24L01 modules, my favorite of which is a Key Fob Remote.
Stay Tuned
These details should help you to spend more time hacking, and less time researching, during the conference. A special prize will be given for the most original badge modification, with heavy credit going toward those of high technical caliber. There are also some secrets to be found within the badges, so best to bring b
I will be presenting this and several more tricks as a lecture during the conference, entitled "Building and Breaking the Next Hope Badge" at 22h00 on Saturday, July 17th in the Tesla room. There will also be a panel presentation entitled "The OpenAMD Project" at 18h00 on Friday, July 16th in the Lovelace room. Both rooms are on the 18th floor.
Monday, April 26, 2010
CUDA PTX Extraction
by Travis Goodspeed <travis at radiantmachines.com>
concerning CUDA 3.x on Darwin
as the first of several CUDA articles.
The following are some brief introductory notes on dumping PTX kernels from modern CUDA applications, as well as techniques for embedding them within new applications. One or two copies of every shader are stored as ASCII strings within each CUDA executable by default. With a little ingenuity, the contents of this article and a decent machine with a G80 card should provide a decent start in reverse engineering general-purpose GPU applications.
Nvidia's CUDA framework for GPU computing uses a portable meta-assembly language, PTX (ptx_isa_2.0.pdf), to facilitate translation between multiple GPU devices. In this manner, they can escape the backward compatibility issues that hold most modern CPU architectures to a single instruction set. PTX vaguely resembles the underlying machine code, but it lacks features which would tie it to any particular GPU. In this brief article, I present a trivial method for extracting PTX assembly from CUDA applications, as well as some pointers for merging that code into new applications.
Dumping
The screenshot below shows a fragment of libcublas.dylib from CUDA 3.0 in Snow Leopard being edited in Emacs. Following the dozens of assembler directives are individual VM opcodes. (This contains Basic Linear Algebra Subprograms. As this comes from Fortran, not C, the code is a bit weird.) By default, CUDA will compile inline code in a language similar to C into PTX assembly, then include the PTX assembly string verbatim into the resulting executable or library. User comments are not preserved, but compiler comments are introduced and names are unaltered. Except where hand-written, the compiler will always be nvopencc.
The nvopencc compiler has a number of quirks when writing PTX:
- Every line that is not a label is tabbed in at least once.
- The first directive is .version, the second is .target.
- Registers are declared in groups.
- Names are preserved, with C++ mangling.
Further,
- Every PTX script ends with a null (0x00) byte.
- While they can be big, no PTX script is larger than two megabytes.
- PTX scripts come in pairs, one for sm_20 and another for sm_10.
To dump the PTX code from a binary, Mach-O executable, just scan the input for long strings, printing everything starting with "\t.version". In my own setup, I have an ugly C program that prints these, passing them off to the unix split command for separation into multiple PTX files.
The 312 PTX scripts from CUBLAS are mostly small, with only nine of them having source in excess of a megabyte but none being larger than two megabytes. Thus, you'll need a rather long string buffer. Additionally, it is handy to purge the buffer when no fragment of a PTX executable is found and whenever a null byte is encountered. You can find the PTX from the vectorAdd example at http://pastebin.com/nqKqKhNc.
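My own dumper is an ugly C program that I will spare you, but the same scan can be sketched in a few lines of Python. This version reads the whole binary into memory rather than keeping a string buffer, looks for the "\t.version" marker, and cuts each script at the next null byte. The function names and output filenames are placeholders of my own choosing.

```python
import sys


def extract_ptx(blob):
    """Return every PTX script found in `blob` (bytes) as a list of strings."""
    marker = b"\t.version"
    scripts = []
    start = blob.find(marker)
    while start != -1:
        end = blob.find(b"\x00", start)   # each script ends with a null byte
        if end == -1:
            end = len(blob)               # tolerate a truncated final script
        scripts.append(blob[start:end].decode("ascii", errors="replace"))
        start = blob.find(marker, end)
    return scripts


def dump_file(path, prefix="kernel"):
    """Write each script in the binary at `path` to prefix_NNN.ptx."""
    with open(path, "rb") as handle:
        scripts = extract_ptx(handle.read())
    for index, script in enumerate(scripts):
        with open("%s_%03d.ptx" % (prefix, index), "w") as out:
            out.write(script)
    return len(scripts)


if __name__ == "__main__" and len(sys.argv) > 1:
    print(dump_file(sys.argv[1]), "PTX scripts written")
```

Reading the file in one gulp sidesteps the buffer sizing question entirely, at the cost of holding the whole library in memory.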
Applications can be compiled without PTX inclusion, using machine-language CUBIN files instead. This has the disadvantage of not being forward-compatible, and thanks to Wladimir J. van der Laan's Decuda project, it isn't much more difficult to read.
To try this out yourself, first build a dumping script based upon the CUDA examples and libraries. Once you have that, try downloading a few of the more advanced demos. The Nvidia Graphics Plus demos might be a good target, as would any game advertising CUDA support.
PTX JIT
Having dumped a PTX script, it is handy to link it back into an existing project. For this, you will want to use the matrixMulDynlinkJIT or ptxjit examples that come with the CUDA development kit. These projects use the cuModuleLoadDataEx() method to link a PTX script from a string, then cuModuleGetFunction() to grab a CUfunction pointer to any function.
Conveniently, the PTX scripts include symbol names, but as with any complex compiler, these have been somewhat mangled. In the vectorAdd example, the entry point has been mangled to _Z6VecAddPKfS0_Pfi for both sm_10 and sm_20. It is this function name, and not the simpler VecAdd, that must be passed to cuModuleGetFunction().
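To find the names worth passing to cuModuleGetFunction(), it helps to list the .entry directives in a dumped script. The Python sketch below does that, and also recovers the plain function name from simple mangled names of the form _Z, a length, then the name. It makes no attempt at namespaces, templates, or argument types, and the helper names are my own.

```python
import re

ENTRY = re.compile(r"\.entry\s+(\w+)")
SIMPLE_MANGLED = re.compile(r"_Z(\d+)")


def entry_points(ptx_text):
    """Return the mangled names of all kernels declared with .entry."""
    return ENTRY.findall(ptx_text)


def base_name(mangled):
    """Recover the source-level name from a simple mangled symbol.

    Names that do not look like _Z<length><name>... are returned unchanged.
    """
    match = SIMPLE_MANGLED.match(mangled)
    if not match:
        return mangled
    length = int(match.group(1))
    return mangled[match.end():match.end() + length]


if __name__ == "__main__":
    for name in ("_Z6VecAddPKfS0_Pfi", "_Z8myKernelPi"):
        print(name, "->", base_name(name))
```

The mangled form is still the one to hand to the driver; the short name is only for your own bookkeeping.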
The ptxjit example uses these calls to load a kernel named _Z8myKernelPi, which is contained within the myPtx[] character array. Looking at the string itself, which is defined within ptxjit.h, it can be seen that the code was rather hastily dumped by a method similar to the one I describe above.
Caveats
GPU programming is sufficiently confusing when source code is available that the lifting of code oughtn't be a concern. Generally only small fragments are executed within the GPU, with the majority of development time being spent debugging those fragments and twisting them for different physical optimizations.
Daniel Reynaud's talk on GPU Powered Malware at Ruxcon 2008 proposed that GPU programs might be useful for malware URL generation. It goes without saying that sophisticated malware will do better than to include an unencoded ASCII string. Pre-assembled bytecode can be provided directly to the card, avoiding the inclusion of a PTX string. While some of Reynaud's points are less relevant now that CUDA has debugging and bytecode emulation, the core of his argument that GPU packers will become important is still valid. For starters, it is possible to use a pegged memory segment to have GPU code rewrite host X86 code on the fly without a context switch!
Expect some follow-up articles on the neighborly things that can be done once your hands are inside the beast that is CUDA.
Wednesday, March 24, 2010
Smartgrid Skunkworks
Dearest engineers and hackers, and also their management,
Recent vulnerabilities found in smart meters and HAN devices have shown a number of weaknesses in the engineering practices used to build these devices and their constituent components. A vulnerability in a chip or library is fixed slowly, and it is a very rare event that the meter and thermostat vendors affected by the vulnerability are notified by their suppliers. Because of this, vulnerabilities are spreading downward through the supply chain, and the engineers of smart grid devices are left uninformed.
While those utilities that actively investigate security have a considerable amount of bargaining power with their immediate suppliers, the rest of the supply chain has no similar leverage to compel security notifications. Chip and library vendors are failing to notify the meter vendors that depend upon their components. Even when the meter vendors are notified directly of vulnerabilities, thermostat and other HAN vendors can have no realistic expectation of such a privilege.
Despite having found many vulnerabilities in microcontrollers and LPAN radio chips, I have never seen one single security issue mentioned in the errata sheets of these devices. It has been a year since I first reported to Texas Instruments that the RAM of their Chipcon 8051 core is exposed to an attacker, but there's not one scrap of documentation from the firm to its customers suggesting that they make the simple patch of moving the key variables to Flash memory. The example ZigBee stack for the chip is still vulnerable to this attack, even after recent patches! A year later, exactly two debugger commands are all that are required to extract keys from nearly every ZigBee SEP device with a Chipcon radio, and no one knows to patch their code! (Do not be smug if you are an Ember customer. The EM2xx chips are unpatchably vulnerable to debugger key extraction, and there is no mention of this in the chip's errata sheet either.)
As chip and library vendors have failed to document the publicly known vulnerabilities in their products, and as they have often been unable or unwilling to repair them, the most expedient remedy to this problem is a separate line of communication. At least one point of reference must exist for the engineers trying to build these products.
For these reasons, I have created a skunkworks mailing list for the announcement and discussion of smart grid vulnerabilities, particularly but not exclusively those in AMI equipment. This is to be a list for engineering discussion, by engineers and security researchers. Anonymous posts and lurking are welcome, but politics and committee items are not.
For this reason, I especially request that those firms which care about security ask--or perhaps even require--their engineering staff to subscribe. This list is the appropriate place to post questions concerning the secure use of a particular radio chip, fragment of code, or anything else which is too low level or vendor-specific to be mentioned in standards.
If your firm is unwilling to allow its engineers to post, please at least compel them to follow the posts of others. In saying nothing, they will still learn how to make more secure products along with all sorts of fascinating gossip about your competitors. Your firm has every right to keep its mouth shut, but keeping its ears shut is a betrayal of each and every one of your customers.
To kickstart this mailing list, I will make it my first site of public disclosure for smart grid vulnerabilities over the coming months. The subscription link is below, and I invite you to join me in preventing smart grid vulnerabilities before they are created.
http://groups.google.com/group/smartgrid-skunkworks
Thank you kindly,
--Travis Goodspeed
Belt Buckle Engineer
Security Hobbyist
Tuesday, March 9, 2010
IM ME GoodFET Wiring Tutorial
by Travis Goodspeed <travis at radiantmachines.com>
concerning the Girltech IM ME,
with a million thanks to Dave.
WARNING: Reflashing the CC1110 while batteries are low will permanently lock the chip. Either be damned sure to use fresh batteries or leave the batteries out and power the IM ME from your GoodFET.
Howdy y'all,
This brief tutorial describes the process of reflashing the Girltech IM ME with custom firmware, so that it may be used as a development platform for the Chipcon CC1110 sub-GHz ISM System-on-Chip. I assume the reader to have an assembled GoodFET with recent firmware, but other programmers may of course be substituted.
You should also read Dave's first article on IM ME hacking, as it describes his method for reprogramming the device. All the pinouts below were taken from his articles, as well as the keyboard and LCD information that he was so neighborly as to publish.
Wiring
First, you'll need to purchase an IM ME, which can be had for $20 USD on a few toy sites while it remains in stock. You'll also need an assembled GoodFET and basic electronics tools.
The testpoints used for programming the IM ME are located behind the batteries in the rear compartment of the device. Ideally, a bed of nails should be used to clip onto them, but failing that, just solder onto the Debug Data (DD), Debug Clock (DC), Reset (!RST), and GND pins. Run these to the GoodFET's 14-pin header as shown below.
From left to right on the IM ME, the pins are !RST, DD, DC, +2.5V, and Ground. Because the GoodFET is a low-voltage device, there's no need for the resistor dividers in Dave's article. Use EITHER the GoodFET OR the batteries for VCC, but not both.
Name | Pin | Pin | Name
---|---|---|---
DD | 1 | 2 | Vcc
 | 3 | 4 | Vcc
RST | 5 | 6 |
DC | 7 | 8 |
GND | 9 | 10 |
 | 11 | 12 |
 | 13 | 14 |
Flashing
Once you have the IM ME wired up, you can check its model number and status by running 'goodfet.cc status'. This will tell you that the chip is locked, so making a backup of its firmware is non-trivial. If you continue from here, the IM ME will no longer function as an instant messenger.
Erase the chip with 'goodfet.cc erase', then dump an image of RAM with 'goodfet.cc dumpdata immeram.hex' to see if anything neighborly can be found inside.
You now have a blank IM ME, with the LCD most likely showing the last gasping breaths of its firmware. To flash a new firmware image, just grab its ihex file and run 'goodfet.cc flash foo.hex'.
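For those who would rather script the sequence than type it, here is a rough sketch. The subcommands (status, erase, dumpdata, flash) are the ones named in this article; the helper names goodfet_steps and run_steps are hypothetical, and the script assumes that goodfet.cc is on your PATH. It defaults to a dry run that only prints the commands, because the erase step cannot be undone.

```python
import subprocess


def goodfet_steps(firmware, ram_dump="immeram.hex"):
    """Return the goodfet.cc invocations in the order used in this article."""
    return [
        ["goodfet.cc", "status"],              # check model number and lock status
        ["goodfet.cc", "erase"],               # destroys the stock firmware
        ["goodfet.cc", "dumpdata", ram_dump],  # dump an image of RAM
        ["goodfet.cc", "flash", firmware],     # write the new Intel Hex image
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in steps:
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_steps(goodfet_steps("goodme/bins/dave-lcdtest.hex"))
```

Pass dry_run=False to run_steps only once the wiring and power supply have been double-checked.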
I've placed a few example binaries in the repository of an operating system that I've started for the IM ME called GoodME. To flash Dave's LCD Test, run the following commands.
svn co https://goodfet.svn.sourceforge.net/svnroot/goodme
goodfet.cc flash goodme/bins/dave-lcdtest.hex
For a more functional demo, try bins/term-morse824mhz.hex, an ugly hack of an operating system for the IM ME with a Morse code transmitter and random number generator demo. In the Radio demo, holding any of the letter buttons broadcasts on 824MHz. The PRNG demo, shown below, demonstrates the repetition of strings within the pseudo-random number generator and counts the number of bytes between them. This generator is sometimes used for key material.
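The idea behind the PRNG demo, looking for a string that shows up twice and counting the bytes between the two sightings, can be sketched on a desktop. This is not the firmware's code: first_repeat is a hypothetical helper, and weak_prng is a made-up 8-bit generator standing in for whatever poor source you wish to examine.

```python
def first_repeat(data, window=4):
    """Return (first_offset, repeat_offset) for the first window-byte string
    that occurs twice in data, or None if no string repeats."""
    seen = {}
    for i in range(len(data) - window + 1):
        chunk = bytes(data[i:i + window])
        if chunk in seen:
            return seen[chunk], i
        seen[chunk] = i
    return None


def weak_prng(seed, count):
    """Hypothetical 8-bit linear congruential generator, used only as a
    stand-in for a generator with a short period."""
    state = seed & 0xFF
    out = bytearray()
    for _ in range(count):
        state = (state * 5 + 1) & 0xFF
        out.append(state)
    return bytes(out)


if __name__ == "__main__":
    stream = weak_prng(seed=1, count=1024)
    hit = first_repeat(stream)
    if hit is None:
        print("no repeated string found")
    else:
        first, again = hit
        print("string at offset %d repeats %d bytes later" % (first, again - first))
```

A short distance between repeats is a strong hint that the stream should not be trusted for keys.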
Custom Development
The SDCC compiler is in the package repositories of most civilized operating systems. You might need a more recent version for the cc1110.h header, though building this compiler is a thousand times simpler than building GCC. Compiling an example is as simple as 'sdcc foo.c; packihx <foo.ihx >foo.hex', which will produce a suitable Intel Hex file for flashing. The 8051 memory model makes specifying a chip model unnecessary, a handy change for those of us with a thousand MSP430 linking scripts.
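Before flashing, it can be handy to sanity-check the Intel Hex file that packihx produced. The sketch below relies only on the record layout of the Intel Hex format (a colon, then length, 16-bit address, record type, data, and a checksum that makes all bytes sum to zero modulo 256). The function names are hypothetical, and the sample records are illustrative rather than taken from any IM ME image.

```python
def check_ihex_line(line):
    """Return (address, record_type, data) for one Intel Hex record,
    raising ValueError if it is malformed or its checksum is wrong."""
    line = line.strip()
    if not line.startswith(":"):
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:])
    # Five bytes of overhead: length, two address bytes, type, checksum.
    if len(raw) < 5 or raw[0] != len(raw) - 5:
        raise ValueError("length field does not match record size")
    if sum(raw) & 0xFF != 0:
        raise ValueError("bad checksum")
    address = (raw[1] << 8) | raw[2]
    return address, raw[3], raw[4:-1]


def ihex_data_size(lines):
    """Count the data bytes (record type 0) in an iterable of records."""
    total = 0
    for line in lines:
        if line.strip():
            _, record_type, data = check_ihex_line(line)
            if record_type == 0x00:
                total += len(data)
    return total


if __name__ == "__main__":
    sample = [":0300300002337A1E", ":00000001FF"]
    print("%d data bytes" % ihex_data_size(sample))
```

To check a real image, pass the open file object to ihex_data_size, for example ihex_data_size(open("foo.hex")).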
Within the GoodME repository, you'll find my bastard child of an operating system at /branches/rough/. It was used to make the term-morse824mhz.hex, and its keyboard, font, and LCD drivers are ripe for organ transplants. /trunk/ ought to someday contain a proper operating system for the device, but for now, I haven't the time to complete it.
Have fun, and build something neighborly,
--Travis
Sunday, February 21, 2010
XML of SmartRF Studio 7
by Travis Goodspeed <travis at radiantmachines.com>
concerning TI SmartRF Studio 7.
For those who have not personally suffered the experience, choosing radio register values is an absolute pain. In this brief article, I will demonstrate a method for extracting settings in bulk from SmartRF 7 Studio for use in other projects.
Suppose that a program should configure a CC1110 to operate on a particular frequency, in a particular encoding, etc. The code will look like the following, which is from the CC1110 examples. This will generate a carrier wave near 823 MHz, as the 2.433GHz FREQ[] value is outside of the allowed range.
To choose a different center frequency, the engineer is expected to either find the register's definition within the datasheet or use SmartRF Studio, a screenshot of which is below. The software is a Windows application for communicating with a radio through a serial port. It also allows an engineer to load preconfigured profiles for certain types of modulation, such as IEEE 802.15.4 or SimpliciTI. Once loaded, the settings may be tweaked and loaded into a packet-sniffer firmware for debugging projects.
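For the simple case of the center frequency alone, the arithmetic can be done by hand. As I read the datasheet, the 24-bit word spread across FREQ2, FREQ1, and FREQ0 is the carrier frequency divided by the reference frequency and multiplied by 2^16. The sketch below assumes a 26 MHz reference crystal; both that default and the function names are assumptions, so check them against your own board and the datasheet before trusting the output.

```python
def freq_registers(target_hz, xtal_hz=26000000):
    """Return (FREQ2, FREQ1, FREQ0) for the requested carrier frequency.
    xtal_hz is the reference frequency, assumed here to be 26 MHz."""
    word = int(round(target_hz * 65536.0 / xtal_hz))
    if not 0 <= word < (1 << 24):
        raise ValueError("frequency does not fit in the 24-bit FREQ word")
    return (word >> 16) & 0xFF, (word >> 8) & 0xFF, word & 0xFF


def carrier_hz(freq2, freq1, freq0, xtal_hz=26000000):
    """Inverse of freq_registers: the carrier implied by the three bytes."""
    word = (freq2 << 16) | (freq1 << 8) | freq0
    return word * xtal_hz / 65536.0


if __name__ == "__main__":
    regs = freq_registers(868000000)
    print("FREQ2=0x%02X FREQ1=0x%02X FREQ0=0x%02X" % regs)
    print("actual carrier: %.0f Hz" % carrier_hz(*regs))
```

This only computes the word; whether the synthesizer will actually lock at that frequency depends on the chip's supported bands, which the sketch does not check.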
The screenshot above, from SmartRF 6, can provide a decent idea of how infuriating the software might become. For product development, the register settings are to be copied and pasted into opaque source files. It can instill the same sort of anger as an uncooperative web form, even after the old Visual Studio 6 libraries have been located to make it work under Wine.
SmartRF 7--at Beta 0.9.1 as of this writing--is a new client under active development, one which seems to have been rewritten from scratch or at least significantly refactored. The old Win32 GUI has been replaced with Qt 4, making a future Linux or Mac port possible. Most important of all, however, the default configurations, as well as the register explanations, are stored as XML.
The screenshot above shows how default configurations are chosen for the target radio. Users can select an example, then change any of the settings they like. These configurations are further abstracted into an "Easy Mode" which hides all the messy minutiae of radio, reducing its use to standards and channel numbers.
These configuration settings are to be found as XML in the config/xml/ directory, organized by chip and presentation. The following is an example of one such configuration file for the CC1110.
Patching these configurations allows for the easy addition of new standards. For example, I added support for my Tennessee Belt Buckle radio by adding a new stanza to easymode_settings.xml.
The register definitions and their bitfields are also defined in XML. The following, from register_definition.xml for the CC1110, describes the FREQ2 register's meaning.
Printing SDCC special-function register definitions, such as those found in cc1110.h, is as easy as a bit of Python magic:
#!/usr/bin/python
# Simple script for dumping Chipcon register definitions to SDCC header.
# Only intended as a rough example.
# by Travis Goodspeed, Engineer of Superior Belt Buckles
import xml.dom.minidom


def get_dom(chip="cc1110", doc="register_definition.xml"):
    # Adjust this path to wherever SmartRF Studio 7 is installed.
    fn = "/opt/smartrf7/config/xml/%s/%s" % (chip, doc)
    return xml.dom.minidom.parse(fn)


dom = get_dom()
for e in dom.getElementsByTagName("registerdefinition"):
    for f in e.childNodes:
        if f.localName == "DeviceName":
            print("// %s=%s" % (f.localName, f.childNodes[0].nodeValue))
        elif f.localName == "Register":
            # Placeholders, used when a child element is missing.
            name = "unknownreg"
            address = "0xdead"
            description = ""
            for g in f.childNodes:
                if g.localName == "Name":
                    name = g.childNodes[0].nodeValue
                elif g.localName == "Address":
                    address = g.childNodes[0].nodeValue
                elif g.localName == "Description":
                    if g.childNodes:
                        description = g.childNodes[0].nodeValue
            # One SFRX() line per register, with its description as a comment.
            print("SFRX(%10s, %s); /* %50s */" % (name, address, description))
These configuration files should allow for new Chipcon radio applications and development tools to be written in short order. Please contact me if you would be so kind as to write a complete Python class for querying this data.
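As a starting point for such a class, here is a minimal sketch. The element names (DeviceName, Register, Name, Address, Description) are those the script above reads; everything else is an assumption, including the class name RegisterFile, the embedded SAMPLE document, and the guess that addresses are written as hexadecimal literals. The real files may be laid out differently.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; not copied from SmartRF Studio's files.
SAMPLE = """<registerdefinition>
  <DeviceName>CC1110</DeviceName>
  <Register>
    <Name>FREQ2</Name>
    <Address>0xDF09</Address>
    <Description>Frequency control word, high byte</Description>
  </Register>
  <Register>
    <Name>FREQ1</Name>
    <Address>0xDF0A</Address>
  </Register>
</registerdefinition>"""


class RegisterFile(object):
    """Hypothetical lookup table built from a register definition document."""

    def __init__(self, root):
        self.device = root.findtext(".//DeviceName", default="unknown").strip()
        self.registers = {}
        for reg in root.iter("Register"):
            name = reg.findtext("Name")
            if name is None:
                continue  # skip entries without a name
            address = reg.findtext("Address", default="0").strip()
            self.registers[name.strip()] = {
                # Base 0 lets int() accept a 0x prefix (assumed format).
                "address": int(address, 0),
                "description": (reg.findtext("Description") or "").strip(),
            }

    @classmethod
    def from_string(cls, text):
        return cls(ET.fromstring(text))

    @classmethod
    def from_file(cls, path):
        return cls(ET.parse(path).getroot())

    def address(self, name):
        return self.registers[name]["address"]


if __name__ == "__main__":
    regs = RegisterFile.from_string(SAMPLE)
    print("%s FREQ2 at 0x%04X" % (regs.device, regs.address("FREQ2")))
```

RegisterFile.from_file would take the path to a register_definition.xml, though it has only been exercised here against the embedded sample.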