at the Extreme Measurement Communications Center
of the Oak Ridge National Laboratory
Recently, I released MSP430static, a tool for reverse engineering MSP430 firmware. This article is a tutorial on the installation and usage of the tool.
Installation
As a tool under active development, msp430static is very rarely distributed in packaged form. Just grab the latest code from Subversion, like so:
mil% svn co https://msp430static.svn.sourceforge.net/svnroot/msp430static msp430static
Once the code has been checked out, install it like so:
mil% cd msp430static/trunk
mil% sudo make install
ln -s `pwd`/msp430static.pl /usr/local/bin/msp430static
ln -s `pwd`/msp430static.pl /usr/local/bin/m4s
mil%
When installing from the trunk version, links, rather than copies, will be made. This ensures that upgrading is as simple as running 'svn update'.
Once the application is installed, it'll likely be necessary to chase prerequisites. You will need mspgcc. Run 'm4s' to see the error message, then follow the installation procedures for your operating system. The following example is from a Gentoo machine.
mil% m4s
install_driver(SQLite) failed: Can't locate DBD/SQLite.pm in @INC (@INC contains: /etc/perl
/usr/lib/perl5/site_perl/5.8.6/i686-linux /usr/lib/perl5/site_perl/5.8.6 /usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.6/i686-linux /usr/lib/perl5/vendor_perl/5.8.6 /usr/lib/perl5/vendor_perl
/usr/lib/perl5/5.8.6/i686-linux /usr/lib/perl5/5.8.6 /usr/local/lib/site_perl .) at (eval 2) line 3.
Perhaps the DBD::SQLite perl module hasn't been fully installed,
or perhaps the capitalisation of 'SQLite' isn't right.
Available drivers: DBM, ExampleP, File, Proxy, Sponge, mysql.
at /usr/local/bin/m4s line 297
mil% sudo emerge dev-perl/DBD-SQLite
...
mil% m4s
mil%
Once you've chased all of the absolute prereqs, the command with no arguments should merely return. At this point, you can begin to play with the tool.
Organization
Before we use the tool, it helps to explain how it is organized. MSP430static is a Perl script which wraps an SQLite3 database. It is called either from the unix shell or from its own shell, and a lot of work can be performed by macros and subs. Macros are short, parameter-less blocks of code written in Perl, shell script, or SQL. Subs are SQL functions written in Perl; support for other languages will be added soon. (I'm writing this before feature-creep sets in. Check the formal documentation for whatever is new.)
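To make the idea of a "sub" concrete, here is a minimal sketch of the same mechanism in Python rather than Perl: a host-language function registered with SQLite so that SQL queries can call it. The names dehex and enhex come from the tool's sub list, but these bodies are my own guesses at their behaviour (for example, the real enhex may pad its output differently), and open_db is a hypothetical helper.

```python
import sqlite3

def dehex(text):
    """Convert a hex string such as '4118' to an integer."""
    return int(text, 16)

def enhex(number):
    """Convert an integer to a lowercase hex string without a prefix."""
    return format(number, "x")

def open_db(path=":memory:"):
    """Open a database and register both helpers as SQL functions (hypothetical helper)."""
    db = sqlite3.connect(path)
    db.create_function("dehex", 1, dehex)
    db.create_function("enhex", 1, enhex)
    return db

db = open_db()
# Once registered, the functions can be used in any query on this connection.
row = db.execute("select dehex('4118'), enhex(16664)").fetchone()
print(row)
```

The point of the design is that address arithmetic stays inside SQL: a query can compare against dehex('4118') instead of forcing the caller to convert addresses by hand.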
The database file resides in the current working directory and is always named 430static.db. The database contains tables (code,funcs,symbols) for managing a working codebase. Other tables (lib) store library functions for symbol identification. Others (macros, subs) store macros and subroutines which extend the command language of the interpreter.
Usage
Let's begin analyzing some code. In my case, I'll begin my database with the TinyOS Blink example.
mil% msp430-objdump -D /opt/tinyos-2.x/apps/Blink/build/telosb/main.exe | m4s init
mil% m4s .summary
/home/travis/svn/msp430static/trunk/msp430static.pl
1000 instructions
33 functions from 1100 to 4a64
0 of 0 library functions found
73 distinct memory locations are poked.
0 lib functions.
0 unique lib function names.
0 unique lib function checksums.
mil%
This initializes the database with the dumped executable code, then calls a database summary. As this example includes symbol names, I need no library functions. Just to show off the library features, let's see which functions from libc exist here.
mil% m4s .lib.import.gnu >>/dev/null
mil% m4s .summary
/home/travis/svn/msp430static/trunk/msp430static.pl
1000 instructions
33 functions from 1100 to 4a64
1 of 2099 library functions found
73 distinct memory locations are poked.
2099 lib functions.
587 unique lib function names.
543 unique lib function checksums.
mil%
Now that we know a function has been found, let's take a look at it.
mil% m4s shell
m4s sql> .funcs.inlibs
4118 __clear_cache
m4s sql> select asm from funcs where address=dehex('4118');
4118: 30 41 ret
m4s sql> select name from funcs where address=dehex('4118');
Msp430TimerP$1$Event$default$fired
m4s sql>
Using the msp430static shell in the example above, the first query lists all recognized functions and their hexadecimal addresses. The second grabs the code of that function. The third grabs the symbol name that shipped with the executable. It's clear that this recognition is merely coincidence: Msp430TimerP$1$Event$default$fired is mistaken for __clear_cache because both do exactly the same thing, a simple return. Let's drop the GNU stuff and load only the TinyOS files.
mil% m4s shell
m4s sql> delete from lib;
m4s sql> .lib.import.tinyos
...
#887 lib functions.
#213 unique lib function names.
#213 unique lib function checksums.
m4s sql> .funcs.inlibs
403a __ctors_end
403e _unexpected_
404c __nesc_atomic_start
4060 __nesc_atomic_end
4084 Msp430TimerCapComP$0$Event$fired
4094 Msp430TimerCapComP$1$Event$fired
40a4 Msp430TimerCapComP$2$Event$fired
4118 Msp430TimerP$1$Event$default$fired
43a4 Msp430TimerP$1$Timer$get
454c Msp430ClockP$set_dco_calib
4568 MotePlatformC$TOSH_FLASH_M25P_DP_bit
4118 SchedulerBasicP$TaskBasic$default$runTask
4a62 __stop_progExec__
41da SchedulerBasicP$TaskBasic$postTask
432e TransformCounterC$0$Counter$get
473e TransformAlarmC$0$Alarm$startAt
477c AlarmToTimerC$0$fired$runTask
47c0 VirtualizeTimerC$0$Timer$startPeriodic
4986 SchedulerBasicP$TaskBasic$runTask
499e McuSleepC$getPowerState
m4s sql>
This time, many more functions are identified. If the Blink executable still remains, they ought to all be recognized. Supposing that you sell a proprietary library for the MSP430, msp430static makes it trivial to catch copyright violators. By loading suspect firmware and calling the .funcs.inlibs macro, in seconds you can determine whether your library is being used.
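The library check boils down to a join on checksums. The sketch below replays the query behind .funcs.inlibs against a toy in-memory database. The column names (address, name, checksum) are taken from the queries shown in this article, but the real tables probably carry more columns, and every row and checksum string here is a made-up placeholder.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Stand-in for the tool's enhex sub; the real one may format differently.
db.create_function("enhex", 1, lambda n: format(n, "x"))
db.executescript("""
    create table funcs (address integer, name text, checksum text);
    create table lib (name text, checksum text);
""")
# Hypothetical firmware functions with placeholder checksums.
db.executemany("insert into funcs values (?, ?, ?)", [
    (0x404c, "unknown", "aaa"),
    (0x4060, "unknown", "bbb"),
    (0x43dc, "unknown", "zzz"),
])
# Hypothetical library entries; only two share a checksum with the firmware.
db.executemany("insert into lib values (?, ?)", [
    ("__nesc_atomic_start", "aaa"),
    ("__nesc_atomic_end", "bbb"),
    ("some_unused_function", "ccc"),
])
# Same join as the .funcs.inlibs macro, with an ordering added for stable output.
matches = db.execute(
    "select distinct enhex(f.address), l.name "
    "from lib l, funcs f where f.checksum = l.checksum "
    "order by 1"
).fetchall()
for address, name in matches:
    print(address, name)
```

A non-empty result means at least one function body in the firmware has the same checksum as a library function, subject to the kind of coincidence seen earlier with trivial functions.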
Let's try things from the other side of the fence, though. Suppose you have firmware that you'd like to reverse engineer. How much can we determine without symbol names? A callgraph, such as this one, is easy enough to generate with the .callgraph.* macros. .callgraph.xview or .callgraph.kgv will display the graph, and my PDF was generated by the following:
karen% m4s .callgraph.ps >foo.ps
karen% ps2pdf foo.ps
karen%
Supposing I wanted to see what an attacker could determine of my application, knowing only the standard libraries but nothing of my source code, I could run the following:
karen% m4s shell
m4s sql> update funcs set name='unknown';
m4s sql> .symbols.recover
m4s sql>
karen%
The callgraph, available here, shows that some, but not all, of the functions are identified. (In practice, the example projects ought to be completely identified. Any private functions, not imported by the script, will not be shown.)
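I haven't shown what .symbols.recover does internally, so treat the following as a guess at the general idea rather than the macro's source: copy a library name onto every function whose checksum matches, and leave the rest alone. It also shows one wrinkle visible in the earlier listing, where address 4118 matched two library names. This sketch arbitrarily keeps the alphabetically first one. The checksum strings are placeholders.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    create table funcs (address integer, name text, checksum text);
    create table lib (name text, checksum text);
    -- 16664 is 0x4118 and 17372 is 0x43dc; checksums are placeholders.
    insert into funcs values (16664, 'unknown', 'ret-only');
    insert into funcs values (17372, 'unknown', 'private');
    insert into lib values ('Msp430TimerP$1$Event$default$fired', 'ret-only');
    insert into lib values ('SchedulerBasicP$TaskBasic$default$runTask', 'ret-only');
""")
# Rename only functions whose checksum appears in lib; min() breaks ties.
db.execute("""
    update funcs
       set name = (select min(l.name) from lib l
                    where l.checksum = funcs.checksum)
     where checksum in (select checksum from lib)
""")
names = dict(db.execute("select address, name from funcs"))
print(names)
```

The function at 0x43dc stays 'unknown' because nothing in lib shares its checksum, which is the situation private functions are in.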
It's also important to note that function inlining can make fingerprinting difficult. As TinyOS inlines functions by default, the same function might be inlined in one example and not in another. (At present, inlined functions cannot be automatically recognized.)
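For intuition about what a position-invariant fingerprint means, here is a deliberately simplified sketch: hash the instruction bytes of an objdump-style listing while ignoring the address column. This is not how msp430static's fprint sub or checksum column is actually computed (the real thing presumably also has to cope with address-dependent operands); toy_checksum and the line format it parses are assumptions for illustration.

```python
import hashlib
import re

# An objdump-style line: hex address, colon, then space-separated byte pairs.
LINE = re.compile(r"^\s*[0-9a-f]+:\s+((?:[0-9a-f]{2}\s)+)")

def toy_checksum(asm):
    """Hash only the instruction bytes, ignoring each line's address."""
    digest = hashlib.sha256()
    for line in asm.splitlines():
        match = LINE.match(line)
        if match:
            hex_bytes = "".join(match.group(1).split())
            digest.update(bytes.fromhex(hex_bytes))
    return digest.hexdigest()

# Two functions that are each a lone 'ret' at different addresses.
same_a = toy_checksum("4118: 30 41 ret")
same_b = toy_checksum("5000: 30 41 ret")
# A function with a different body.
other = toy_checksum("4060: 32 d2 eint\n4062: 30 41 ret")
print(same_a == same_b, same_a == other)
```

Under this scheme a function that has been inlined into its caller never appears as a standalone byte sequence, so there is nothing for the fingerprint to match.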
Macros and Subs
I haven't room here to enumerate all the features of msp430static, but luckily it will enumerate them for you. The macro .macros will list all macro names and comments. .subs will list all subroutines and comments.
karen% m4s .macros
.callgraph          Dump a digraph call tree for graphviz.
.callgraph.gv       View a callgraph in ghostview.
.callgraph.kgv      View a callgraph in kghostview.
.callgraph.lp       Print callgraph for US Letter.
.callgraph.ps       Postscript callgraph, sized for US Letter.
.callgraph.xview    View a callgraph in xview.
.code.switches      List branches belonging to jump-table switch statements.
.export.aout        Dumps the project an a.out executable.
.export.ihex        Dumps the project as an Intel Hex file.
.export.srec        Dumps the project as a Motorolla SRec file.
.funcs.inlibs       List functions which appear in libraries.
.funcs.outside      List instructions where are not part of any function.
.funcs.overlap      List overlapping function addresses.
.lib.import.gnu     Import mspgcc libraries from /usr/local/msp430.
.lib.import.tinyos  Import mspgcc libraries from /usr/local/msp430.
.macros             Lists all available macros.
.memmap.gd.eog      View a callgraph in Eye of Gnome.
.memmap.gd.gif      Output a GIF drawing of memory.
.memmap.gd.jpeg     Output a JPEG drawing of memory.
.memmap.gd.png      Output a PNG drawing of memory.
.memmap.gd.xview    View a callgraph in xview.
.memmap.pstricks    Output a LaTeX drawing of memory.
.missing            Default macro, run whenever a missing macro is called.
.subs               Lists all additional SQL functions.
.summary            Output a summary of the database contents.
.symbols.recover    Recover symbol names from libraries.
karen% m4s .subs
addr2func       Returns the starting address of the function containing the given address.
addr2funcname   Returns the name of the function containing the given address.
callgraph       Returns a graphviz callgraph.
dehex           Converts a hex string to a numeral.
enhex           Converts a numeral to a hex string.
fprint          Position-invariant fingerprint of an assembly code string.
to_ihex         Returns a line of code as an Intel Hex entry. [broken]
karen%
These are just rows in the database, so new subs and macros may be written from SQL. The source code of a sub or macro may be retrieved with a SELECT statement. (A few of these call functions in msp430static.pl.)
In the following example, I add a new macro function which lists the functions which have not been identified in the library.
m4s sql> select * from macros where name like '.funcs.inlibs';
.funcs.inlibs
sql
List functions which appear in libraries.
select distinct enhex(f.address), l.name from lib l,funcs f where f.checksum=l.checksum;
m4s sql> select distinct enhex(f.address),f.name from funcs f where
f.checksum not in (select checksum from lib);
43dc unknown
411a unknown
4230 unknown
48fe unknown
4810 unknown
4580 unknown
45ce unknown
4686 unknown
4886 unknown
40b4 unknown
4068 unknown
43b8 unknown
40fa unknown
4000 unknown
m4s sql> insert into macros values('.funcs.notinlibs',
'sql',
'List functions which do not appear in libraries.',
'select distinct enhex(f.address),f.name from funcs f where
f.checksum not in (select checksum from lib);');
m4s sql> .funcs.notinlibs
43dc unknown
411a unknown
4230 unknown
48fe unknown
4810 unknown
4580 unknown
45ce unknown
4686 unknown
4886 unknown
40b4 unknown
4068 unknown
43b8 unknown
40fa unknown
4000 unknown
m4s sql>
Macros may be written in Perl, SQL, or unix shell script. Subs work similarly, but there's only Perl support at the moment.
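Because a macro is just a row holding a name, a language, a comment, and some code, dispatching one is little more than a lookup followed by an execute. The sketch below illustrates that idea in Python for SQL macros only. The column names other than name are my own labels for the four values in the insert above, run_macro is a hypothetical helper, and the real tool falls back to the .missing macro where this sketch simply raises an error.

```python
import sqlite3

def run_macro(db, name):
    """Look up a macro row by name and run it if it is written in SQL."""
    row = db.execute(
        "select lang, code from macros where name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)  # the real tool would run .missing instead
    lang, code = row
    if lang != "sql":
        raise NotImplementedError(lang)  # perl and shell macros are out of scope here
    return db.execute(code).fetchall()

db = sqlite3.connect(":memory:")
db.executescript("""
    create table macros (name text, lang text, comment text, code text);
    create table funcs (address integer, name text, checksum text);
    create table lib (name text, checksum text);
    -- Placeholder rows: 16384 is 0x4000 and 16460 is 0x404c.
    insert into funcs values (16384, 'unknown', 'x1');
    insert into funcs values (16460, 'unknown', 'x2');
    insert into lib values ('__nesc_atomic_start', 'x2');
""")
db.execute("insert into macros values (?, ?, ?, ?)", (
    ".funcs.notinlibs", "sql",
    "List functions which do not appear in libraries.",
    "select f.address, f.name from funcs f "
    "where f.checksum not in (select checksum from lib)",
))
result = run_macro(db, ".funcs.notinlibs")
print(result)
```

Keeping the macro source in the same database as the data means a project file carries its own tooling along with it.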
Further Usage
This ends the tutorial, but you should play around with the macros and subroutines further. Try writing a few of your own, and email them to me if they're interesting.