Tuesday, December 30, 2008

Sniffing the MSP430 FET Protocol

by Travis Goodspeed <travis at radiantmachines.com>
regarding his independent recreation of anonymous, unpublished work by other neighborly fellows. Update: see here for the original implementation.

As the MSP430's gdbproxy relies upon closed-source libraries, which are not available for platforms other than Windows and i386 Linux, it would be valuable to write an open-source alternative. Further, these closed libraries do not allow the debugging of all MSP430s. As reverse engineering the protocol by code review would be prohibitively complicated, grabbing serial traffic is an effective alternative. The following method allows for the dumping of serial frames for later analysis.

It is also a worthy goal to reverse engineer the proprietary aspects of TI's JTAG standard, but that is not the subject of this article. Here I will only investigate the protocol between a workstation and the MSP430 JTAG tool.

The simplest means of doing so is by using LD_PRELOAD to proxy--and print--calls to the write() and read() methods. To do so, I authored serspy.

Serspy works by proxying each call to the read() and write() system calls. For example, to trap the write() command,

//RTLD_NEXT needs _GNU_SOURCE, defined before any include.
#define _GNU_SOURCE
#include <dlfcn.h>
#include <unistd.h>

//trap the write command
static ssize_t (*_write)(int fd, const void *buf, size_t count)=0;
ssize_t write(int fd, const void *buf, size_t count){
  ssize_t num;

  //This grabs a pointer to the original function.
  if(!_write)
    _write=(ssize_t (*) (int fd, const void *buf, size_t count)) dlsym(RTLD_NEXT,"write");

  //Now really write.
  num=_write(fd,buf,count);

  //And log it.
  logdataw(fd,buf,count);
  return num;
}

The code above is a replacement for the write() function which uses dlsym() to request a pointer to the original write() function, to which it forwards the call before logging it. As msp430-gdbproxy doesn't link against libdl and it is a 32-bit application, LD_PRELOAD must be set to ``./serspy.so:/usr/lib32/libdl.so''. Further, serspy.so itself must be compiled with the -m32 switch on 64-bit workstations.
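
The logging routine itself is not shown above. As a rough sketch only, assuming it prints a direction letter followed by space-separated hex bytes (the layout of the traces in this article), it might look something like the following. The name format_frame() is hypothetical, and this logdataw() is a stand-in rather than serspy's actual code.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: format one transfer as "<dir> xx xx ...".
 * Returns the number of characters written (excluding the NUL),
 * or -1 if the output buffer is too small. */
int format_frame(char dir, const unsigned char *buf, size_t count,
                 char *out, size_t outsz)
{
    size_t pos = 0;
    size_t i;

    /* One direction character, three characters per byte, one NUL. */
    if (outsz < 1 + 3 * count + 1)
        return -1;

    out[pos++] = dir;
    for (i = 0; i < count; i++)
        pos += (size_t)snprintf(out + pos, outsz - pos, " %02x",
                                (unsigned)buf[i]);
    out[pos] = '\0';
    return (int)pos;
}

/* Stand-in for serspy's logdataw(); the original is not reproduced here. */
void logdataw(int fd, const void *buf, size_t count)
{
    char line[3 * 256 + 2];

    (void)fd; /* A real logger would likely filter on the serial port's fd. */
    if (format_frame('W', buf, count, line, sizeof line) >= 0)
        fprintf(stderr, "%s\n", line);
}
```

Transfers longer than 256 bytes are silently skipped in this sketch; a real logger would want to split them across lines instead.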

Consider an example transaction, such as
W  7e 01 01 16 07 7e
R 0c 00
R 01 02 00 00 01 00 50 9e 98 00 67 4b


Writes (W) from the workstation begin and end with 0x7e, which never appears within the request itself; an encoding method of some sort is used to remove any occurrences. Reads (R) from the FET device begin with a 16-bit, little-endian length. Following this length are the bytes themselves.
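
The two framing rules just described can be sketched in a few lines of C. This is only an illustration of the observations above; the function names fet_read_length() and fet_is_framed() are my own and do not come from any TI library.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode the 16-bit little-endian length that precedes each response. */
uint16_t fet_read_length(const unsigned char hdr[2])
{
    return (uint16_t)(hdr[0] | (hdr[1] << 8));
}

/* Returns 1 if buf looks like a complete request: at least two bytes,
 * 0x7e at both ends, and no 0x7e anywhere in between. */
int fet_is_framed(const unsigned char *buf, size_t n)
{
    size_t i;

    if (n < 2 || buf[0] != 0x7e || buf[n - 1] != 0x7e)
        return 0;
    for (i = 1; i + 1 < n; i++)
        if (buf[i] == 0x7e)
            return 0;
    return 1;
}
```

For the transaction above, the header "0c 00" decodes to 12, which matches the twelve bytes of the second read.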

Consider also the following Writes, all of which are fetches for memory.
x/h 0x0200
7e 0d 02 02 00 00 02 00 00 02 00 00 00 b0 17 7e
x/h 0xfc00
7e 0d 02 02 00 00 fc 00 00 02 00 00 00 c0 06 7e
x/h 0xfc02
7e 0d 02 02 00 02 fc 00 00 02 00 00 00 af 0d 7e

All addresses are found intact as little endian: "00 02" for 0x200, "00 fc" for 0xfc00, and "02 fc" for 0xfc02. That won't be true for bytes which contain illicit characters, as I'll demonstrate later. Also note that the final two bytes of each message vary drastically; these are most likely a checksum of some sort.

Regarding the checksum, there are two common possibilities. The first is a CRC16 checksum, while the second is the XOR of all transmitted bytes. The MSP430's serial bootstrap loader uses the XOR method, but it is easy to rule out here. The examples above for fetching 0xfc00 and 0xfc02 differ by only one bit outside of the checksum, yet the checksums show no resemblance, so this checksumming function must be more complicated. A solution to the checksumming problem will be presented in a later article.
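
The reasoning can be checked mechanically. Under a plain XOR checksum, each payload bit feeds exactly one checksum bit, so flipping a single payload bit should flip a single checksum bit. The sketch below counts differing bits; bit_distance() and the array names are illustrative only, and the byte values are copied from the two frames above.

```c
#include <stddef.h>

/* Count the bits that differ between two equal-length byte strings. */
unsigned bit_distance(const unsigned char *a, const unsigned char *b, size_t n)
{
    unsigned bits = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        unsigned char x = (unsigned char)(a[i] ^ b[i]);
        while (x) {
            bits += x & 1u;
            x >>= 1;
        }
    }
    return bits;
}

/* Request bodies for x/h 0xfc00 and x/h 0xfc02, without the 0x7e
 * delimiters and without the trailing two bytes. */
const unsigned char body_fc00[12] = {
    0x0d, 0x02, 0x02, 0x00, 0x00, 0xfc, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00
};
const unsigned char body_fc02[12] = {
    0x0d, 0x02, 0x02, 0x00, 0x02, 0xfc, 0x00, 0x00, 0x02, 0x00, 0x00, 0x00
};

/* The trailing two bytes of each frame, presumed to be the checksum. */
const unsigned char sum_fc00[2] = {0xc0, 0x06};
const unsigned char sum_fc02[2] = {0xaf, 0x0d};
```

Here bit_distance(body_fc00, body_fc02, 12) is 1 while bit_distance(sum_fc00, sum_fc02, 2) is 9, which a simple XOR of the bytes could not produce.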

Illicit characters are dealt with by escaping. Consider the following queries,
x/h 0xeeee
7e 0d 02 02 00 ee ee 00 00 02 00 00 00 5c ac 7e
x/h 0xee7e
7e 0d 02 02 00 7d 5e ee 00 00 02 00 00 00 c6 3c 7e
x/h 0xee7d
7e 0d 02 02 00 7d 5d ee 00 00 02 00 00 00 16 b6 7e
x/h 0xee7f
7e 0d 02 02 00 7f ee 00 00 02 00 00 00 79 bd 7e
x/h 0xee7c
7e 0d 02 02 00 7c ee 00 00 02 00 00 00 a9 37 7e


From this it can be seen that 0x7d is the escape character, and that 0x7d and 0x7e are the characters to be escaped. Each is replaced by 0x7d followed by 0x5d or 0x5e respectively: the high nybble becomes 5 while the low nybble is kept. Neighboring values such as 0x7c and 0x7f pass through untouched.
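
A minimal sketch of the escaping follows. It assumes the escaped byte is the original XORed with 0x20, as in HDLC-style byte stuffing; that rule agrees with both cases observed above (0x7e becomes 0x5e, 0x7d becomes 0x5d), but two samples cannot prove it. The names fet_escape() and fet_unescape() are hypothetical.

```c
#include <stddef.h>

#define FET_FLAG   0x7e  /* frame delimiter */
#define FET_ESCAPE 0x7d  /* escape character */

/* Escape in[0..n) into out. Returns the number of bytes written,
 * or 0 if out is too small (so empty input also returns 0). */
size_t fet_escape(const unsigned char *in, size_t n,
                  unsigned char *out, size_t outsz)
{
    size_t i, pos = 0;

    for (i = 0; i < n; i++) {
        if (in[i] == FET_FLAG || in[i] == FET_ESCAPE) {
            if (pos + 2 > outsz)
                return 0;
            out[pos++] = FET_ESCAPE;
            out[pos++] = (unsigned char)(in[i] ^ 0x20); /* assumed rule */
        } else {
            if (pos + 1 > outsz)
                return 0;
            out[pos++] = in[i];
        }
    }
    return pos;
}

/* Reverse of fet_escape(). Returns the number of bytes written,
 * or 0 on a dangling escape or if out is too small. */
size_t fet_unescape(const unsigned char *in, size_t n,
                    unsigned char *out, size_t outsz)
{
    size_t i, pos = 0;

    for (i = 0; i < n; i++) {
        unsigned char c = in[i];

        if (c == FET_ESCAPE) {
            if (i + 1 >= n)
                return 0;
            c = (unsigned char)(in[++i] ^ 0x20);
        }
        if (pos >= outsz)
            return 0;
        out[pos++] = c;
    }
    return pos;
}
```

Escaping the address bytes "7e ee" of 0xee7e yields "7d 5e ee", as in the captured frame, while "7f ee" and "7c ee" are returned unchanged.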

Performing a few more queries exposes the length field of the memory read, as queries for 2, 1, and 4 bytes yield
x/h 0xeeee
7e 0d 02 02 00 ee ee 00 00 02 00 00 00 5c ac 7e
x/b 0xeeee
7e 0d 02 02 00 ee ee 00 00 01 00 00 00 91 89 7e
x/w 0xeeee
7e 0d 02 02 00 ee ee 00 00 04 00 00 00 c6 e7 7e
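
Putting the address and length observations together, a memory-fetch request can be picked apart as sketched below. The field widths are an assumption: I treat bytes 4 to 7 as a 32-bit little-endian address and bytes 8 to 11 as a 32-bit little-endian length, which fits every frame shown so far but is not confirmed. Bytes 1 to 3 are left uninterpreted, and the struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Fields recovered from a memory-fetch request (names are illustrative). */
struct fet_read_req {
    uint32_t addr; /* assumed 32-bit little-endian, body bytes 4..7  */
    uint32_t len;  /* assumed 32-bit little-endian, body bytes 8..11 */
};

/* Decode four bytes as a little-endian 32-bit value. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* body is the unescaped request with the 0x7e delimiters removed.
 * Returns 0 on success, -1 if the body is too short. */
int fet_parse_read_req(const unsigned char *body, size_t n,
                       struct fet_read_req *req)
{
    if (n < 12)
        return -1;
    req->addr = le32(body + 4);
    req->len  = le32(body + 8);
    return 0;
}
```

Fed the body of the x/w 0xeeee request, this gives an address of 0xeeee and a length of 4.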


The first byte after the frame start is 0x0d, which is also the first byte of the response. Taking this further, a command/response code can be discovered: it is the first byte of both the encapsulated request and the response. In the following case, an examine query has the code 0x0d while the set query has a code of 0x0e.
set *0xffd0=0xdead
W 7e 0e 04 01 00 d0 ff 00 00 02 00 00 00 ad de 49 6e 7e
R 06 00
R 0e 00 00 00 9c 52
x/h 0xffd0
W 7e 0d 02 02 00 e0 ff 00 00 02 00 00 00 4d b6 7e
R 0c 00
R 0d 03 00 00 02 00 00 00 ff ff 03 b8
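
A client could use the shared code to pair responses with requests. The sketch below is only a guess at how that might look: the enum names are invented, only the two codes seen above are listed, and it ignores the possibility of an escaped command byte.

```c
#include <stddef.h>

/* The two codes observed in this article; the names are made up. */
enum fet_cmd {
    FET_CMD_EXAMINE = 0x0d,
    FET_CMD_SET     = 0x0e
};

/* req_frame is a raw request including its leading 0x7e; resp_body is
 * the response after its two-byte length prefix. Returns 1 if the
 * response carries the same code as the request, otherwise 0. */
int fet_response_matches(const unsigned char *req_frame, size_t req_n,
                         const unsigned char *resp_body, size_t resp_n)
{
    if (req_n < 2 || resp_n < 1 || req_frame[0] != 0x7e)
        return 0;
    return req_frame[1] == resp_body[0];
}
```

For the set transaction above, the request begins "7e 0e" and the response begins "0e", so the check succeeds.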


This series will be continued once the checksumming routine has been reimplemented, at which point a custom client may be written.

See this post for details on my implementation.