Tuesday, April 13, 2010

A Cursory Look at Kindle Forensics

I recently purchased a Kindle, which I have come to adore. It's one of those devices that make it hard to imagine what life was like before you purchased it. However, being the hopeless forensic geek that I am, I had to figure out what sort of forensics could be performed on the device. (No, I have no idea how I got someone to marry me. I really don't.)

I purchased the current generation Kindle with the 6" screen. This model lets the user plug the device into a computer via a USB port to interact with it. Amazon accomplishes this by exposing a 1.5GB portion of the device's storage that is visible and accessible to the user as if it were a standard USB storage device.

From the research that I have conducted so far, it appears that you can treat the Kindle as you would any other USB storage device for imaging purposes. The best way to do it is to use the USB cable that Amazon provides for connecting the Kindle to a computer. You can then write block a Kindle like you would any USB device. For my research, I used a Tableau T8 USB Forensic Bridge and was able to make the image using EnCase without any problems.

I haven't spent much time on the analysis portion of this research. However, I can report that the Kindle's USB drive shows up as a FAT32 volume that appears to have been created with mkdosfs. This makes sense: mkdosfs is a Linux formatting utility, and the Kindle runs some sort of Linux OS that we can't see via this USB capture process.
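If you want to confirm the mkdosfs\FAT32 identification yourself, the relevant strings sit at fixed offsets in a FAT32 boot sector: the OEM name at bytes 3 through 10 and the file system type label at bytes 82 through 89. Below is a minimal Python sketch of that check. It assumes you hand it the first 512 bytes of the FAT volume itself; if your image starts with a partition table, you would need to seek to the start of the partition first. The function name and the image file name in the usage comment are mine, not anything from EnCase or Amazon.

```python
import struct


def describe_fat_boot_sector(sector: bytes) -> dict:
    """Pull a few identifying fields out of a 512-byte FAT32 boot sector."""
    if len(sector) < 512:
        raise ValueError("need at least 512 bytes")
    # Bytes 3-10: OEM name, padded with spaces or NULs (e.g. "mkdosfs").
    oem_name = sector[3:11].decode("ascii", errors="replace").rstrip("\x00 ")
    # Bytes 11-12: bytes per sector (little-endian); byte 13: sectors per cluster.
    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", sector, 11)
    # Bytes 82-89: file system type label on FAT32 volumes (e.g. "FAT32   ").
    fs_type = sector[82:90].decode("ascii", errors="replace").rstrip("\x00 ")
    # Bytes 510-511: boot sector signature 0x55 0xAA.
    has_signature = sector[510:512] == b"\x55\xaa"
    return {
        "oem_name": oem_name,
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "fs_type": fs_type,
        "has_signature": has_signature,
    }


# Usage against a raw image (hypothetical file name):
# with open("kindle.dd", "rb") as f:
#     print(describe_fat_boot_sector(f.read(512)))
```

The file system type label is informational only, so treat a match as a hint rather than proof of the file system in use.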

There are some interesting artifacts of the low hanging fruit variety. For example, there is a "userannotlog" file located in the system folder. It lists the last book that I read and what my position was in the book, and it also includes clear text time stamp information that correlates with when I know I was reading the book in question. Very cool.

The "documents" folder, as you might expect, contains the actual content that I have on my Kindle. I don't have much on it right now, but each book has an .azw file which is the actual content of the book in a proprietary format and a .mdp file that...well, I don't know what it does at this point.

There is a "search indexes" folder inside the system folder (system\search indexes) that, one assumes, keeps track of searching done on the device. I bought a wine book and searched it for the word "Pinotage" (Sigh. Yes, add "wine geek" to my list of vices...), then used that word as a keyword for a search of the image...and came up with nothing eventful. There were about 20 hits on the word, but all of them sat among other words in the same alphabetical range, so nothing showed that I had searched for that word.

You'll find a lot of indexed words in the system\search indexes\Index.db file. What I'm seeing already is that there are three bytes before each word that are clearly meaningful. For example, the word "pinewood" is preceded by 0x740008. So what we have is the word "pinetorch" and then 0x740008 and then the word "pinewood". I don't know what the 0x74 means or if it's associated with the word "pinetorch" or "pinewood", but the 0x08 is the length of the pinewood entry. It's probable that this length indicator actually uses two bytes, which would make 0x0008 the bytes that indicate length. I'm seeing this behavior consistently in this index file: a word is preceded by byte(s) whose hex value correlates with the length of the word that comes after the byte(s). Interestingly, I'll see a block of words pretty close together and then one word will end with 0x7A instead of 0x74, and then there won't be any more words until a new block starts again about 900 or so bytes later. Towards the end of this file, there is a listing of the books on the Kindle and the paths to their associated files.
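The pattern described above can be tested with a small carving script. This is only a sketch built on my guess that each word is preceded by a two-byte big-endian length. It does not decide what the 0x74 or 0x7A byte means; it just reports whatever byte sits immediately before each length field so you can eyeball it. The function name and the sample bytes are made up for illustration, and because this is a heuristic it will produce false positives on arbitrary binary data.

```python
import struct


def carve_length_prefixed_words(data: bytes, max_len: int = 64):
    """Yield (offset, preceding_byte, word) for each run that looks like a
    two-byte big-endian length followed by that many ASCII letters."""
    i = 0
    while i + 2 <= len(data):
        (n,) = struct.unpack_from(">H", data, i)
        chunk = data[i + 2:i + 2 + n]
        # bytes.isalpha() is True only for non-empty, all-ASCII-letter runs.
        if 1 <= n <= max_len and len(chunk) == n and chunk.isalpha():
            preceding = data[i - 1] if i > 0 else None
            yield i, preceding, chunk.decode("ascii")
            i += 2 + n  # skip past the word we just carved
        else:
            i += 1  # no match here; slide forward one byte


# Synthetic bytes mirroring the pinetorch/pinewood example, not real Index.db data.
sample = b"\x00\x09pinetorch\x74\x00\x08pinewood\x7a"

for offset, preceding, word in carve_length_prefixed_words(sample):
    marker = "none" if preceding is None else hex(preceding)
    print(offset, marker, word)
```

Running it over a real Index.db and tallying the preceding bytes would be a quick way to see whether 0x74 and 0x7A are the only values that show up in that position.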

There is also a reader preferences file in the system\com.amazon.ebook.booklet.reader\reader.pref location. It has a clear text time stamp that appears to correlate with the last time I used the Kindle. It also declares what preferences I'm using for a dictionary, the type of justification I'm using and the last book I read.

There's a white paper in here for someone somewhere.


  1. I'm not sure what the limitations are on sharing documents with the Kindle, but I could envision it being likened to a mass storage device. Who's to say that someone couldn't store proprietary info on it, or other files that could be considered contraband? Additionally, it seems like electronic data could be transported rather inconspicuously given that, to most people, it's just a book.

    So do you prefer the Kindle to traditional books from a reader's standpoint?

  2. Thanks for the comment, Crosser!

    The AZW format has some sort of DRM baked into it to prevent sharing. What I'm *guessing* is the case is that when you order a book from Amazon, the AZW file is embedded with your Amazon account ID which only allows that ID to use the content.

    You're spot on about Kindle being essentially just another USB storage device when connected to a computer. 1.5GB can store a lot of documents.

    I do prefer Kindle eBooks to physical books. However, in some cases, like something that is going to be heavily graphic in nature, I'll still probably opt for a physical book. I'll also continue to buy the physical version of a book when I won't be able to get a Kindle version in a timely manner. For example, I'm not waiting a year to buy Harlan's next book.

  3. Interesting. I came here trying to reverse engineer 2.5's Collections, and discovered instead what a USB Write Blocker is. So I assume the Kindle software wants to write to this file userannotlog, which I don't see otherwise, and your device stops the write?

    I see that Index.db is not what I thought it was. Interesting.

    So what I'm trying to figure out is collections. The only interface I have to collections is on the Kindle, and with over 1000 pdf documents, I'm not going to go through adding them one by one.

    Kindle creates a collections.json file (in JavaScript Object Notation) that lists each collection and which items are in it. It's visible without using a USB write blocker.

    Unfortunately it uses a 40-byte hash value, and I'm having trouble figuring out where that comes from. I've tried SHA1 on the file contents, the title of the file, and the full path to the file, but it doesn't match. Any ideas?

  4. Thanks for dropping by and commenting. What write blockers do, whether they are software or hardware based, is prevent any writes from occurring to the target device while an examiner is making an image.

    I normally favor using hardware write blocking devices as a first choice, but there is a time and a place for write blocking via software methods.

    By reading what I posted, you now know all that I know when it comes to Kindle forensics. I don't think I'll find time to return to the issue in the near to mid future which is one of the reasons I created the blog post. I figured someone else might be working on it or it might inspire someone else to dig deeper.

  5. Thank you for the article. To the anonymous poster above regarding reverse engineering collections: I'm in exactly the same boat as you. I found this article about the DX which is helpful, but it claims the hash is SHA-1 and is the document path relative to /mnt/us/documents. I've tried a relative path from my USB-mounted Kindle "documents" directory, but SHA-1 does not yield the same hash. I'm still looking into how to decode it.

  6. I have filled my Kindle with many, many books, some of which I have read completely, some partially and some not at all. They are all jumbled together, making it hard to find a book that I haven't finished yet, as this involves opening them individually.
    Where is the "furthest" read location data stored and also the location of the maximum length? If I could access these in a table for all the books it would make life much easier.

  7. Further question: please could you tell me how you managed to read what is in the Index.db and other files? I have tried MS Access and am hoping to get to grips with PHP.
    I think if I can open some of the Kindle system files they may contain the data I am looking for.

  8. I wish I could answer your first question, but pretty much everything I know about Kindle forensics was contained in the blog posts that I did on this subject. I haven't had time to revisit the topic for a more extensive review.

    I did the examination using EnCase, but you might be able to do some basic work using AccessData's FTK Imager tool, which is a free download from AccessData's website.
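
On the collections.json question raised in comments 3 and 5: a 40-character value is what you get from a SHA-1 digest written out in hex, so one way to narrow things down is to hash several candidate strings for the same document and compare each against the value in collections.json. The sketch below does that. The candidate list is an assumption on my part: the /mnt/us/documents prefix comes from the claim quoted in comment 5, and the file name is hypothetical. I have not verified which candidate, if any, the Kindle actually uses.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the 40-character hex SHA-1 digest of a UTF-8 encoded string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def candidate_hashes(filename: str) -> dict:
    """Hash several guesses at what the Kindle might feed into SHA-1."""
    candidates = {
        "file name only": filename,
        "relative to USB root": "documents/" + filename,
        # Assumed on-device mount point, taken from the claim in comment 5.
        "absolute device path": "/mnt/us/documents/" + filename,
    }
    return {label: sha1_hex(path) for label, path in candidates.items()}


# Hypothetical file name; compare each digest with the entries in collections.json.
for label, digest in candidate_hashes("example.pdf").items():
    print(f"{label}: {digest}")
```

If none of the digests match, the next things I would vary are the text encoding and whether the stored value carries any extra prefix characters around the hash.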