Looking at Kindle reading habits with Python and R

I bought a Kindle in January 2014, and I love it. One of the main things I love about it is that whenever I finish a book, I can just go to Amazon and buy whatever I feel like reading. I find this vastly preferable to keeping a bunch of books on a ‘to-read’ shelf, most of which never get read, and which ironically leaves me feeling that I have nothing interesting to read.

This habit (of buying a book just as I finish the previous one) means I can do something interesting (to me anyway). I can download my Kindle purchase history, parse it and plot it to see how long I spent reading each book. Using the Amazon API, I can also look up how many pages are in each book, so I can calculate which books were the biggest page turners.
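As a rough illustration of the idea, here is a minimal Python sketch that treats the purchase date of the next book as the finish date of the previous one. The titles and dates are made-up placeholders, and `reading_spans` is a hypothetical helper written for this example, not the script used for the plots in this post.

```python
from datetime import date


def reading_spans(purchases):
    """Pair each book with the number of days until the next purchase.

    `purchases` is a list of (title, purchase_date) tuples. The most
    recent book is skipped, since no later purchase marks its end.
    """
    ordered = sorted(purchases, key=lambda p: p[1])
    spans = []
    for (title, start), (_, end) in zip(ordered, ordered[1:]):
        spans.append((title, start, end, (end - start).days))
    return spans


# Placeholder data, not a real purchase history.
purchases = [
    ("Book A", date(2014, 1, 5)),
    ("Book B", date(2014, 1, 19)),
    ("Book C", date(2014, 3, 2)),
]

for title, start, end, days in reading_spans(purchases):
    print(f"{title}: {start} to {end} ({days} days)")
```

The last book in the list never gets a span, which is why the most recent purchase would be missing from any plot built this way.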

So, without further ado, here are the results.

[Figure: timeline of the books read on my Kindle]

This is the timeline of all the books I have read on my Kindle. Books I didn’t finish (life is too short for bad books) are coloured red. Firstly, you can see that Morrissey’s autobiography is a standout slow read, and one I also didn’t finish. There are only so many weird stories about the Yorkshire moors I’m willing to read.

[Figure: books read per month]

Secondly, how many books did I read per month? Spring 2014 was slooooww (thanks, Morrissey), but I read 5 books in a couple of months.
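A per-month count like this could be sketched in Python as below. It assumes each book is counted in the month it was purchased, which may not match how the plot above was built. The dates are placeholders and `books_per_month` is a hypothetical helper.

```python
from collections import Counter
from datetime import date


def books_per_month(purchase_dates):
    """Count purchases per (year, month), as a proxy for books started."""
    counts = Counter((d.year, d.month) for d in purchase_dates)
    # Sort by (year, month) so the result reads chronologically.
    return dict(sorted(counts.items()))


# Placeholder dates, not a real purchase history.
purchase_dates = [date(2014, 1, 5), date(2014, 1, 19), date(2014, 3, 2)]

for (year, month), n in books_per_month(purchase_dates).items():
    print(f"{year}-{month:02d}: {n}")
```

Months with no purchases simply do not appear in the result, so a plot would need to fill those gaps with zeros.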

[Figure: books ordered by pages read per day]

Finally, the books I read, ordered by how many pages I read per day. I think some of the top ones here are artefacts: I remember devouring Perfume in a short period of time, but H is for Hawk was definitely a slower read. The only way I have of figuring out when I finished a book is the date of my next Kindle purchase. I don’t tend to buy books that I’m not about to read, but I guess this happened a couple of times.
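The pages-per-day ranking comes down to a division and a sort. The sketch below uses invented page counts and day spans, and `pages_per_day` is a hypothetical helper. It also shows where the artefacts come from: if the next book was bought before the current one was finished, the span is too short and the rate is inflated.

```python
def pages_per_day(pages, days):
    """Return pages read per day, or None when the span is not positive."""
    if days <= 0:
        # Two purchases on the same day give no usable reading span.
        return None
    return pages / days


# (title, page count, days until next purchase); all placeholder values.
books = [("Book A", 300, 10), ("Book B", 280, 40), ("Book C", 250, 0)]

rated = [(title, pages_per_day(pages, days)) for title, pages, days in books]
ranked = sorted(
    (r for r in rated if r[1] is not None),
    key=lambda r: r[1],
    reverse=True,
)

for title, rate in ranked:
    print(f"{title}: {rate:.1f} pages/day")
```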

How to

  1. First up is some very low-tech web scraping: go to your Amazon account, your account -> View Kindle orders, then copy and paste the text from that page into a text file. It should look like this.
  2. Open up this script and replace the /path/to/ bit with the path to the text file from step 1.
  3. Run the script.
  4. Load the data into this R script (it has to be hard-coded, unfortunately) and run it.
  5. Tada!

 
