Previously, I posted a visual of a single time series against a global heatmap, based on the World Development Indicators dataset from the World Data Bank.
So I wondered, could I display all the series from this dataset? This is no small task as it consists of over 18 million datapoints spread across 1,345 series for 248 different countries and country aggregations.
Having had my site hacked a number of times, I decided to use an in-memory database rather than worry about exposing a RESTful service to the internet. Sure, the initial load would be slow, but exploration itself would be fast.
I didn't get it right the first time: initially, it took me over a day to load the datapoints into SQLite. As Larry Wall once said, impatience is a virtue of the developer. Impatient, I tried to figure out why my insertion rate into SQLite was only 211 entries per second. Turning autocommit off and committing every 1000 insertions increased my performance by roughly 4000 times. Once that change was made, my 24-hour load time became 3 minutes.
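The batching idea can be sketched as follows, assuming Python's built-in sqlite3 module and a hypothetical four-column `datapoint` table; neither the language nor the table layout comes from the original loader. With autocommit on, every insert is its own transaction, and SQLite has to finish writing each one before moving on. Wrapping 1000 inserts in one explicit transaction pays that cost once per batch instead of once per row.

```python
import sqlite3


def create_table(conn):
    # Hypothetical table for illustration only; not the schema from this post.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS datapoint "
        "(country TEXT, series TEXT, year INTEGER, value REAL)"
    )


def load_batched(conn, rows, batch_size=1000):
    """Insert rows, committing once per batch_size rows instead of per row.

    Expects a connection opened with isolation_level=None, so that the
    BEGIN and COMMIT statements below are the only transaction control.
    """
    inserted = 0
    conn.execute("BEGIN")
    for row in rows:
        conn.execute("INSERT INTO datapoint VALUES (?, ?, ?, ?)", row)
        inserted += 1
        if inserted % batch_size == 0:
            # Close the current batch and start the next one.
            conn.execute("COMMIT")
            conn.execute("BEGIN")
    # Commit whatever is left in the final, possibly partial, batch.
    conn.execute("COMMIT")
    return inserted
```

To try it, open a connection with `sqlite3.connect("wdi.db", isolation_level=None)`, call `create_table(conn)`, and pass any iterable of `(country, series, year, value)` tuples to `load_batched`. The speedup you see will depend on the disk and on the batch size.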
The schema I settled on looks like:
On my local machine, it takes around 8 seconds to initialize the application. From there, I can explore all 1,345 development statistics across time in a Gapminder-like setting.
For a limited time, I will keep this link active so that others may also give it a try:
However, because the SQLite database is over 400 MB (mostly due to indexing), I will likely take this offline in the near future.
-rwxrwxr-x+ 1 Patrick Patrick 423095296 Mar 6 19:49 wdi.db
Do not click this link from a mobile device! It will surely fail. Even on a desktop, expect it to take a while to load. While the banner “Please wait, the data is loading…” is displayed, THE DATA REALLY IS STILL LOADING.