Sunday, June 9, 2013

Some results!

OK, admittedly, they're not very exciting results, and they don't tell me a whole lot that I didn't already know, but I've finally got my data into a kind of shape that allows me to analyze it.

My CAM structured database now has about 12 million Event records in it (12,383,032, to be precise, drawn from 18,764,699 lines of log files). From that, I have created a graph that shows how long each resource is used (specifically, the number of downloads on each day after it is uploaded).

Average downloads per item by day after posting.
What it's showing us is that the vast bulk of the downloads of each resource happen in the first week after it's posted (see that red spike at the start there - that's an average of 50 downloads for each resource on the day it's uploaded). In the second week there's an average of one download per day, and after three weeks the downloads pretty much drop to zero.

This is to be expected on a course built around a weekly timetable like this. Once the week is over, students (a) no longer need that resource, and (b) won't see it through the interface any more, since the links to the resource are in the previous week's section of the site, which they are no longer visiting.
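For anyone wanting to reproduce this kind of curve, here is a rough sketch of how the per-day averages could be computed. It is not the code behind the graph above: the function name, the resource ids and the shape of the input (a list of download events plus a dictionary of posting dates) are all made up for illustration, and it assumes the average is taken over every resource, including ones that had no downloads on a given day.

```python
from collections import Counter
from datetime import date


def average_downloads_by_day(events, posted_on):
    """Average downloads per resource for each day offset since posting.

    events    -- iterable of (resource_id, download_date) pairs (hypothetical shape)
    posted_on -- dict mapping resource_id to the date it was posted (hypothetical)
    """
    totals = Counter()
    for resource_id, downloaded in events:
        # Day 0 is the day the resource was posted.
        offset = (downloaded - posted_on[resource_id]).days
        if offset >= 0:
            totals[offset] += 1
    # Assumption: divide by the total number of resources, so a resource
    # with no downloads on a given day still counts towards the average.
    n_resources = len(posted_on)
    return {day: count / n_resources for day, count in sorted(totals.items())}


# Tiny made-up example: two resources, four download events.
events = [
    ("notes-wk1", date(2013, 1, 7)),
    ("notes-wk1", date(2013, 1, 7)),
    ("notes-wk1", date(2013, 1, 9)),
    ("notes-wk2", date(2013, 1, 14)),
]
posted_on = {
    "notes-wk1": date(2013, 1, 7),
    "notes-wk2": date(2013, 1, 14),
}
averages = average_downloads_by_day(events, posted_on)
print(averages)
```

With real data, the events would come from the Event records and the posting dates from wherever the upload time of each resource is stored.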

There is more detail in the following graph, which shows the download stats for all the student resources (each as a separate line with randomly generated colours):
Downloads per item by day after posting.

It's a lot messier and harder to read, but it's interesting in that there are two specific resources that stand out - one (represented by a red line) became popular at about 125 days after it was uploaded, and remained popular for about six weeks; and one (represented by an aqua-ish* line) became popular about 300 days after it was uploaded, and remained popular for about three weeks. I'm not sure what was special about these - whether they became relevant to the curriculum for some reason, or whether it was a UI change that I made that threw them into the limelight, prompting students to download them. But I think they warrant further investigation.
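One way to pick out late bloomers like these without squinting at the graph would be to flag any resource that still gets a meaningful number of downloads long after posting. The sketch below is only an illustration: the function name, the resource ids, the sample counts and both cut-offs (more than 21 days after posting, at least 5 downloads in a day) are assumptions, not values taken from the real data.

```python
def late_spikes(daily_counts, after_day=21, threshold=5):
    """Return resources with busy days well after posting.

    daily_counts -- dict mapping resource_id to {day_offset: downloads}
    after_day    -- only look at days later than this offset (assumed cut-off)
    threshold    -- minimum downloads in a day to count as busy (assumed cut-off)

    The result maps each flagged resource to the first and last busy day offset.
    """
    flagged = {}
    for resource_id, counts in daily_counts.items():
        busy_days = sorted(
            day for day, downloads in counts.items()
            if day > after_day and downloads >= threshold
        )
        if busy_days:
            flagged[resource_id] = (busy_days[0], busy_days[-1])
    return flagged


# Made-up counts: "res-a" has a second burst of interest, "res-b" does not.
daily_counts = {
    "res-a": {0: 50, 1: 12, 125: 9, 140: 11, 166: 7},
    "res-b": {0: 48, 3: 6, 30: 1},
}
flagged = late_spikes(daily_counts)
print(flagged)
```

The first and last busy days give a rough idea of when a resource became popular again and for how long, which is a starting point for checking what changed on the site or in the curriculum around then.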


* The problem with randomly generated colours is that they don't lend themselves to easy naming. XKCD proved it.