I’m playing around with TimescaleDB to see if it can replace InfluxDB, and so far the results are very encouraging. To see how well this new database performs, I queried all listens and plotted them against the time of day at which each listen was recorded. I got this graph:
All of these times are in UTC, since we do not have timezones available for where each listen occurred. I’m struck by how much this data looks like a sine wave.
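For anyone who wants to try something similar, here is a minimal sketch of the bucketing step in Python. It is not the query I ran against the database. It assumes the listens are already available as Unix timestamps in seconds, and the function name `listens_per_time_of_day` and the `bucket_minutes` parameter are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone


def listens_per_time_of_day(timestamps, bucket_minutes=10):
    """Count listens per UTC time-of-day bucket.

    `timestamps` is assumed to be an iterable of Unix timestamps in seconds.
    Returns a Counter keyed by the bucket's starting minute of the day
    (0 for 00:00 UTC, 800 for 13:20 UTC, and so on).
    """
    counts = Counter()
    for ts in timestamps:
        t = datetime.fromtimestamp(ts, tz=timezone.utc)
        minute_of_day = t.hour * 60 + t.minute
        # Round down to the start of the bucket this minute falls into.
        bucket_start = (minute_of_day // bucket_minutes) * bucket_minutes
        counts[bucket_start] += 1
    return counts


if __name__ == "__main__":
    sample = [0, 600, 86405, 13 * 3600 + 25 * 60]
    for bucket, count in sorted(listens_per_time_of_day(sample).items()):
        print(f"{bucket // 60:02d}:{bucket % 60:02d} UTC  {count}")
```

Because the date is discarded and only the time of day is kept, listens from every day pile up in the same buckets, which is what produces the daily wave shape.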
Anyways, we’re working on stats for LB, so I thought I would share this random graph.
I can try to look at that. There was one timeslot that has some 200,000 more listens than any other, which is clearly a data anomaly, so I’ll be looking into what caused it.
Thankfully, the potential move to TimescaleDB (from InfluxDB) gives me the chance to do some cleanup on the data. We will already need to sort all the listens before the import into TimescaleDB, which makes it trivial to remove duplicates and the slightly fuzzy duplicates that people have gotten from importing their data from Last.fm.
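To illustrate why sorting makes this cheap, here is a rough sketch in Python, not the actual import code. It assumes that a "fuzzy" duplicate means the same user and track with timestamps only a couple of seconds apart, and that the listens are sorted by user, track, and timestamp. The `Listen` tuple, the `dedupe_sorted` function, and the `fuzz_seconds` threshold are all hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Listen(NamedTuple):
    # Hypothetical minimal shape of a listen, for illustration only.
    user: str
    listened_at: int  # Unix timestamp in seconds
    track: str


def dedupe_sorted(listens: Iterable[Listen], fuzz_seconds: int = 2) -> Iterator[Listen]:
    """Drop exact and near duplicates from a sorted stream of listens.

    The input must be sorted by (user, track, listened_at), so that
    duplicates are adjacent and a single pass is enough. A listen is
    dropped when it has the same user and track as the last listen that
    was kept and is at most `fuzz_seconds` later.
    """
    last_kept: Optional[Listen] = None
    for listen in listens:
        if (
            last_kept is not None
            and listen.user == last_kept.user
            and listen.track == last_kept.track
            and listen.listened_at - last_kept.listened_at <= fuzz_seconds
        ):
            continue  # exact or fuzzy duplicate of the last kept listen
        yield listen
        last_kept = listen


def sort_key(listen: Listen):
    # Sort order that puts potential duplicates next to each other.
    return (listen.user, listen.track, listen.listened_at)
```

Comparing against the last kept listen, rather than the last seen one, keeps a long run of closely spaced listens from collapsing into a single entry. The right threshold would have to come from looking at the real data.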