We’ve been working on moving our listen storage from InfluxDB to TimescaleDB, and in the process we’ve had a chance to clean up the duplicate listens that have been plaguing our production database. Since the beginning of LB, three separate problems have caused duplicates in the database, and we think we can avoid them in the future with TimescaleDB.
IMPORTANT notes before you look at the DB:
- We are not live updating this database.
- There are quite a few listens missing from the last few weeks – this is a known issue, so please do not report it as an error.
I’ve put up a test version of this database here:
To see an example of the listens we’ve removed, compare these two links:
Current production DB:
Test DB with duplicates removed:
If you have duplicated listens in your stream, please take a look at your profile and check that the duplicates are gone. Also check whether any other listens that should be there are missing (apart from the listens from the last 4 weeks). If you find something that does not look right, please open a ticket in the ListenBrainz project.
Once we resolve any issues around the duplicates, we’ll move to put the Timescale DB into production.