Introducing the ListenBrainz Session Viewer

The ListenBrainz Session Viewer (Dataset hoster: sessions-viewer) lets you view a user's listens for a given time period, grouped into sessions.


To build recommendations and other music discovery features, we need multiple sources of data. One possible approach we are currently investigating is assigning similarity scores to recordings based on user listening histories.

To do this, we intend to group users' listens into "sessions". All the recordings for which we have listens in a particular session are considered to be similar. Then, we aggregate the number of times various recording pairs occur across users and sessions. The aggregated count represents the similarity score of the recording pair.
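To illustrate the aggregation step, here is a minimal sketch in Python. The function name and the recording identifiers are illustrative only, not the actual ListenBrainz implementation; it simply counts how often each pair of recordings co-occurs in a session.

```python
from collections import Counter
from itertools import combinations


def pair_counts(sessions):
    """Count how often each pair of recordings co-occurs in a session.

    `sessions` is an iterable of lists of recording identifiers (e.g. MBIDs).
    The aggregated count for a pair serves as its raw similarity score.
    """
    counts = Counter()
    for session in sessions:
        # each distinct pair within a session counts once; sorting makes
        # the pair key order-independent
        for a, b in combinations(sorted(set(session)), 2):
            counts[(a, b)] += 1
    return counts


sessions = [
    ["rec-a", "rec-b", "rec-c"],
    ["rec-a", "rec-b"],
    ["rec-b", "rec-c"],
]
print(pair_counts(sessions)[("rec-a", "rec-b")])  # 2
```

The refinements mentioned below (distance weighting, skip detection) would slot in where the pair is counted.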

There are many possible variations on this algorithm to refine the similarity scores. For instance, we could weight the scores, giving higher preference to recordings that were listened to close to each other in a session than to those listened to far apart. We could also try to detect when a recording was skipped and exclude it from the session, and so on.

However, actually calculating the similarities is still some way off. For now, we have built the sessions-viewer tool to explore user listening sessions, discover insights, and help us work on the similarity calculation algorithm.

We would like you to use the tool and share your thoughts on it. We are also looking for feedback on use cases for this data, and feel free to share any other suggestions you have about calculating recording similarities. However, please keep in mind that all of this is currently in an exploratory phase.

Description and Usage

Listens are considered to belong to the same session if the time difference between any two consecutive listens in that set is less than a given threshold. The difference takes the duration of the listened recording into account. If the duration is unavailable in both MusicBrainz and the listen metadata, it is assumed to be 180s.
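The rule above can be sketched as follows. This is a simplified illustration under stated assumptions, not the viewer's actual code: listens are (timestamp, duration) tuples sorted by timestamp, the gap is measured from the end of one listen (start plus duration, defaulting to 180s) to the start of the next, and a gap larger than the threshold starts a new session.

```python
DEFAULT_DURATION = 180  # seconds, assumed when a recording's duration is unknown


def split_into_sessions(listens, threshold):
    """Group listens into sessions.

    `listens` is a list of (listened_at, duration) tuples sorted by
    listened_at; `duration` may be None. A new session starts when the gap
    between the end of one listen and the start of the next exceeds
    `threshold` seconds.
    """
    sessions = []
    current = []
    prev_end = None
    for listened_at, duration in listens:
        end = listened_at + (duration if duration is not None else DEFAULT_DURATION)
        if prev_end is not None and listened_at - prev_end > threshold:
            # gap too large: close the current session, start a new one
            sessions.append(current)
            current = []
        current.append((listened_at, duration))
        prev_end = end
    if current:
        sessions.append(current)
    return sessions


# listens at t=0 (200s long), t=250 (unknown duration), t=1000 (100s long)
print(len(split_into_sessions([(0, 200), (250, None), (1000, 100)], threshold=300)))  # 2
```

With a threshold of 300s, the 50s gap after the first listen keeps the first two listens together, while the 570s gap after the second (using the 180s default duration) starts a new session.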

The viewer takes 4 inputs.

  1. user_name: the ListenBrainz user name of the user you want to view sessions for
  2. from_ts and to_ts: timestamps specifying the range of time in which listens are shown. These fields accept a UNIX timestamp. You can calculate these for the time period of your choice using any website or tool of your choice.
    The maximum allowed range for from_ts and to_ts is currently 30 days. If the user-entered range spans more than 30 days, it will automatically be clipped to 30 days.
  3. threshold: the gap in seconds between two listens beyond which a new session begins.
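If you prefer to compute from_ts and to_ts yourself rather than use a website, a few lines of Python do it. The dates here are just an example; note that timestamp() needs a timezone-aware datetime to be unambiguous.

```python
from datetime import datetime, timedelta, timezone

# from_ts/to_ts for the 7 days ending 2022-07-01 00:00 UTC (example dates)
to_dt = datetime(2022, 7, 1, tzinfo=timezone.utc)
from_dt = to_dt - timedelta(days=7)

from_ts = int(from_dt.timestamp())
to_ts = int(to_dt.timestamp())
print(from_ts, to_ts)  # 1656028800 1656633600
```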

Example Usage:

lucifer’s listen session example

You can also load a session for playing on the ListenBrainz website (with some caveats) by clicking on the Open as Playlist button next to it.


While I appreciate the work behind this, I think the design is inherently flawed, the same as the Spotify model…

People are not necessarily listening to related music in the same session. That’s an assumed listening behaviour used to retrieve data, which can in no way be verified to be true.
I don’t see how this approach is in any way better than AcousticBrainz’s weakness; it’s the same. There is a degree of uncertainty and it’s unknown… one of the reasons to shut it down (?)

Then we have bias… many times listening sessions may consist of an entire album. There is no useful data to gather there (?) Neither do all tracks from an album necessarily have the same genre/style, nor is it a surprise that the tracks are related in some way.

I think the results will be pretty predictable, with popular things showing up in listening sessions and predefined listening behaviours appearing again and again. The same way Spotify recommends more popular music first, so people listen to it more and then it gets more popular.

I get that you are trying to find a new approach to music recommendations and similarity… but I don’t think user listens should be the basis for that at all. That will make cross-cultural linking almost a curiosity, among other biases related to the way people listen to music (inherited from current models of music consumption). Really, machine models, genre classification and multiple points of data are the way to go. If you add user listens on top of that, then great. Scrapping the music analysis from the equation will not bring anything new to the table.


Is it even useful to add user behaviour to gather music similarity data? There must be more people like me who like - or even need - significant contrast during one listening session. After listening to one of the nine symphonies of Beethoven, my next choice could be an album by Rammstein (Industrial Metal) and after that some songs by Frank Sinatra. There is no similarity in this music but a lot of contrast. To get music similar to Rammstein’s Industrial Metal, a look into the genres of Metal should be the way to go.


Hello :slight_smile:

This is exactly how I imagined the similarity creation between tracks. I think that by mixing similarities with tags and statistical analyses of the track names, it could even be used to guess or create genre/mood classifications o/

It’s very clever to consider songs that are listened to close together to be similar, since it also avoids big sessions creating bad relations. Maybe you could define a threshold after which songs aren’t linked together anymore, even if they are in the same session?

But it doesn’t matter if it’s not true. We only need a few people to listen to similar tracks in a session. Because the similarities will be calculated statistically, tracks that are often listened to together will be linked even if they are also listened to in totally different contexts. I agree there might be biases towards the most famous songs, but it’s not certain how important they will be, and maybe they can be resolved:

  • Applying a negative coefficient to top songs (they need to match a lot more to be considered similar, but I don’t know how this will be calculated, so I’m not sure it’s the way to go).
  • Or maybe we could make “underground” playlists which represent songs that are similar but don’t have a lot of listens.
  • Or we could shuffle the generated playlist ^^
    But in any case I think this has to be tried.
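One way to read the "negative coefficient to top songs" idea is to normalise each pair's co-occurrence count by the overall popularity of the two recordings. This is a purely illustrative sketch: the function name, the input shapes, and the choice of geometric-mean normalisation are all assumptions, not anything ListenBrainz has committed to.

```python
def discounted_scores(pair_counts, listen_counts):
    """Downweight pairs involving very popular recordings.

    `pair_counts` maps (rec_a, rec_b) to a raw co-occurrence count;
    `listen_counts` maps each recording to its total listen count.
    Dividing by the geometric mean of the two popularities means hugely
    popular tracks need far more co-occurrences to score highly.
    """
    return {
        (a, b): count / (listen_counts[a] * listen_counts[b]) ** 0.5
        for (a, b), count in pair_counts.items()
    }


scores = discounted_scores(
    {("rec-a", "rec-b"): 4},
    {"rec-a": 4, "rec-b": 4},
)
print(scores[("rec-a", "rec-b")])  # 1.0
```

Under this scheme a niche pair heard together 4 times out of 4 listens each scores far higher than a chart-topping pair heard together 4 times out of thousands of listens.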

The same way Spotify recommends more popular music first, so people listen to it more and then it gets more popular.

This can also be explained by their economic model…

I agree acoustic data could be useful, but it’s not the subject here, and this wonderful team saw it wasn’t worth the work for now, and I trust them o/ But if you want to work on this, I suppose the data is still available :slight_smile:

I’m so excited to see this happen and what it will show us o/
Maybe we can ask Glenn McDonald (from Spotify) for some advice haha
