Accuracy / explanation for AcousticBrainz entry


Could someone please explain the meaning of this entry in AcousticBrainz:
Having listened to the song and watched the video, I don't understand how it ends up with:
ISMIR04 Rhythm: "ChaChaCha"
Dortmund model: "electronic"
Party: "party"
Sad: "not sad"

Any idea how these values are calculated?


Yes, for a lot of my music this data is as far off as in this example. From my limited understanding as a user without any deeper audio analysis knowledge, the algorithms used to get the data were trained on a limited data set. That's especially clear with the genres, where several different models are applied to deduce the genre; see also the blog entry and the comments there.
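To illustrate what "trained on a limited data set" can lead to, here is a toy sketch in Python. It is not how Essentia or AcousticBrainz actually compute these values: the two features, all the numbers, and the nearest-centroid approach are my own assumptions, made up purely for illustration. The point is only that a classifier which has seen nothing but a few ballroom rhythms has to answer with one of those rhythms, whatever song you give it.

```python
import math

# Hypothetical training set: (tempo in BPM, "brightness" score) per known class.
# All values are invented for illustration and come from no real model.
TRAINING = {
    "ChaChaCha": [(120.0, 0.60), (124.0, 0.70), (128.0, 0.65)],
    "Waltz": [(84.0, 0.30), (88.0, 0.35), (92.0, 0.25)],
    "Quickstep": [(200.0, 0.50), (204.0, 0.55), (208.0, 0.60)],
}


def centroid(points):
    """Average the feature vectors of one class."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


CENTROIDS = {label: centroid(points) for label, points in TRAINING.items()}


def classify(features):
    """Return the nearest known class and a softmax-style confidence."""
    distances = {label: math.dist(features, c) for label, c in CENTROIDS.items()}
    nearest = min(distances, key=distances.get)
    d_min = distances[nearest]
    # Shift by the smallest distance so the exponentials cannot all underflow.
    weights = {label: math.exp(-(d - d_min)) for label, d in distances.items()}
    confidence = weights[nearest] / sum(weights.values())
    return nearest, confidence


# A song that is nothing like ballroom music, but happens to sit near 126 BPM.
label, confidence = classify((126.0, 0.9))
print(label, round(confidence, 3))
```

The song gets labelled "ChaChaCha" with a high confidence simply because its tempo is closest to that class. The model has no way to say "none of the above", so a confident label is not evidence that the label makes sense.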

As I understand it, one of the goals of AcousticBrainz is to apply the algorithms and models to a larger data set, so that researchers and other interested parties can analyse the results and improve upon them.

The documentation of the Essentia toolkit also provides some background.

I hope this is about right, but the people involved in AcousticBrainz can surely give a more in-depth answer and correct me 🙂