DJ promo releases

I thought this as well, but if you look above, this is proven untrue. If it is specifically meant to “produce the same AcoustID for lossy audio”, it does not do this. The release noted in that portion is proof of this.

Honestly, I like how AcoustID is working, now that I am testing it myself. It is quite nice, and I would actually like a GUI tool that generates this ID locally, so I can generate these myself. I know there is a fingerprinting GUI, but it only generates fingerprints to submit, not to save locally. I might need to create this myself.
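
For what it is worth, the Chromaprint project ships a command-line tool, fpcalc, that computes a fingerprint locally without submitting anything. Note that it produces the fingerprint, not the AcoustID itself; as far as I understand, the AcoustID is assigned by the AcoustID server when a fingerprint is looked up or submitted. A minimal sketch, assuming fpcalc is installed and on the PATH (the helper names here are hypothetical, not part of any library):

```python
import json
import subprocess
import sys


def parse_fpcalc_json(text):
    """Parse the JSON printed by `fpcalc -json` into (duration, fingerprint)."""
    data = json.loads(text)
    return float(data["duration"]), data["fingerprint"]


def local_fingerprint(path):
    """Run fpcalc on one audio file and return (duration_seconds, fingerprint)."""
    # Assumption: the fpcalc binary from Chromaprint is available on the PATH.
    result = subprocess.run(
        ["fpcalc", "-json", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_fpcalc_json(result.stdout)


if __name__ == "__main__":
    # Print one line per file given on the command line.
    for audio_path in sys.argv[1:]:
        duration, fingerprint = local_fingerprint(audio_path)
        print(f"{audio_path}\t{duration:.1f}s\t{fingerprint[:40]}...")
```

Saving the printed lines to a text file would give a local record of the fingerprints for a WAV and its MP3 side by side.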

I mean not to argue, I am simply discussing what I am finding in the process of trying to add these releases. What I have discovered is not at all what I was expecting and understood to be true.

I am still trying to think and wrap my head around all this, so any input is welcome. Just because I am generating results does not mean that these results are “scientific”. I am trying now to determine what exactly causes the fingerprinting process to see a WAV as different from an MP3. The files in the initial post are both DJ supplied, with the WAV used to make the MP3. The MP3 is of reasonable quality, only using the soft 16kHz filter. If it used the hard filter, I would be more confident as to why there is a difference. I know that @IvanDobsky commented above that when he looked, the two fingerprints appeared nearly identical, with only small variation. If I recall correctly, the artist who produced this release uses Serato, so quality of product should be there.

Again, I mean no disrespect here. I am simply doing what I said I was going to do, try to add some of these releases. Given that, I can only share what I see in the process.

I will be using the following parameters for MP3 creation from WAV, using LAME encoder, next:

-m j -V 4 -q 3 --lowpass 20.5

That should produce a good sample for testing.
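
To make the test encodes repeatable, the settings above could be wrapped in a small script. This is only a sketch: it assumes the lame binary is on the PATH and spells the low-pass option in LAME's long form, `--lowpass` (frequency in kHz). The function names are hypothetical.

```python
import subprocess


def lame_command(wav_path, mp3_path):
    """Build the LAME command line for the test encode described above."""
    return [
        "lame",
        "-m", "j",            # joint stereo
        "-V", "4",            # VBR quality level 4
        "-q", "3",            # algorithm quality 3
        "--lowpass", "20.5",  # low-pass filter at 20.5 kHz
        wav_path,
        mp3_path,
    ]


def encode(wav_path, mp3_path):
    """Encode one WAV file to MP3; raises CalledProcessError if lame fails."""
    subprocess.run(lame_command(wav_path, mp3_path), check=True)
```

Calling `encode("source.wav", "test.mp3")` with real paths would then produce the same kind of MP3 every time, so any fingerprint difference can be tied to these settings rather than to an unknown encoder setup.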

It does in many cases. There is always that threshold where things start to become just different enough to be considered different. But that’s exactly why you can attach multiple AcoustIDs to a single MB recording.

And you can’t really say each different AcoustID gets a different MB recording if it is only the format that’s different. That does not work with all the music people have ripped and try to identify. If I have a CD and I rip this as 128kbps MP3, this might produce a different AcoustID than a higher quality rip. But it still needs to be associated with the recording of that CD of course.


I 100% agree with your statement. This is why I am discussing what I am seeing and finding, and thinking it through. I also want to look at using a CD, 16 bit FLAC and a 24 bit FLAC as sources for the tests. This obviously is after confirming that the three sources are proper and not upsampled.

I am seeing something that defies what I was told and used as my understanding of the workings of this. It seems that you also have the same understanding as I did at the start. Something in that original understanding is not correct though. Comparing a 320 MP3 to a WAV should not differentiate, if the assumption is correct. However, I have proof that it does and can. Now, my issue is how and why.

@outsidecontext If you do not mind, please have a look at my post on this above. I am also happy to share the files of the release. I only request that the sharing be private, it is not up to me whether to share public or not, so I err on safe. This comparison was not like the 128 MP3 to lossless as you describe, I also would expect a difference there.

For what it is worth, these are the versions I am using for test:

  1. LAME 3.100
  2. FLAC 1.3.4
  3. M4A - hard to say exactly, using the latest QAAC with CoreAudio from approx 1 year ago.

The others are sort of irrelevant, OPUS is not common for non web streaming, so that is out of scope.

EDIT: I wanted to add some other thoughts…

  • The encoder settings are mastering steps. They have great impact on the result. The settings are never a “one size fits all”, if you want the best you can get.
  • There is a (or can be a) tonal difference between the MP3 and M4A compression. I wonder if there is impact on AcoustID due to this. I am unsure.
  • Using CoreAudio to create M4A files will produce a different result than using say libfdk_aac. They are different M4A files, no different than MP3 vs M4A.
  • FLAC, while lossless, is not all the same. It depends on the source, and this applies to WAV as well. I can make a lossless at any point in the process, and lossless means only lossless to the source. Thus, the source is a critical component, which often causes a difference in a 16 and 24 bit FLAC.
  • I have seen CDs that are not true 16 bit audio, so they actually differ from a true 16 bit FLAC.
  • Editing iterations also have impact. If proper head room is not there, lossless starts to create loss in its result. This is a result of poor mastering, or a result of a limited source.
  • For most of us, the human ear only hears up to approx 16kHz, which is the logic of the MP3 cutoff. However, what is not considered is the result of not having the higher cutoff. Simply chopping the audio at a frequency is nasty and can be heard, thus the need for head room. I would expect this to also impact the AcoustID.

For those who may not follow along with my words, I wanted to find a good picture of this issue, here it is:

You can see how head room (dB) relates to the source, i.e. 16 vs 24 bit.

Noting… every iteration removes head room. It should also make clear that a sound difference can in fact be heard over a speaker between 16 and 24 bit audio.


It’s definitely the exception, not the rule. But as I wrote above it’s all mathematics, and given enough difference you can of course reach the point where the difference is just beyond the threshold for two fingerprints to be considered different enough to get different AcoustIDs.

It can very likely also depend on the kind of audio. The lossy compression might introduce more artifacts in some audio than in other.
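
To illustrate the “it’s all mathematics” part: a raw Chromaprint fingerprint is a sequence of 32-bit integers, and one simple way to see how far apart two of them are is to count the differing bits. The sketch below does only that. It is not the matching logic the AcoustID server actually uses (which, as far as I know, also has to align the two fingerprints in time and applies its own thresholds), and the function name is hypothetical.

```python
def bit_error_rate(fp_a, fp_b):
    """Fraction of differing bits between two raw fingerprints.

    Each fingerprint is a sequence of 32-bit integers. Only the
    overlapping part is compared, with no time alignment.
    """
    length = min(len(fp_a), len(fp_b))
    if length == 0:
        raise ValueError("both fingerprints must be non-empty")
    differing = sum(
        # Mask to 32 bits so signed input values are handled as well.
        bin((a ^ b) & 0xFFFFFFFF).count("1")
        for a, b in zip(fp_a, fp_b)
    )
    return differing / (32 * length)
```

A result near 0.0 means the two fingerprints are almost bit-identical; two files from the same source can still land on different AcoustIDs once this kind of difference crosses whatever threshold the server applies.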

I took a couple of albums and tested. The majority of my music collection is sourced from CD and I have it then both as a FLAC file and a lossy file created from the FLAC. Lossy files are mostly MP3, some OGG. There are occasionally some lossy files I encoded ages ago with lower bitrate that I never bothered to re-encode in higher quality.

The majority of examples I tried produced the same AcoustID for the lossy and lossless file (both for OGG and MP3). I had two albums that were both encoded as OGG files at only 160 kbps. There, for 5 of 24 tracks, I got different AcoustIDs.

One example is https://musicbrainz.org/recording/251aeb66-069f-4181-be78-9656ac9c716e . The AcoustIDs are:

FLAC: Track "08ec24a1-5259-491b-a302-592b14b14851" | AcoustID
OGG: Track "b9aa51ef-6435-4a4c-8ccb-57ba12bc2af4" | AcoustID

Comparing the two most used fingerprints for each of those AcoustIDs shows they are still really similar: Compare fingerprints #37860991 and #79994161 | AcoustID

Another one https://musicbrainz.org/recording/60e94685-0481-4d3d-bd84-11c389d9b2a5 :

FLAC: Track "abc2c313-0230-47a1-9d1e-e1d75ac7de95" | AcoustID
OGG: Track "9cf386ac-8977-4829-ae50-70f52639bab0" | AcoustID
Comparison: Compare fingerprints #43461792 and #10544809 | AcoustID

But fingerprint lookup still worked for both the OGG and FLAC file because both AcoustIDs are linked to the relevant recording. That means there is the automatic, algorithmic way of deciding whether two audio can be considered the same recording, but given enough difference due to mastering / encoding this can fail. The linking to MusicBrainz recordings as a manual step can take care of that.
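
As a rough picture of why the lookup still works, think of the links as a many-to-many table between AcoustIDs and MusicBrainz recordings. The sketch below uses the IDs from the first example above; the data structure and function names are purely illustrative and not how the AcoustID database is actually implemented.

```python
from collections import defaultdict

# AcoustID -> set of linked MusicBrainz recording MBIDs (illustrative only).
links = defaultdict(set)


def link(acoustid, recording_mbid):
    """Record that an AcoustID has been linked to a recording."""
    links[acoustid].add(recording_mbid)


def lookup(acoustid):
    """Return the recordings linked to an AcoustID, sorted for stable output."""
    return sorted(links.get(acoustid, ()))


RECORDING = "251aeb66-069f-4181-be78-9656ac9c716e"
FLAC_ACOUSTID = "08ec24a1-5259-491b-a302-592b14b14851"
OGG_ACOUSTID = "b9aa51ef-6435-4a4c-8ccb-57ba12bc2af4"

link(FLAC_ACOUSTID, RECORDING)
link(OGG_ACOUSTID, RECORDING)

# Two different AcoustIDs, but both resolve to the same recording.
print(lookup(FLAC_ACOUSTID) == lookup(OGG_ACOUSTID))
```

So a file that fingerprints to either AcoustID is still identified as the same recording, as long as both links exist.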

I think my point just is that if you want to use AcoustIDs to tell apart different encodings of the same source audio you are looking at the wrong tool, because it was specifically designed to not do this. If it were overly sensitive to small encoder changes and considered each differently encoded file a different recording, it would stop being useful for its intended use (identifying a user’s music collection).


Absolutely, this is not the thread for that discussion though. My example used MP3 and WAV, which are both basically standard. Might I share the files I have with you? I respect and appreciate the opinions others can offer, as the intent is not to be right or wrong, but to reach an acceptable conclusion.

Yes, I agree. However, if I have 10,000 numerical examples, changing one makes a different result, as mathematics is not forgiving.

I need to look at your examples, which I will do. I am not sure who uses OGG anymore, but I can generate and test them all the same.

Thank you kindly for providing such a detailed response. I would appreciate if you, and others, might help me identify the core “issue”.


I am still unclear on what your issue is, sorry!

Multiple AcoustIDs being created for the same track don’t harm the goal of matching your songs to a recording, as long as they are all attached to that recording in MB.

If AcoustIDs are matched to the wrong recording, then they should be un-matched.


It is all good friend. I am in a private chat now trying to sort this out. Again, I mean no ill will, I am trying to make sense out of what is in front of me. The issue is that what I (and it also seems you) have understood is not exactly correct. The AcoustID system is quite nice, so there must be some variable in here that I or others are missing.

What I understand is: a recording can have multiple AcoustIDs, and anything that changes a waveform (if it goes outside certain thresholds) can cause a new AcoustID, sometimes including transcoding a file.

This seems correct to me?

Yes, seems correct. I was not under the impression that a lossy file derived from a lossless file could cause a change. Especially if that process is done properly, meaning that the lossy file was made with top quality encoder settings.

It was a shock to me to see a WAV and an MP3, provided to me directly, generate different IDs. My question here is why? What is it that caused this? What I can say at this time is that the point can be proven with real files.

I have a few ideas in my mind, things such as the type of audio. It is well possible that the compression of certain types of audio might change the waveform enough to cause this. For me personally, I see a major difference. The MP3, even with top settings, does not produce the same result (as measured) as the source lossless. The M4A however, can produce this under specific criteria, mainly the use of QAAC and a recent CoreAudio back. Opus is also capable, again as it relates to the waveform itself, the truncation, loss from compression, etc.

Please see I have no interest in a hypothetical debate. I have now provided the files in question for another set of eyes to see and review. It is very possible I am missing something, or have made some sort of error. All I care about here is to understand what I have in front of me, and why the end result differs from what I expected.

It is also worth noting… there is major compression here. The WAV file is 80MB, the MP3 is 9.6MB. That is a lot of loss of data. The duration is 3 minutes and 46 seconds for reference.
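
For anyone who wants those sizes as bitrates, the arithmetic is simple. The sketch below assumes 1 MB = 1,000,000 bytes and treats the whole file (headers, tags and any embedded artwork included) as audio, so the figures are averages and only approximate.

```python
def average_kbps(size_mb, seconds):
    """Average bitrate in kbit/s for a file of size_mb megabytes (10**6 bytes)."""
    return size_mb * 1_000_000 * 8 / seconds / 1000


DURATION_SECONDS = 3 * 60 + 46  # 3 minutes 46 seconds

wav_kbps = average_kbps(80, DURATION_SECONDS)
mp3_kbps = average_kbps(9.6, DURATION_SECONDS)

print(f"WAV: about {wav_kbps:.0f} kbps")
print(f"MP3: about {mp3_kbps:.0f} kbps")
print(f"Size ratio: about {80 / 9.6:.1f} to 1")
```

For comparison, 16-bit/44.1kHz stereo PCM is 1411 kbps, so the WAV figure here suggests a source with a higher bit depth or sample rate than CD audio, though that is only an inference from the file size.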

I also want to add that yes, I can hear the difference. I am not using anything special, but a cheap set of Numark HF125 headphones. Great headphones, but not at all top line.

I am starting to think this is an uphill battle with no end…

I am looking at another collection, DMC Commercial Collection 422. I can share more detail if anyone (or all) have interest, but I will summarize only here.

I am finding that determining a name and artist for a recording is often difficult. Even when it comes to an artist… For example, URBANHEADZ… they list themselves online as:

  • URBANHEADS
  • Urban Headz
  • Urbanheadz
  • UrbanHeadz
  • TheUrbanHeadz
  • etc

No need to list them all, you get the idea. I have no idea how to determine what is “MB proper” here. For this example, on the release, the DJ artists are all in all caps (URBANHEADZ) and the artists of the material being mixed are in “standard” caps (Destiny’s Child, Cassie). However, although this is sort of in violation of MB policy, I would not rely on the artwork on these releases for true accuracy. For example, this release has typos even on things like “Vs” being printed as “Js”. While MB would normally use that typo as that is how it was released, my point is that the printing, while generally accurate, cannot really be used for style.

Another example is the title vs artist fields. On this release, I have a recording titled “Camila Cabello Fe. Young Thug & Friends”. The artist being DJ Martin Pieters. Now, the actual recordings here are “Set Fire to the Rain”, “Havana”, “What Lovers Do” and others. This differs from another recording on the release titled “Waiting On Beautiful (Walking In Vain Vs Perfect)”, by “Bob Marley & The Wailers Vs Ed Sheeran”, mixed by “BERGWALL”.

If there is any interest in trying to make this MB friendly, that would be great. For my purposes, this is all just fine as these recordings are special purpose. They are also however great for casual listening as it brings a nice variation to what you normally hear on the radio or other.

for capitalization and the correct artist name, my general rules of thumb (in order):

  1. use whatever they use on social media (Twitter, Facebook, the artist’s Bandcamp, SoundCloud, etc.), if applicable. for example, a few recent artist name edits I’ve done (1, 2, and 3). obviously, if they’ve got a silly name on Twitter, like deadmau5 has right now (all hail the goat lord), don’t use that… :wink:
  2. if they have a website, you might be able to get a spelling from a bio or something. either an artist-created page or a label’s page could work.
  3. if they’re consistent on official releases by the artist, I might use that (probably very rare). I know @aerozol did this for PIG//CONTROL just recently (also probably from their Bandcamp)
  4. if you can find them in another database (i.e. Discogs, RateYourMusic, Wikipedia/Wikidata, etc.), you could use the same name they use there.
  5. otherwise, you’re probably left with trying to figure the most common spelling with whatever you’ve got.

a couple helpful pages from the Style Guideline: Titles and Artist

of course whatever artist name you choose, be sure to add the others as aliases, that way others can find them with other spellings. I don’t think they’re case-sensitive, so you wouldn’t need to add “UrbanHeadz”, “URBANHEADZ”, and “Urbanheadz”, and I also think the search ignores punctuation. I believe all the examples you gave above would be “artist name” aliases, as “search hint” aliases are for stuff like misspellings, alternate character encodings, and the like. see also the docs on Aliases.


I’m not sure if I understand the second part of your question, about the title vs artist fields… are you talking about a mashup release where the mashups are credited differently? (some have the original titles in the name and some don’t?)

a pic or two would be helpful in this case, if you’re able~



okay, sweet~

I can’t quite tell from the pics if these are mashups or DJ-mixes (probably a mix of both), so I’ll answer both those questions.

mash-ups: artist credit would likely be the mashupper. for example, take The Birthday Girl vs. the Internet by Triple-Q.

therefore, if I read it correctly, disc 1 track 2 would be “Waiting on Beautiful” by “Bergwall”. there is a mashup guideline to add track artists to the recording name (not the track name), but I don’t usually see this actually being done.

DJ-mixes: I just found a seemingly similar release: Global Underground 024: Nick Warren in Reykjavik by Nick Warren. it seems you might credit the original artists in this case. if the track has a title, you can add that before the list of songs. for example, disc 1 track 1 could be:

“Powerpop 2018 (part 1): I Know You (Sultan & Shepard remix) / This is Me (Dave Aude remix) / Tip Toe (Wideboys remix) / Leave a Light On (Offset remix) / Feel it Still (Coldabank remix) / For You (Sam Ourt remix) / Barking (R3wire Vs Alphalove remix) / Strangers (Franky Rizardo remix) / Fine Line (James Hype remix)”

and the track artist would be:

“Craig David feat. Bastille / Keala Settle / Jason Derulo feat. French Montana / Tom Walker / Portugal. The Man / Liam Payne & Rita Ora / Ramz / Sigrid / Mabel & Not3s”

the slashes are according to the Titles guideline I posted above. (I know, it’s a bit of a mouthful, lol)
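
if it helps, joining the parts with slashes is easy to script. this is just a sketch with hypothetical function names, following the “ / ” separator from the example above and nothing more official than that:

```python
def medley_title(heading, titles):
    """Join song titles with ' / ', optionally prefixed by the track's own title."""
    joined = " / ".join(titles)
    return f"{heading}: {joined}" if heading else joined


def medley_artist(artists):
    """Join the original artists with ' / ' in the same order as the titles."""
    return " / ".join(artists)


# First two songs of disc 1 track 1, taken from the example above.
titles = [
    "I Know You (Sultan & Shepard remix)",
    "This is Me (Dave Aude remix)",
]
artists = ["Craig David feat. Bastille", "Keala Settle"]

print(medley_title("Powerpop 2018 (part 1)", titles))
print(medley_artist(artists))
```

keeping titles and artists in two lists of the same order makes it harder to mismatch them on long tracks like this one.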

bonus - remixes: I see a few remixes on there too, and in this case those should be credited to the original artist. the remixer can be added as a relationship.

for example, disc 1 track 8 would be titled “Oh, Pretty Woman (DMC 2018 remix)” by “Roy Orbison”, and of course, add DJ Ivan Santana as remixer :wink:

note: you could also standardize “Fe” to “feat.”, provided that’s what it stands for. I know some Picard scripts probably rely on feat. standardization (not 100% sure on that though…). that’s entirely your decision tho~

also, it looks like the Js typo you mentioned might just be part of Roaxx J’s name, lol


now that I see the release, I think you can ignore the ALL CAPS name of UrbanHeadz from this release, as that seems to be a stylistic choice by the designers, considering peoples’ names are ALL CAPSed here too. based on what you said earlier, I’d credit them as UrbanHeadz for this release. entirely up to your judgement tho~


Well, I would have to say a combination of both, agreeing with you. However, I do want to add that I have recently been told that a “DJ-Mix” is for a release that is a non stop mix, and in this case, it is not. The mixing is limited to track by track… although that is not the case on all of such releases like this.

Normally I would agree, and I did consider the page-long title you noted, but I think it is excessive, and the release itself does not name the recording that. However, I understand this is not a normal case, just trying to think it through. The name I have for CD1 TR1 is:

  • “ALLSTAR - Powerpop 2018 (Part 1)”.

and CD1 TR8:

  • “Roy Orbison - Oh Pretty Woman (DMC 2018 Remix)”, which matches your thought as well.

Yes, in this case “Fe” does mean “feat”.

Yes, you are correct on the typo. I missed this in the metadata, corrected now.

The caps is a standard in what I usually see, where DJs (or the designers) publish their names in all caps on releases.

Thanks for sharing your points. It seems you are suggesting I try the add and ask how it turned out? One thing that sticks in my mind is that the names the release comes with are designed for the user of the release, whereas the detail on the back differs from the front as it adds further information. In MB, the logic is more the opposite, highlighting the actual artists vs the DJ.

I have a few DJ Rectangle releases that were far easier than this. I could simply credit the release to DJ Rectangle and have a “track list” of the artist and title of the songs as they appear in the mix. In this case, it would require subtracks. Although subtracks (index points) are perfectly within standards, MB does not support that. That is a different topic though, but it is worth noting it is in the Red Book Spec and is used quite often, CD and digital.

EDIT: I wanted to add an example of subtracks on such releases, on this Discogs reference to a different but similar release:
https://www.discogs.com/release/8031830-Bowie-I-Love-Bowie-Classic-Mixes-Volume-1

If there is a way to do so, I can also share the listing I have, along with the metadata. Although for these releases, the metadata is not as it came. It sometimes comes with no metadata, or with incomplete or generic metadata, so it is always changed, especially when using the files and having your software write to the files vs a database.

I appreciate the detailed response and your time to look and respond.