Hello, I am Sweta Vooda, currently in the 2nd year of my BTech.
I am almost ready to submit my GSoC proposal for "Storage for AcousticBrainz v2 data".
I need clarification on a few points to make sure I am heading in the right direction before I finish my proposal.
My questions are about the requirements listed for this project.
- The third point says “Update the client software to include a check where they announce to the server what version they are”
After checking the client-side code, it is clear that the client doesn’t add any version or other extra fields but sends the data straight from the extractor output to the server.
The “process” function in acousticbrainz.py (in the client) stores the extractor output in the “tmpname” JSON file and sends it straight to the server after some checks.
It is also clear that the client does not set any extractor version itself; the JSON file posted with the low-level data already contains the extractor version:
{u'essentia': u'2.1-beta2', u'extractor': u'music 1.0', u'essentia_build_sha': u'70f2e5ece6736b2c40cc944ad0e695b16b925413', u'essentia_git_sha': u'v2.1_beta2'}
Then why do we need to send it again?
1. Do you want the client to explicitly send one specific data version attribute, set to the extractor version, instead of sending it only as part of the version JSON structure, which also holds the version details of other tools?
(Or)
2. Do you want it to be the version number found in the PKG-INFO of the abzsubmit package?
I assume it is #1 above; please clarify or confirm.
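To make option #1 concrete, here is a minimal sketch of what I have in mind. It assumes the version dictionary sits under `metadata` → `version` in the extractor output; `add_data_version` and the `data_version` field are hypothetical names of mine, not anything that exists in the current client.

```python
def add_data_version(lowlevel):
    """Return a copy of the extractor output with an explicit top-level version.

    Assumption: the version dictionary sits under metadata -> version.
    'data_version' is a hypothetical field name, not part of the current client.
    """
    version = lowlevel.get("metadata", {}).get("version", {})
    payload = dict(lowlevel)  # shallow copy; the caller's dict is left untouched
    payload["data_version"] = version.get("extractor")  # None if it is missing
    return payload


# Sample input built from the version dictionary quoted above.
sample = {
    "metadata": {
        "version": {
            "essentia": "2.1-beta2",
            "extractor": "music 1.0",
            "essentia_build_sha": "70f2e5ece6736b2c40cc944ad0e695b16b925413",
            "essentia_git_sha": "v2.1_beta2",
        }
    }
}

print(add_data_version(sample)["data_version"])
```

With this approach the server would not need to parse the whole version structure to decide which data version a submission belongs to.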
- Next, in the 1st point: “Update the database schema to include a data version field, and allow the Submit and Read methods to switch between them.”
There is already a version table whose “data” field stores all the versions as a single string:
{u'essentia': u'2.1-beta2', u'extractor': u'music 1.0', u'essentia_build_sha': u'70f2e5ece6736b2c40cc944ad0e695b16b925413', u'essentia_git_sha': u'v2.1_beta2'}
So can we not use that table itself to identify versions? Or should we store the extractor version separately, in another table, to make it easy to query the version (on read and submit) and to maintain lowlevel_json data for multiple versions?
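As a rough illustration of the second alternative, the sketch below keeps the extractor version in its own column so that submit and read can switch on it. It uses an in-memory SQLite database only as a stand-in for the real PostgreSQL schema; the table `lowlevel_sketch`, the functions `submit` and `read`, the MBID string, and the version "music 2.0" are all hypothetical and exist only for this example.

```python
import json
import sqlite3

# In-memory SQLite stands in for the real database (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lowlevel_sketch ("
    " mbid TEXT NOT NULL,"
    " extractor_version TEXT NOT NULL,"
    " data TEXT NOT NULL)"
)


def submit(mbid, lowlevel):
    """Store a submission together with its extractor version."""
    # Assumption: the version dictionary sits under metadata -> version.
    version = lowlevel["metadata"]["version"]["extractor"]
    conn.execute(
        "INSERT INTO lowlevel_sketch (mbid, extractor_version, data) VALUES (?, ?, ?)",
        (mbid, version, json.dumps(lowlevel)),
    )


def read(mbid, extractor_version):
    """Return every stored document for one MBID and one extractor version."""
    rows = conn.execute(
        "SELECT data FROM lowlevel_sketch WHERE mbid = ? AND extractor_version = ?",
        (mbid, extractor_version),
    ).fetchall()
    return [json.loads(row[0]) for row in rows]


# Two submissions for the same (made-up) MBID with different extractor versions.
submit("hypothetical-mbid", {"metadata": {"version": {"extractor": "music 1.0"}}})
submit("hypothetical-mbid", {"metadata": {"version": {"extractor": "music 2.0"}}})

print(len(read("hypothetical-mbid", "music 1.0")))
```

The point of the separate column is that a read for one version never has to parse the version string of every row.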
- Lastly, there is a lot of ambiguity regarding the 2nd point: “Update the frontend including the dataset editor”
As far as I understand, it should:
1. Be able to show the different versions available for the same MBID, through the API or the server.
2. Allow the user to select the version of low-level data that they want to evaluate.
3. Include the version of the extractor used in the final output when displaying the low-level and high-level data for a recording.
@alastairp It would be helpful if you could clarify these points and point me in the right direction so that I can submit my proposal soon.