GSoC 2018: Client Integration for ListenBrainz

listenbrainz
gsoc-2018

#1

ListenBrainz: Add client integration for media players such as VLC, Rhythmbox, etc.

Personal Information

Project Details

Overview

Currently, most of the listens on ListenBrainz are imported from Last.FM. Users of Last.FM scrobblers have to import their listens to ListenBrainz periodically, which is a bit tedious. Also, listens imported from Last.FM don’t always have MBIDs attached. Even if files are tagged with MBIDs, those MBIDs may or may not reach Last.FM (it depends on the scrobbler used), and this may cause ambiguities in listens.

This project aims to create scrobbler plugins for the most popular music players, so that users can push their listens directly to ListenBrainz. This will allow listens to carry the MBIDs stored in the files (tagged using Picard) and reduce the possibility of ambiguous listens.

Here’s a block diagram showing what the listen import process looks like:

Here’s a block diagram showing what direct scrobbling will look like:

Goals

  • Creating scrobbling plugins that push listen data to ListenBrainz:

    Presently, there are scrobbling plugins for nearly every popular media player that scrobble listens to Last.FM, but hardly any that scrobble to ListenBrainz. The idea is to create plugins that scrobble directly to the ListenBrainz server using its native API.

  • Writing tests for these plugins:

    This would include writing unit tests to ensure that each module works as expected at each stage of development. It would also include writing integration tests to ensure that the plugins work as expected when integrated with the media player.

  • Documenting installation and usage of these plugins:

    This will be the last part of the project and will include writing documentation so that users can easily install and use these plugins.

Implementation details

Each media player has its own plugin structure and hence a different way of implementing a plugin. Below is the general idea of how a client will be implemented.

The client will use ListenBrainz’s native API to ‘POST’ a JSON request to the ‘submit-listens’ endpoint. The API accepts JSON in the following format:


{
  "listen_type": "single",
  "payload": [
      --- listen data here ---
  ]
}
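
As a rough sketch of the request described above, the snippet below wraps listens in this envelope and prepares a POST without sending it. The URL and the `Authorization: Token ...` header reflect the public ListenBrainz API as I understand it and should be checked against the API documentation. The helper name `build_submit_request` and the token value are made up for illustration.

```python
import json
import urllib.request

# Assumed endpoint of the public ListenBrainz server; adjust for other deployments.
SUBMIT_URL = "https://api.listenbrainz.org/1/submit-listens"


def build_submit_request(listen_type, listens, user_token):
    """Prepare (but do not send) a POST request for the submit-listens endpoint.

    build_submit_request is a hypothetical helper, not part of any library.
    """
    body = json.dumps({"listen_type": listen_type, "payload": listens})
    return urllib.request.Request(
        SUBMIT_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": "Token " + user_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_submit_request(
    "single",
    [{
        "listened_at": 1443521965,
        "track_metadata": {
            "artist_name": "Rick Astley",
            "track_name": "Never Gonna Give You Up",
        },
    }],
    "user-token-goes-here",  # placeholder, not a real token
)
# Actually sending it would be: urllib.request.urlopen(request, timeout=10)
```

Keeping request construction separate from sending makes the JSON body easy to unit test without network access.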

The “listen_type” element defines the type of submission. It can have one of the following three values:

  • playing_now - This will be sent when a track starts playing.

  • single - This will be sent when the track is either half played or 4 minutes in, whichever comes first.

  • import - This will be sent in case of cached listens. The cached listens will be sent in chunks of size not more than ‘MAX_LISTEN_SIZE’.
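
The timing rule for “single” listens can be sketched as a small function. This is only an illustration of the rule as stated in the list above; the names `submit_threshold` and `should_submit_single` are hypothetical, and a real plugin would get the elapsed time and track duration from the media player.

```python
MAX_WAIT_SECONDS = 4 * 60  # the 4-minute cap from the rule above


def submit_threshold(duration_seconds):
    """Playback time (in seconds) after which a 'single' listen is due."""
    return min(duration_seconds / 2, MAX_WAIT_SECONDS)


def should_submit_single(elapsed_seconds, duration_seconds):
    """True once the track is half played or 4 minutes in, whichever comes first."""
    return elapsed_seconds >= submit_threshold(duration_seconds)
```

For a 3-minute track the listen is due after 90 seconds; for a 10-minute track the 4-minute cap applies instead.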

The “payload” element is an array of listens. A sample payload may look like:


{
  "listened_at": 1443521965,
  "track_metadata": {
    "additional_info": {
      "release_mbid": "bf9e91ea-8029-4a04-a26a-224e00a83266",
      "artist_mbids": [
        "db92a151-1ac2-438b-bc43-b82e149ddd50"
      ],
      "recording_mbid": "98255a8c-017a-4bc7-8dd6-1fa36124572b",
      "tags": [ "you", "just", "got", "rick rolled!"]
    },
    "artist_name": "Rick Astley",
    "track_name": "Never Gonna Give You Up",
    "release_name": "Whenever you need somebody"
  }
}

A minimal payload will include “listened_at”, “artist_name” and “track_name”. A sample of this may look like:


{
  "listened_at": 1443521965,
  "track_metadata": {
    "artist_name": "Rick Astley",
    "track_name": "Never Gonna Give You Up"
  }
}
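
A plugin needs to turn whatever metadata the player exposes into one of these payload entries. The sketch below is one possible way to do that: it always emits the three required fields and adds “release_name” and “additional_info” only when the values are available (for example from files tagged with Picard). The function name `build_listen` and its parameters are hypothetical.

```python
import time


def build_listen(artist_name, track_name, listened_at=None, release_name=None,
                 recording_mbid=None, release_mbid=None, artist_mbids=None):
    """Build one payload entry; build_listen is a hypothetical helper."""
    if not artist_name or not track_name:
        raise ValueError("artist_name and track_name are required")

    metadata = {"artist_name": artist_name, "track_name": track_name}
    if release_name:
        metadata["release_name"] = release_name

    # MBIDs are optional: include only the ones the file actually carries.
    additional_info = {}
    if recording_mbid:
        additional_info["recording_mbid"] = recording_mbid
    if release_mbid:
        additional_info["release_mbid"] = release_mbid
    if artist_mbids:
        additional_info["artist_mbids"] = list(artist_mbids)
    if additional_info:
        metadata["additional_info"] = additional_info

    # Default to the current time (Unix timestamp) when none is given.
    timestamp = int(time.time()) if listened_at is None else int(listened_at)
    return {"listened_at": timestamp, "track_metadata": metadata}


minimal = build_listen("Rick Astley", "Never Gonna Give You Up",
                       listened_at=1443521965)
```

Called with only the artist, track and timestamp, this produces the minimal payload shown above; passing MBIDs produces the fuller form.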

Caching

This will be part of the later half of the project. Caching will let the client keep working even when it is unable to send listens to the ListenBrainz server: instead of being dropped, these listens will be cached and sent later, once the system is online and the client can reach the server again.

When a track starts playing, a JSON request with “listen_type” set to “playing_now” will be sent. If an error occurs, the request will be retried a few times, and if the error persists, it will be dropped. When the track is halfway through, a JSON request with “listen_type” set to “single” will be sent. If an error occurs, it will be retried a few times, and if the error persists, the listen will be cached.

The cached listens, if any, will be resent at a fixed interval, in chunks of size not more than ‘MAX_LISTEN_SIZE’.
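
The retry, drop and cache behaviour described above could look roughly like the sketch below. Several things here are assumptions: the class name `ListenQueue`, the retry count of 3 (the text only says “a few times”), an in-memory list standing in for a persistent cache, and ‘MAX_LISTEN_SIZE’ being treated as a number of listens per request with a placeholder value of 50. The `send` callable is supplied by the plugin and is expected to return True on success.

```python
MAX_LISTEN_SIZE = 50  # placeholder; treated here as listens per request (assumption)
MAX_RETRIES = 3       # "a few times"; the exact count is an assumption


class ListenQueue:
    """Hypothetical helper that retries, drops or caches listens."""

    def __init__(self, send):
        self.send = send  # callable(listen_type, listens) -> bool
        self.cache = []   # a real plugin would persist this to disk

    def _try_send(self, listen_type, listens):
        for _ in range(MAX_RETRIES):
            if self.send(listen_type, listens):
                return True
        return False

    def playing_now(self, listen):
        # A failed playing_now notification is simply dropped.
        return self._try_send("playing_now", [listen])

    def single(self, listen):
        # A failed single listen is kept for a later attempt.
        if self._try_send("single", [listen]):
            return True
        self.cache.append(listen)
        return False

    def flush(self):
        """Send cached listens in chunks; stop at the first chunk that fails."""
        while self.cache:
            chunk = self.cache[:MAX_LISTEN_SIZE]
            if not self._try_send("import", chunk):
                break
            del self.cache[:len(chunk)]
        return not self.cache
```

A plugin would call `flush()` from a timer at the chosen interval. Chunks are removed from the cache only after they have been sent successfully, so a failure in the middle of a flush loses nothing.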

A general block diagram showing how the scrobble plugin will work without caching

With caching

Media player specific details

VLC

  • Language to be used: Lua

  • Platforms: Linux, Windows, MacOS

MPC

  • Language to be used: C or C#

  • Platforms: Windows

Foobar2000

  • Language to be used: C or C#

  • Platforms: Windows

Rhythmbox

  • Language to be used: Python

  • Platforms: Linux

Kodi

  • Language to be used: Python

  • Platforms: Linux, Windows, MacOS

KMPlayer and WinAmp

  • Language to be used: C or C#

  • Platforms: Windows, MacOS

Banshee

  • Language to be used: Python

  • Platforms: Linux, Windows, MacOS

Possible extensions

I plan to do the following work after the above work is complete:

  • Allow the user to customise the scrobbler - This will allow the user to customise the plugin’s behaviour by changing some parameters in the settings.

  • Add some additional features - This will include an option to switch the scrobbler ON/OFF, an option to enable/disable caching, and an option to clear the cache.

  • Add scrobble plugins for more players - If the above media player plugins are done and there is some time left, I’ll try to add scrobble plugins for some more players.

Timeline

A broad timeline of the work is as follows:

Community Bonding (April 24 – May 13)

I will spend this time discussing design decisions with my mentor and getting familiar with the plugin structures of the media players. I will also start some initial work on the Python-based plugins.

Phase 1 (May 14 – June 11)

I aim to complete the Python-based plugins in this phase. This will also include testing these plugins to make sure that they work as expected. I will also start some initial work on adding caching to them in this period.

Phase 2 (June 12 – July 9)

This phase will include writing plugins for the remaining media players and writing tests for them. I will also do some work on the caching feature if the above work is complete and there is still time, and will work on fixing bugs and cleaning up code.

Phase 3 (July 10 – August 6)

I aim to complete the caching portion of all plugins. Will also complete the documentation on installation and usage of these plugins. Also, start work on the optional ideas if there is time.

After GSoC

Will add clients to some other media players if requested. Will also work on fixing bugs and adding features to the ListenBrainz server.

Here is a more detailed week-by-week timeline of the 13-week coding period:

  • Week 1 (May 14 - May 20): Begin understanding the plugin structure of media players with Python based plugins. Also finalise the file system and libraries to be used.
  • Week 2 (May 21 - May 27): Start working on plugins and try to complete the initial working structure of plugin.
  • Week 3 (May 28 - June 3): Complete the Python-based plugins. Track down and fix bugs and do some testing.
  • Week 4 (June 4 - June 10): Do some work on caching for Python based plugins.

First evaluations

  • Week 5 (June 11 - June 17): Start work on remaining plugins (Lua and C/C# based) and try to complete the initial working structure.
  • Week 6 (June 18 - June 24): Complete VLC plugin (Lua based) with caching. Fix bugs and some code cleanup.
  • Week 7 (June 25 - July 1): Complete the initial structure of remaining plugins.
  • Week 8 (July 2 - July 8): Write tests on the work done so far. Do some code clean up and fix bugs.

Second evaluations

  • Week 9 (July 9 - July 15): Start work on caching feature and try to complete it for Python based plugins.
  • Week 10 (July 16 - July 22): Work on caching feature for remaining plugins.
  • Week 11 (July 23 - July 29): Complete work on caching and write tests for it.
  • Week 12 (July 30 - August 6): CUSHION WEEK: If behind on stuff, then catch up. If not, start working on documentation.
  • Week 13 (August 7 - August 14): Work on final submission and make sure that everything is okay. Will also complete the documentation.

Detailed Information about yourself

I am a second-year CS undergrad at Indian Institute of Information Technology, Una, India. I started working on ListenBrainz a couple of months back, and it has been quite a learning experience for me. Here is a list of pull requests I’ve worked on over time.

Question: Tell us about the computer(s) you have available for working on your SoC project!

I have a Dell Inspiron laptop with an i7 processor and 8 GB RAM, running Linux Mint 18.1.

Question: When did you first start programming?

I first came across C++ in my 10th grade and have been in love with programming since then. I picked up Python in my freshman year at college.

Question: What type of music do you listen to?

I mostly listen to rock and pop music. Some of my favorites are Queen, Green Day, Junoon and ColdPlay.

Question: What aspects of ListenBrainz interest you the most?

I love the idea of making listen data open, which could make it much easier to create recommendation engines and could be of great utility.

Question: Have you ever used MusicBrainz to tag your files?

Yes, I have used Picard to tag my files.

Question: Have you contributed to other Open Source projects? If so, which projects and can we see some of your code?

I don’t have much of a contribution history in open source apart from ListenBrainz. I have also done a few projects in my free time, which can be found on my GitHub.

Question: What sorts of programming projects have you done on your own time?

I have worked on a 2048 game AI, with the bot written in C++ and the GUI in Python using Pygame.

Question: How much time do you have available, and how would you plan to use it?

I have holidays during most of the coding period and can work full time (45-50 hrs per week) on the project.

Question: Do you plan to have a job or study during the summer in conjunction with Summer of Code?

No, not if I am selected for GSoC.


#2

This is an initial draft of my application. Any kind of suggestions and feedback are welcome! :slight_smile:


#3

Hi!
Thanks for the proposal. Here are some incomplete comments.

Be careful about terminology here. Scrobbling refers to submitting data to last.fm. We shouldn’t use this term when talking about submitting listens to ListenBrainz.

It would be great to know more about these players. Here are some things that it would be nice to know:

  • Do they already have scrobbling support? Is it only for last.fm, or does it support libre.fm or other tools too?
  • Do they have “playing now” support?
  • Does it already have offline support?
  • Do you know the programming languages that the software uses?
  • Do you know how to submit patches to the project?
  • What does the existing scrobbling interface look like?
  • What would the interface for listenbrainz look like? (this could be just a paper drawing…)

What does this mean? What are the parameters? What behaviour could people change?


One final thing that interests me in this task:
Currently we use token authentication for submitting data to ListenBrainz - this is a secret value unique to each person. A more robust way of doing authentication would be to allow users to use oauth to connect their account to the application. This has a number of advantages:

  • We can see statistics of what applications are sending data (and application developers can see this too)
  • ListenBrainz can block bad applications
  • Users can prevent a particular application from sending data without blocking all (currently you need to reset your key)
  • It’s a good base to allow us to expand API usage to other clients to get different data in the future.

This is definitely a large extension to a project like this, and would require discussion with other MetaBrainz projects. For example, CritiqueBrainz already acts as an oauth provider, as does MusicBrainz (it’s how we create a ListenBrainz account). There was also additional discussion about changing the way that we do auth in MetaBrainz projects, which could impact the way we approach this task.

The reason that I’m pushing this now is that I can imagine that if all software uses token auth, it will be more difficult to push everything to use oauth in the future. Making oauth available at the beginning will make it easier to tell developers to use it from the beginning.


#4

Hi @alastairp!
Thanks a lot for the review :grinning:.
I like the idea of adding OAuth support. I will discuss this on IRC and update the proposal accordingly.