Troi local resolution of recordings

yuioen · July 6, 2024, 8:07pm

hello

It took some time, but I’m looking at troi again to see whether it makes sense to implement the local resolver in funkwhale.

But there is something I don’t understand. When a patch is launched, a db is created: db = SubsonicDatabase(db_file, config, quiet). Then the patch generates a playlist, and the tracks in this playlist come from the ListenBrainz servers. After that they are resolved and filtered against the troi database using ContentResolver. The content resolver works by using the db created before launching the patch.

Questions:

Do we agree that troi’s calls to the ListenBrainz API haven’t changed? The content-resolver update is a set of tools that look for tracks matching the recordings returned by LB, but the server-side playlist generation doesn’t take the local db into account?
The content resolver is called twice, once in patchs.PeriodicJamsLocalPatch and once in local.PeriodicJamsLocalPatch. Why?

Thanks for this great update.

rob · July 8, 2024, 9:43am

They haven’t been drastically overhauled, but there have been changes because LB and its API keep evolving. To really answer this question, you’d best check the commits for actual API changes.

Correct. We decided that troi should be all things playlists in the LB world, so there is support for global playlists, which need to be resolved before playing, and support for playlists that are generated against a local collection. But they are independent: you don’t need to use both, you can choose one or the other.

Mostly because it’s poorly named. I should improve that. There are two distinct features here:

Take a global playlist generated by the troi periodic jams and resolve it to local files, which often doesn’t give good results. It’s mostly there as an example. (patchs.PeriodicJamsLocalPatch)
Generate a local periodic playlist (local.PeriodicJamsLocalPatch) that targets a specific music collection and thus will not need further resolution. This is the proper way to generate local periodic jams playlists.

yuioen · July 10, 2024, 9:19am

Thanks for these explanations.

Since you explained that the API calls to LB don’t send info about the local state, I don’t understand the difference between the two patches. How can troi generate a playlist that targets the local music collection without sending information about that collection to the LB servers?

Another topic: I want to allow users to use troi in the funkwhale UI. Is it okay if I update the inputs functions with a description and a type for each patch argument?

yuioen · July 10, 2024, 9:34am

Example:

        return [
            {"type": "argument", "args": ["area"], "kwargs": {"type": str, "help-text": "A MusicBrainz area from which to choose tracks"}},
            {"type": "argument", "args": ["start_year"], "kwargs": {"type": int, "help-text": "The start year"}},
            {"type": "argument", "args": ["end_year"], "kwargs": {"type": int, "help-text": "The end year"}}
        ]
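
A minimal sketch of how a UI could consume a list in this shape, assuming each entry keeps the "type" / "args" / "kwargs" layout shown above. The helper names describe_inputs and coerce_values are hypothetical and not part of troi; the spec list is repeated here only so the snippet runs on its own.

```python
from typing import Any

# Hypothetical input specs in the shape proposed above; not copied from troi.
EXAMPLE_INPUTS = [
    {"type": "argument", "args": ["area"],
     "kwargs": {"type": str, "help-text": "A MusicBrainz area from which to choose tracks"}},
    {"type": "argument", "args": ["start_year"],
     "kwargs": {"type": int, "help-text": "The start year"}},
    {"type": "argument", "args": ["end_year"],
     "kwargs": {"type": int, "help-text": "The end year"}},
]


def describe_inputs(specs: list[dict[str, Any]]) -> list[dict[str, str]]:
    """Turn patch input specs into JSON-friendly form-field descriptions."""
    fields = []
    for spec in specs:
        kwargs = spec.get("kwargs", {})
        # Assumption: fall back to str when a patch does not declare a type.
        py_type = kwargs.get("type", str)
        fields.append({
            "name": spec["args"][0],
            "kind": spec["type"],
            "value_type": py_type.__name__,
            "help": kwargs.get("help-text", ""),
        })
    return fields


def coerce_values(specs: list[dict[str, Any]], raw: dict[str, str]) -> dict[str, Any]:
    """Convert strings submitted by a form into the declared argument types."""
    typed = {}
    for spec in specs:
        name = spec["args"][0]
        py_type = spec.get("kwargs", {}).get("type", str)
        typed[name] = py_type(raw[name])
    return typed
```

With the type declared per argument, a front end can render a number field for start_year and end_year and convert the submitted strings before handing them to the patch.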

rob · July 10, 2024, 9:54am

Sure, that makes the command line better too!

rob · July 10, 2024, 9:58am

Before the local periodic jams playlist can be generated, the user needs to scan their local MBID-tagged collection (via filesystem or subsonic API) and then download more metadata for that collection. This is then stored in a SQLite DB, which enables a bunch of other features as well (duplicate detection, unresolved recording reports, etc.).

When a local periodic playlist gets generated, it first takes the global list of recommended recordings, resolves those against the local collection and then builds a playlist from it.
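
As a rough sketch of that resolution step: the snippet below matches a list of recommended recording MBIDs against a local SQLite table and keeps the recommendation order. The table name, its columns, the function names and the placeholder MBIDs are all assumptions made for illustration; this is not troi's actual schema or its ContentResolver code.

```python
import sqlite3


def build_demo_collection() -> sqlite3.Connection:
    """Create an in-memory stand-in for a scanned local collection."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per locally available recording.
    conn.execute(
        "CREATE TABLE recording (mbid TEXT PRIMARY KEY, file_path TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO recording (mbid, file_path) VALUES (?, ?)",
        [("mbid-a", "/music/a.flac"), ("mbid-c", "/music/c.flac")],
    )
    return conn


def resolve_to_local(conn: sqlite3.Connection, recommended_mbids: list[str]):
    """Split recommended MBIDs into locally resolved and unresolved ones."""
    resolved, unresolved = [], []
    for mbid in recommended_mbids:
        row = conn.execute(
            "SELECT file_path FROM recording WHERE mbid = ?", (mbid,)
        ).fetchone()
        if row is None:
            # Recommended by the server but not present in the local collection.
            unresolved.append(mbid)
        else:
            resolved.append((mbid, row[0]))
    return resolved, unresolved
```

Nothing about the local collection is sent to the server in this flow: the recommendations arrive as a global list and the filtering happens entirely on the client side.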

yuioen · July 10, 2024, 6:47pm

Okay, thanks for the explanations.

Last question: why have top_missed_recordings, top_discoveries_for_year, and weekly_flashback_jams been deleted? They seemed like cool patches o/

rob · July 10, 2024, 6:59pm

Yes, they’ve been deleted, though “graduated” is a better term, IMHO. They continue to exist, but are implemented in Apache Spark, which is great for scalability. Troi is great for experimentation, but not for scale, so our plan has always been to graduate patches that produce solid results up the chain.

yuioen · July 10, 2024, 8:08pm

I’m surprised. If troi is only for experimentation, I don’t think I should bother to implement it in funkwhale… So I don’t understand why we’ve been working on this… If troi is not the right way to get playlists/recommendations from LB, what is?

Also, I don’t understand: Spark is for analytics, so it’s not a way to deliver recommendation results to end users but a way to generate them server side?

rob · July 10, 2024, 8:35pm

That statement is in the context of the development pipeline for LB and how features must scale. Troi is great for small installations where you are generating thousands of playlists a day, not millions a day. LB must scale considerably more than a funkwhale pod.

That is oversimplifying it. Think of Spark as a weird SQL database that is blazingly fast when it runs huge batches of data, such as the collaborative filtering algorithm for all LB users that produces the weekly (and daily) recommendation playlists for LB. What Spark can do in a few minutes would take troi all day.

yuioen · July 11, 2024, 8:39am

I understand Spark is needed to generate recommendations for users. I suppose what changes is that the analytics are moved from troi to Spark. I also suppose that once this is done, the LB API is updated so the playlists can be accessed directly instead of being generated by troi.

But why delete the patches in troi when they could be updated?

Also, I don’t know if you recall this discussion about allowing custom endpoints for recommendations: Sort troi patches by stability and use - #22 by yuioen. It’s not a problem to keep this for later, but I think we might want to sort out the patch stability issue, because it will be a poor user experience if patches disappear randomly.

We could also drop troi in funkwhale and only use the LB API endpoints that are stable (even if this is only two endpoints: Recommendations — ListenBrainz 0.1.0 documentation). I just need to know, since I have already spent some time working on this and would like to avoid losing it jaja

rob · July 11, 2024, 8:56am

The patches that disappeared randomly could never be run by end-users since they were designed for YIM use and relied on datasets that we only build in December. I’m pretty sure you were not using those scripts and even complained about dead scripts kicking around. So, we cleaned up and now you’re still complaining.

The scripts still exist in git. If you want to bring them back, then by all means do that in your own code. You’re not dependent on these scripts being in the main repo.

The custom endpoints for recommendations ended up being implemented as SearchServices that are available globally and locally:

github.com

metabrainz/troi-recommendation-playground/blob/main/troi/recording_search_service.py

from abc import abstractmethod

import requests

from troi import Recording, Artist, ArtistCredit
from troi.service import Service
from troi.plist import plist


class RecordingSearchByTagService(Service):

    SLUG = "recording-search-by-tag"

    def __init__(self):
        super().__init__(self.SLUG)

    def search(self, tags, operator, pop_begin, pop_end, num_recordings):
        """
            Fetch the tag data from the LB API and return it as a dict.
        """

This file has been truncated.

yuioen · July 11, 2024, 1:26pm

I’m only asking for a reliable way to tell which patches are stable and which aren’t, so we can parse the reliable patches and their arguments from troi and display them to end users. This could be as simple as having them in a different path. But as a maintainer I simply can’t maintain a library if it’s not stable, if experimental patches are pushed and released to the lib without a way to know they are experimental. And again, it’s not a problem or a criticism; I just need to know what you’re willing / planning to do with troi.
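
To illustrate what I mean, here is a minimal sketch that uses a class attribute instead of a separate path. Everything in it is hypothetical: the STABILITY attribute, the DemoPatch base class and the two demo patches do not exist in troi, they only show how a client could filter on such a marker.

```python
class DemoPatch:
    """Stand-in base class for this sketch; not troi's Patch class."""

    # Hypothetical marker: patches are experimental unless they say otherwise.
    STABILITY = "experimental"

    @staticmethod
    def slug() -> str:
        raise NotImplementedError


class AreaRandomDemo(DemoPatch):
    STABILITY = "stable"

    @staticmethod
    def slug() -> str:
        return "area-random-demo"


class NewIdeaDemo(DemoPatch):
    # Inherits STABILITY = "experimental" from the base class.

    @staticmethod
    def slug() -> str:
        return "new-idea-demo"


def stable_slugs(patch_classes) -> list[str]:
    """Return the slugs of patches that are explicitly marked stable."""
    return sorted(p.slug() for p in patch_classes if p.STABILITY == "stable")
```

A client like funkwhale would then only list the patches returned by stable_slugs, and anything new stays hidden until it is explicitly marked stable.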

As a matter of fact, since we planned to use troi as our recommendation system, we are dependent on how troi works. But if you don’t want to make troi stable for third-party users, it’s not a problem; I can drop the troi implementation, as I said. I just need to know. And if troi is not the way to get LB recommendations, I suppose we can consider the Recommendations — ListenBrainz 0.1.0 documentation endpoints to be stable?