GSoC Pre-Proposal: Integrate Remote Playback and Listening Now
Organization: MetaBrainz Foundation
Project: ListenBrainz Android
Applicant: Anuj
Mentor: jasje
Estimated Hours: 175
Difficulty: Hard
About
The ListenBrainz Android app has two incomplete features. First, Listening Now, which shows what the user is playing right now, was removed from the UI at some point, even though the API integration is still in the codebase. Second, Remote Playback through Spotify and YouTube can only play one track: there is no queue and no auto-advance. My pre-proposal is to fix both with a shared KMP module that watches all three playback sources (BrainzPlayer, the LB WebSocket, and the Remote Playback queue) and decides what the mini player should show. The manager is a ListeningNowViewModel in commonMain that outputs one PlaybackUiState at a time; the UI just renders whatever state it gets. Device detection (figuring out whether a Listening Now event came from this phone or some other device) is done by comparing the incoming WebSocket event against what ListenSubmissionService just submitted. Spotify SDK and YouTube code stays in androidMain behind an expect/actual interface so commonMain stays clean for iOS later.
Current State of the Project
The shared KMP module already has commonMain, androidMain, and iosMain set up, with a repository layer, model classes, and preferences already there. This is where the new code will go. The Spotify SDK is already in its own Gradle module called spotify-app-remote, with spotify-app-remote-release-0.8.0.aar wired in, so that part does not need to be set up from scratch.
Currently, if a track finishes during Remote Playback, nothing happens. There is no queue, and the app has no way to know when a track ends. The LB Socket API integration exists in the code, but the Listening Now UI is gone, so the user never sees any of it. ListenSubmissionService already sends scrobbles to the LB server whenever this device plays something; that is the key signal for detecting whether a Listening Now event came from this phone or another device. BrainzPlayer works fine, and its mini-player behavior should not break.
Problem Statement
Right now the app has three separate things going on (BrainzPlayer, Remote Playback, and the server-side Listening Now signal) and nothing connects them. There is no single place that decides which one matters at a given moment. As a result, Listening Now never appears in the UI, and the app cannot tell whether you are listening on this phone or on your laptop. The mini-player has no way to show the right thing because nothing tells it what the right thing is.
Proposed Work
Build a ListeningNowViewModel in commonMain that watches all three playback sources and produces a single PlaybackUiState. Add a queue manager in androidMain that feeds songs from LB playlists to Spotify one at a time and detects when a track ends. Add a ListeningNowRepository in commonMain that maintains a Ktor WebSocket connection to the LB server and handles reconnection. Add device detection using ListenSubmissionService. Wire everything through Koin.
Architecture and Technical Approach
PlaybackUiState
All playback states are represented as a sealed class in commonMain:
sealed class PlaybackUiState {
    object Idle : PlaybackUiState()
    data class BrainzPlayerActive(val track: Track) : PlaybackUiState()
    data class RemotePlaybackActive(val track: Track, val queue: List<Track>) : PlaybackUiState()
    data class ListeningNowOtherDevice(val track: Track) : PlaybackUiState()
}
The UI layer makes no decisions. It gets one of these four states and renders it. All the logic is in the ViewModel.
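To illustrate how thin the UI layer stays, a sketch of the rendering step as a single exhaustive when over the four states. The types here are simplified stand-ins for the real commonMain classes, and the label strings are placeholders, not the app's actual copy:

```kotlin
// Simplified stand-ins for the commonMain types so this sketch compiles standalone.
data class Track(val title: String, val artist: String)

sealed class PlaybackUiState {
    object Idle : PlaybackUiState()
    data class BrainzPlayerActive(val track: Track) : PlaybackUiState()
    data class RemotePlaybackActive(val track: Track, val queue: List<Track>) : PlaybackUiState()
    data class ListeningNowOtherDevice(val track: Track) : PlaybackUiState()
}

// The mini player maps each state to something displayable; no decisions here.
fun miniPlayerLabel(state: PlaybackUiState): String = when (state) {
    PlaybackUiState.Idle -> "Nothing playing"
    is PlaybackUiState.BrainzPlayerActive -> "Playing: ${state.track.title}"
    is PlaybackUiState.RemotePlaybackActive -> "Queue (${state.queue.size}): ${state.track.title}"
    is PlaybackUiState.ListeningNowOtherDevice -> "On another device: ${state.track.title}"
}
```

Because the sealed class makes the when exhaustive, adding a fifth state later forces every render site to handle it at compile time.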
ListeningNowViewModel in commonMain
This ViewModel extends the KMP ViewModel from org.jetbrains.androidx.lifecycle:lifecycle-viewmodel and combines three flows:
combine(
    brainzPlayerRepository.playerState,
    listeningNowRepository.listeningNowFlow,
    remotePlaybackController.queueState
) { playerState, listeningNow, queue ->
    resolveUiState(playerState, listeningNow, queue)
}.stateIn(viewModelScope, SharingStarted.WhileSubscribed(5000), PlaybackUiState.Idle)
resolveUiState applies the priority order from the project description: BrainzPlayer active wins; otherwise check Remote Playback queue; otherwise check Listening Now and determine whether it came from this device.
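A minimal sketch of that priority logic, using simplified stand-in types (the real function takes the repositories' actual state classes, and fromThisDevice stands in for the device-detection check):

```kotlin
data class Track(val title: String)

sealed class PlayerState {
    object Inactive : PlayerState()
    data class Active(val track: Track) : PlayerState()
}

sealed class RemotePlaybackState {
    object Idle : RemotePlaybackState()
    data class Playing(val track: Track, val queue: List<Track>) : RemotePlaybackState()
}

// fromThisDevice carries the result of the device-detection comparison.
data class ListeningNowEvent(val track: Track, val fromThisDevice: Boolean)

sealed class PlaybackUiState {
    object Idle : PlaybackUiState()
    data class BrainzPlayerActive(val track: Track) : PlaybackUiState()
    data class RemotePlaybackActive(val track: Track, val queue: List<Track>) : PlaybackUiState()
    data class ListeningNowOtherDevice(val track: Track) : PlaybackUiState()
}

fun resolveUiState(
    player: PlayerState,
    listeningNow: ListeningNowEvent?,
    remote: RemotePlaybackState
): PlaybackUiState = when {
    // 1. Local BrainzPlayer playback always wins.
    player is PlayerState.Active -> PlaybackUiState.BrainzPlayerActive(player.track)
    // 2. Then an active Remote Playback queue.
    remote is RemotePlaybackState.Playing ->
        PlaybackUiState.RemotePlaybackActive(remote.track, remote.queue)
    // 3. Then Listening Now, but only if it came from another device;
    //    events from this phone are already covered by the local sources.
    listeningNow != null && !listeningNow.fromThisDevice ->
        PlaybackUiState.ListeningNowOtherDevice(listeningNow.track)
    else -> PlaybackUiState.Idle
}
```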
Ktor WebSocket for Real Time Listening Now
ListeningNowRepository lives in commonMain and uses Ktor’s WebSocket client to stay connected to the LB Socket API. The connection is tied to viewModelScope so it opens when the ViewModel is created and closes automatically when cleared. Reconnection uses exponential backoff:
fun observeListeningNow(): Flow<ListeningNowEvent> = flow {
    client.webSocket(LB_SOCKET_URL) {
        for (frame in incoming) {
            if (frame is Frame.Text) emit(parseEvent(frame.readText()))
        }
    }
}.retryWhen { _, attempt ->
    delay(minOf(2.0.pow(attempt.toInt()).toLong() * 1000, 30_000L))
    true
}
RemotePlaybackController expect/actual Boundary
The interface is in commonMain. Android implementations are in androidMain. iOS gets a stub.
// commonMain
interface RemotePlaybackController {
    val queueState: StateFlow<RemotePlaybackState>
    suspend fun playPlaylist(tracks: List<Track>)
    suspend fun skipToNext()
}

expect fun createRemotePlaybackController(): RemotePlaybackController
androidMain provides two implementations: SpotifyRemotePlaybackController built on the existing spotify-app-remote module, and a YouTube-backed controller (see Hurdle 2). A factory picks the right one based on the playlist’s track sources.
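A hypothetical sketch of the factory's selection policy. The real factory would return RemotePlaybackController instances, and the policy for mixed-source playlists is open for mentor discussion; here the majority source wins, which is one possible choice:

```kotlin
// Hypothetical names: TrackSource and QueuedTrack are illustration-only types.
enum class TrackSource { SPOTIFY, YOUTUBE }

data class QueuedTrack(val title: String, val source: TrackSource)

// Pick one backend for the whole playlist: the most common track source.
fun selectBackend(tracks: List<QueuedTrack>): TrackSource {
    require(tracks.isNotEmpty()) { "cannot pick a backend for an empty playlist" }
    return tracks.groupingBy { it.source }
        .eachCount()
        .maxByOrNull { it.value }!!   // non-null: tracks is non-empty
        .key
}
```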
Queue Manager and Track-End Detection
The queue manager holds an index into the current LB playlist. The Spotify SDK has no track-end callback, so the app has to infer it from PlayerState. When isPaused == true and playbackPosition == 0L, either the track has ended or the user paused at position zero. The queue manager tracks lastKnownPosition to tell these apart:
// androidMain — inside SpotifyRemotePlaybackController
spotifyAppRemote.playerApi.subscribeToPlayerState().setEventCallback { state ->
    // Only treat "paused at zero" as a track end after real playback;
    // the threshold reasoning is covered in Hurdle 1.
    if (state.isPaused && state.playbackPosition == 0L &&
        lastKnownPosition > MINIMUM_PLAYBACK_THRESHOLD_MS
    ) {
        scope.launch { playNext() }
    }
    lastKnownPosition = state.playbackPosition
}
Device Detection via ListenSubmissionService
The LB server does not tell you which device submitted a listen. To figure out if a Listening Now event came from this phone, the app keeps a LastSubmittedTrackCache (a StateFlow&lt;SubmittedTrack?&gt;) that ListenSubmissionService updates every time it submits a scrobble. When a Listening Now event arrives, it is compared against the cache:
fun isThisDevice(event: ListeningNowEvent, cache: LastSubmittedTrackCache): Boolean {
    val last = cache.lastSubmitted ?: return false
    val timeDiff = abs(event.timestamp - last.timestamp)
    return event.trackName == last.trackName &&
        event.artistName == last.artistName &&
        timeDiff < DEVICE_DETECTION_THRESHOLD_MS
}
The tolerance window and why it is set to 10 seconds is explained in the hurdles section.
Koin Wiring
val playbackModule = module {
    single<ListeningNowRepository> { ListeningNowRepositoryImpl(get()) }
    single { LastSubmittedTrackCache() }
    viewModel { ListeningNowViewModel(get(), get(), get()) }
}
The RemotePlaybackController actual is provided in a separate androidMain Koin module.
Known Hurdles and Edge Cases
Hurdle 1 Spotify SDK Has No Track End Event
Spotify App Remote SDK 0.8.0 gives you PlayerState callbacks (track URI, position, pause state) but nothing that says “this track just finished.” You have to infer it.
The obvious guess is isPaused == true && playbackPosition == 0L. The problem is this also fires when the user manually pauses right at the start of a track. To filter that out, the queue manager tracks lastKnownPosition from the previous callback. A track end is only accepted when all three are true:
- isPaused == true
- playbackPosition == 0L
- lastKnownPosition > MINIMUM_PLAYBACK_THRESHOLD_MS (3 seconds)
This still has one known false case: if the user plays for more than 3 seconds, seeks back to the start, and pauses exactly at position zero, the queue manager will call it a track end. This is a limitation of the SDK.
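Keeping the three-condition guard as a small pure function makes it easy to unit-test, including the false case above (names match the conditions listed, but the exact form is a sketch):

```kotlin
const val MINIMUM_PLAYBACK_THRESHOLD_MS = 3_000L

// True only when a pause at position zero follows more than the threshold of
// real playback; a user pausing at the very start of a track is rejected.
fun isTrackEnd(isPaused: Boolean, playbackPosition: Long, lastKnownPosition: Long): Boolean =
    isPaused && playbackPosition == 0L && lastKnownPosition > MINIMUM_PLAYBACK_THRESHOLD_MS
```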
There is also the case where Spotify is not installed at all. SpotifyAppRemote.connect() will fail, and RemotePlaybackState needs to move to Unavailable so the UI can show something useful instead of just breaking silently.
Hurdle 2 YouTube Has No App Remote SDK
Spotify lets you open a direct connection to the Spotify app and send play commands to it. YouTube has nothing like this. There is no official way to tell the YouTube app to play a specific video from outside.
The approach I am considering is using the android-youtube-player library by Pierfrancesco Soffritti (com.pierfrancescosoffritti.androidyoutubeplayer:core:13.0.0). It wraps YouTube’s IFrame Player API inside a WebView, is available on Maven Central, needs no API key, and has no quota limits. It also gives a clean PlayerConstants.PlayerState.ENDED callback through YouTubePlayerListener, so track end detection for YouTube is straightforward no position polling needed, unlike Spotify.
The concern is that this plays YouTube videos inside the app via a WebView rather than delegating to the YouTube app. Background playback without registering YouTubePlayerView as a lifecycle observer is technically possible through the library, but the library’s own documentation warns this violates YouTube ToS for Play Store published apps.
I want to confirm two things with the mentor:
- Is this the correct library for playing YouTube videos?
- Is it okay to use it for background playback in this app?
Hurdle 3 Device Detection Timestamp Window
When a Listening Now event comes in over the WebSocket, its timestamp is when the LB server received the scrobble, not when playback started on the device. Three delays stack between the moment a track starts on this phone and the moment the WebSocket event arrives:
- Network latency for the scrobble going from device to server
- Server processing before the Listening Now state updates
- WebSocket push from server back to the app
On a good connection this is under a second; on a slow one it can be several seconds. The 10-second window in isThisDevice() is meant to cover the realistic range.
The window creates a false-positive case, though. If the user has two devices and both start playing the same song within 10 seconds of each other, the detection logic will incorrectly say both are this device. The server does not track which device submitted a listen, so there is no clean way to resolve this. The 10-second window is a conscious trade-off: a tighter window cuts false positives but starts failing on slow connections. It can be made configurable if needed.
There is also clock skew. If the device clock is significantly off from the server clock, the timestamp comparison always fails regardless of the window. The fix is to measure time elapsed since the local submission was sent, and compare that against the difference between now and the event timestamp rather than comparing the two timestamps directly. This makes the comparison relative rather than absolute.
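One way to realize that relative comparison (all names here are hypothetical): estimate the server-clock offset from the echo of one of our own submissions, then correct local timestamps by that offset before applying the window:

```kotlin
import kotlin.math.abs

// Hypothetical sketch: the estimator learns serverTime - localTime from a
// round trip of our own submission, then applies it to later comparisons,
// so a device clock that is minutes off no longer breaks detection.
class ClockOffsetEstimator {
    private var offsetMs: Long = 0L  // last observed serverTime - localTime

    // Call when an incoming event has been confirmed as our own submission:
    // server timestamp minus local submit time approximates skew plus latency.
    fun update(serverTimestampMs: Long, localSubmittedAtMs: Long) {
        offsetMs = serverTimestampMs - localSubmittedAtMs
    }

    // Skew-corrected window check for the device-detection comparison.
    fun withinWindow(
        serverTimestampMs: Long,
        localTimestampMs: Long,
        windowMs: Long = 10_000L
    ): Boolean = abs(serverTimestampMs - (localTimestampMs + offsetMs)) < windowMs
}
```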
Hurdle 4 WebSocket State After Reconnection
The WebSocket drops when the network changes or the app goes to background. When it reconnects, the server pushes the current Listening Now state right away. But during the gap, the user may have stopped playing or switched tracks on another device. The last known local state could be wrong.
So on every reconnect, ListeningNowRepository makes one REST call to fetch the current Listening Now state before letting the WebSocket stream take over. This covers the gap. REST for the initial state on connect, WebSocket for updates after that; this is how the LB Socket API is designed to be used anyway.
Hurdle 5 Race Condition at Startup in resolveUiState
combine waits for all three flows to emit at least once before it produces anything. At startup, BrainzPlayer state might emit immediately while the WebSocket connection is still opening. This would leave the ViewModel waiting and the UI stuck.
The fix is to give each source a default initial value so combine can emit on the first tick without waiting:
- brainzPlayerRepository.playerState starts as Inactive
- listeningNowRepository.listeningNowFlow starts as null
- remotePlaybackController.queueState starts as RemotePlaybackState.Idle
With these in place, resolveUiState has something to work with from the first emission and returns PlaybackUiState.Idle until real data comes in. That is the right behavior.