FilmBrainz ideas thread (i.e. MovieBrainz, VideoBrainz, etc.)

I’ve been working a lot with videos lately and building up my own library, and I’ve had some thoughts about FilmBrainz I’d like to share~ (feel free to use these, as well as any from my posts, of course)

first off, I feel like the data structure should be somewhere between BookBrainz and MusicBrainz, since we probably want Works, Videos (equivalent to MusicBrainz recordings), Releases, Release Groups, and Actors (equivalent to Authors and Artists).
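to make that a bit more concrete, here's a rough sketch of how those entities could hang together. every class and field name here is just my guess for illustration (nothing official), and the release entry in the sample data is made up:

```python
from dataclasses import dataclass, field


@dataclass
class Actor:
    name: str


@dataclass
class Work:
    title: str
    work_type: str  # e.g. "film", "short", "documentary"


@dataclass
class Video:
    """Rough equivalent of a MusicBrainz recording: one concrete cut of a work."""
    title: str
    work: Work
    cast: list[Actor] = field(default_factory=list)


@dataclass
class Release:
    title: str
    release_type: str  # e.g. "theatrical", "home video"
    videos: list[Video] = field(default_factory=list)


@dataclass
class ReleaseGroup:
    title: str
    releases: list[Release] = field(default_factory=list)


# illustrative data only; the home video release below is hypothetical
weird = Work("Weird", "film")
cut = Video("Weird", weird, cast=[Actor("Daniel Radcliffe")])
group = ReleaseGroup("Weird", [Release("Weird (home video)", "home video", [cut])])
```

the point is mostly that a release holds videos, and each video points back at one work, so the work/video level stays the anchor for everything else.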

I believe the video or work level should be the primary level, since releases are generally a bit less important in other databases

I think it might be a good idea to call the release and release group levels something else, since we already have different names between MB and BB, and that might make it easier in the future when talking inter-database, but not essential (especially since we can’t say the same for works). that said, the label/publisher entity can be called Studio, methinks


I feel like characters might be more important in FilmBrainz than they are in MusicBrainz, and they shouldn’t be forgotten. could handle them similarly to MusicBrainz in that they’re a type of Actor (which might get confusing, but it’s probably fine). this would especially work for when an actor plays another person, such as for a biopic (like Daniel Radcliffe playing “Weird Al” in Weird or Austin Butler playing Elvis in Elvis, etc.)


I don’t know how to work series yet, since you can have stuff like TV series (Doctor Who, Game of Thrones, etc.) which is handled one way in other databases, but you can also have film series (Star Wars, The Avengers, the whole Marvel Cinematic Universe, etc.), which is handled differently in other databases. I think we can keep it simpler with a single series entity for both types of series, perhaps with several series types.

that said, we could handle TV series-type series like we’d handle films, since most of the time there’s a core cast in most episodes (wouldn’t apply to all series, of course… Twilight Zone and Black Mirror come to mind, as well as Doctor Who)

one important thing on series, I believe we should allow for multiple default orders, such as production order, release order, story order, and more, as well as alternate names for seasons (for example, chapters, books, and series for My Little Pony: Make Your Mark, Avatar: The Last Airbender, and Doctor Who, respectively)
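a minimal sketch of what I mean by multiple orders, assuming a series just stores one list per named order (the `Series` class, `season_label`, and the order names are all made up for the example):

```python
from dataclasses import dataclass, field


@dataclass
class Series:
    name: str
    season_label: str = "season"  # could be "chapter", "book", "series", ...
    # order name -> titles in that order
    orders: dict[str, list[str]] = field(default_factory=dict)

    def position(self, order, title):
        """1-based position of `title` within the named order."""
        return self.orders[order].index(title) + 1


star_wars = Series(
    "Star Wars",
    orders={
        "release": [
            "A New Hope", "The Empire Strikes Back", "Return of the Jedi",
            "The Phantom Menace", "Attack of the Clones", "Revenge of the Sith",
        ],
        "story": [
            "The Phantom Menace", "Attack of the Clones", "Revenge of the Sith",
            "A New Hope", "The Empire Strikes Back", "Return of the Jedi",
        ],
    },
)

# Doctor Who calls its seasons "series"
doctor_who = Series("Doctor Who", season_label="series")
```

so `star_wars.position("release", "The Phantom Menace")` and `star_wars.position("story", "The Phantom Menace")` give different numbers for the same film, and neither order has to be "the" order.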


for release types, I see a few main categories:

  • theatrical releases, for films and shows released in cinemas (possibly including other screenings, like film festivals and conventions, but it might be good to treat these separately for original release date purposes)
  • home video release, your typical DVD, Blu-ray, VHS, or digital release, might or might not include streaming releases here
  • television/broadcast releases, more important for shows, but there are made-for-TV movies too. probably should focus on original releases, but reruns could count too, I’ve got nothing against that
  • streaming releases, could be included under home releases, important for stuff like Netflix originals as well as YouTube videos. like the above, rereleases might or might not be included, especially with how much movies move around between streaming services these days
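
as a sketch of why treating festival screenings separately might matter, here's one way the original release date could be worked out. the type names, the `ReleaseEvent` class, and the dates are all invented for the example:

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class ReleaseType(Enum):
    THEATRICAL = "theatrical"
    FESTIVAL = "festival screening"
    HOME_VIDEO = "home video"
    BROADCAST = "television/broadcast"
    STREAMING = "streaming"


@dataclass
class ReleaseEvent:
    release_type: ReleaseType
    released: date


def original_release_date(events, include_festivals=False):
    """Earliest release date, skipping festival screenings unless asked."""
    candidates = [
        e.released
        for e in events
        if include_festivals or e.release_type is not ReleaseType.FESTIVAL
    ]
    return min(candidates) if candidates else None


# hypothetical film: festival premiere, then cinemas, then home video
events = [
    ReleaseEvent(ReleaseType.FESTIVAL, date(2020, 1, 15)),
    ReleaseEvent(ReleaseType.THEATRICAL, date(2020, 6, 1)),
    ReleaseEvent(ReleaseType.HOME_VIDEO, date(2020, 10, 20)),
]
```

with festivals as their own type, either answer is one flag away, instead of being baked into the data.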

for images, in addition to having a new Film Art Archive with Internet Archive (for release scans and whatnot), we could possibly partner with Fanart.tv for extra art? like logos, backdrops, all the extra images you might need when tagging movies and shows for a media server/video player


for work types, some I can think of off the top of my head:

  • film, for long form story driven videos
  • short, for short form story driven video
  • documentary, for medium to long form video about (a) specific topic(s), often including interviews and archival footage. could include short form video too
  • video essay, for shorter form video about a specific topic, usually by a single person, but maybe not always. could also include longer form video essays
  • song, for music videos
  • musical, for stage performances and possibly musical movies?
  • trailer, for promoting some other type, such as a film or TV show
  • bonus content (needs a better name), for interviews, behind the scenes, featurettes, deleted scenes, other bonus content typically included on home releases. could be split into multiple types

film versions are pretty important in film (directors cuts, censored/uncensored versions, international versions, dubs, whatever George Lucas did to Star Wars), so I think having relationships for these is pretty important early on (some on the work level and some on the video level). some good examples are

  • Deadpool 2, which has at least 3 main version groups, including the theatrical release, and also
    • Once Upon a Deadpool, a PG-13 edit of Deadpool 2 with added scenes with Deadpool telling the story to Fred Savage in reference to The Princess Bride
    • Deadpool 2: Super Duper Cut, adds and extends several scenes
  • My Little Pony: Friendship is Magic S2E14 “The Last Roundup” has a scene that was edited with different voiceover for a whole character after some controversy
  • Dark Midnight: Doll of the Dead, a now unlisted FilmCow video which had a scene changed to remove a somewhat racist stereotype of a Chinese person (which to be fair was in parody of old 50s shows), replacing him with a mysterious cloaked figure and also a ghost. also some minor dialogue changes
  • Scooby Doo 2: Monsters Unleashed has an international version where a Burger King logo was changed to KFC (and perhaps other changes). probably a fairly common occurrence
  • Star Wars, too many versions to list, most famously changing the order in which Han and Greedo shoot each other, but also inserting new actors in place of the old ones (namely Anakin Skywalker’s force ghost and Emperor Palpatine in several scenes)
  • Kung Fu Panda had an entirely new version created for the Chinese market, in which the lips were changed to match the Chinese dub of the movie
  • Zootopia famously has a different newscaster based on the region the film was released in (a moose for North America, a tanuki for Japan, a koala for Australia & New Zealand, etc.)
  • Clue was famously released into theaters with three different endings, so if you and a friend saw it in different theaters, you might have seen a different ending. on the home video release, all three endings are included back to back in that cut

I’m wondering if perhaps we want to have video groups as well, as to group different versions, cuts, and variants together (tho in most cases, this could be achieved through simple video-video relationships too… not sure how to handle Clue tho…)
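
for what it's worth, a video group could probably be derived from the relationships rather than stored. a minimal sketch, assuming relationships are plain (video, type, video) triples and with relationship type names I just made up:

```python
from collections import defaultdict

# (video, relationship type, video); the type names are invented
relationships = [
    ("Once Upon a Deadpool", "edit of", "Deadpool 2"),
    ("Deadpool 2: Super Duper Cut", "extended cut of", "Deadpool 2"),
]


def version_group(video, rels):
    """Return every video reachable from `video` through version relationships."""
    neighbours = defaultdict(set)
    for a, _rel_type, b in rels:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, stack = {video}, [video]
    while stack:
        for other in neighbours[stack.pop()]:
            if other not in seen:
                seen.add(other)
                stack.append(other)
    return seen
```

starting from any of the three Deadpool 2 versions gives the same set back. this doesn't solve Clue, where the theatrical endings and the combined home video cut probably need something more than a pairwise link.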


feel free to comment below on my ideas or leave your own~

7 Likes

https://thetvdb.com

I use the above for my KODI media centre. You’ll need to dig into how they work and find their pros and cons before trying to reinvent the wheel. How will you make FilmBrainz different enough to draw in the editors you need?

The main issue I see is resources. The sheer cost of the servers and developing another system that competes with the more established names.

Search the forum for a few older discussions on the topic:

5 Likes

this is true, most people probably categorize movies, TV, and other videos differently than music or books…

I know it might be a while before FilmBrainz gets started, but I would love to see it, both for the open data and for the more open mindset. for example, I know TMDB doesn’t allow YouTube videos (in general), which is an area I’m quite interested in documenting. while IMDb does (at least to some extent), I don’t think their data is very open to everyone (for example, Jellyfin doesn’t pull metadata from IMDb. I could be mistaken on this point tho)

that said, I am happy to see Once Upon a Deadpool got its own entry at TMDB, tho the same probably can’t be said for the other examples I gave… (disclaimer, I haven’t looked at all of them yet)

I did also find OMDb API, which seems to (perhaps unofficially) provide movie information from IMDb, which I think Jellyfin does in fact use, but it’s not a database itself, I don’t think…

for music videos, there is IMVDb, but there doesn’t seem to be much of a userbase over there (tho there are users, since there are videos from this year)… I’ve had a well-sourced video submission sitting over there for almost a year with no response (that’s part of the reason I’ve been pushing for better music video representation in MusicBrainz, as a sidenote)

(another sidenote on the name, which may have been said before, I feel like we shouldn’t go with MovieBrainz, as that’d be three separate projects with MB as the obvious abbreviation, even tho MetaBrainz is typically MeB. I’d prefer FilmBrainz or VideoBrainz)

4 Likes

As far as I remember they don’t even allow music videos at all?

And TVDB is quite unpleasant to deal with (last I saw, they have a system where specific people can lock a show so no one else can edit it, and those people tend to have their own ideas about how things should be).

Last I tried to use IMVDb they had a system where all edits need to be approved and literal months went by without mine getting approved. And from what I recall they track minimal information about the videos.

There simply isn’t a site that comes close to MusicBrainz for video currently, so I am very in favour of a VideoBrainz.

And as for getting it started, there are dumps from other video sites (though some might be old) that could be imported by a bot to kick-start the site, at least with basic information.

And just imagine being able to look at a movie or TV show and seeing all the music linked to it. And all the related books. Having a series of brainz sites could offer integration that you can’t get with others. Which would then offer an even better experience for media software like Plex/Jellyfin/Kodi as they could then link all of your media. Look up a person and see all the music, books, movies, tv shows they’ve been in/released/written/directed/whatever in one place.

4 Likes

These are all nice ideas.

What I don’t see: Where do you get all this information?
If you say “IMDb” or “TMDB”: why should you copy this existing information?
If you say “my own DVD or Blu-ray cover”, then you never get information about streaming-only movies or movies that just don’t exist on silver discs, for example.
If you say “I transcribed and OCR’d them from the end credits”, then I would say: RESPECT :wink:

2 Likes

the same could be said about MusicBrainz data, for what it’s worth (I’ve actually been copying booklet credits with OCR as of late). I don’t see any particular issues with copying data from other databases, since data usually can’t be copyrighted (at least to my knowledge)


also want to add a great use case example: YouTube series. they can get quite complicated, and a MetaBrainz approach can make it easier to understand (with the right data display, of course). some examples:

Game Grumps is a gameplay channel on YouTube, with several series (Game Grumps, Steam Train, Guest Grumps, Game Grumps Animated etc.) and several sub series within them (mostly playthroughs, like Breath of the Wild, Super Mario Maker, etc.) and some one-off episodes. this could easily be handled with series and sub series

Game Grumps [series]
    Game Grumps: Breath of the Wild [series]
        Breath of the Wild, Part 1: Shirtless Hero [video]
        Breath of the Wild, Part 2: Cooking Up a Storm [video]
        etc.
    Game Grumps: Resident Evil VILLAGE [series]
        Resident Evil VILLAGE, Part 1: Worst Dad Simulator EVER [video]
        Resident Evil VILLAGE, Part 2: Should we be drinking pineapple juice? [video]
        etc.
    Game Grumps: PT [video]
    etc.

a second good example is Extra History, a YouTube channel which does multipart series on various historical figures and events. this is handled with a season per subseries in TMDB and TheTVDB

Extra History [series]
    Extra History: World War I: The Seminal Tragedy [series]
        World War I: The Seminal Tragedy, part 1: The Concert of Europe [video]
        World War I: The Seminal Tragedy, part 2: One Fateful Day in June [video]
        etc.
    Extra History: Operation Avalanche [series]
        Operation Avalanche, part 1: The Forgotten D-Day [video]
        Operation Avalanche, part 2: D-Day Nearly Fails [video]
    etc.

there are many other similar examples I could give, like OverSimplified, Sam O’Nella Academy, Minute Physics, the many series of Adam Neely, Polyphonic, Technology Connections, and more, some more complicated, some less, but 2 is probably good enough for now
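
the series/sub series nesting above can be sketched with nothing more than a parent link per entry. the `Node` class and `breadcrumb` helper are made up for illustration:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    kind: str  # "series" or "video"
    parent: Optional["Node"] = None


def breadcrumb(node):
    """Walk up the parent links and return the full path as one string."""
    parts = []
    while node is not None:
        parts.append(node.name)
        node = node.parent
    return " > ".join(reversed(parts))


grumps = Node("Game Grumps", "series")
botw = Node("Game Grumps: Breath of the Wild", "series", grumps)
part1 = Node("Breath of the Wild, Part 1: Shirtless Hero", "video", botw)
# one-off episodes sit directly under the top-level series
pt = Node("Game Grumps: PT", "video", grumps)
```

a one-off like `pt` and a playthrough episode like `part1` are the same kind of entry, they just sit at different depths, so no fake "season" is needed for either.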

3 Likes

not to keep digging up the past but I’ve been spending a good few hours of my idle time thinking about this…

what do we think would be needed to get this off the ground :thinking:

3 Likes

if memory serves, I believe the main thing is a developer or three who want to pick it up as a project. beyond that, I’m not certain…

5 Likes

yeah that’s what I was thinking - sadly my development skills are rather shabby and unlikely to be anywhere close to strong enough to support such a thing… I’m keen on contributing towards such a project if there are any interested devs reading this :stuck_out_tongue:

4 Likes

This would be great.
Discogs had Filmogs, which was unique in that it listed actual releases. It was a great tool if you wanted to check whether an HD version exists of that great but rather obscure Italian horror movie.
But they closed it.

IMDb & TMDB list the films, but not the releases.

I mean I would surely be interested in contributing to such a project, assuming it is coded in a reasonable language, but I don’t have the bandwidth to lead development of another project in addition to Harmony, my other smaller *Brainz tools and my BB contributions.

It would be great if the project would not be just another *Brainz DB project in a different (backend) programming language (MB in Perl, BB in JavaScript/TypeScript), but a framework which is designed to be usable for other entity types and not just videos…
(I’ve heard rumours that this once was the plan for BB, but apparently that didn’t become reality, so maybe it wasn’t technically feasible/manageable.)
Edit: This would be useful to avoid having many databases which all have a “person” entity type. MB and BB are at least sharing areas, but have separate Artist and Author entities which sometimes overlap. Not even sure how to call this entity type in VideoBrainz… Actor? Artist? Or just Person?

In 2021/2022 I had started a personal project to achieve that, but I never got much further than designing and iterating on the database schema and modeling basic concepts in TypeScript.
It was based on the BB schema where all entity revisions are preserved, but further generalized it to the point where all entity types share the same DB tables by default.
Additionally I experimented with the implementation of n-ary relationships, which allow you to create relationships like person P played role R in video V (where P, R, and V are all proper entities with UUID – unlike MB, where the role is text only on the binary “vocals” rel).
If there were other people interested in continuing work on such a UniversalBrainz DB model, I could probably review the files which I have and upload them.
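
To illustrate the n-ary relationship idea described above, here is a minimal sketch in Python. It is not taken from the personal project; the `Entity` and `Relationship` classes, the slot names, and the `"played"` type are assumptions made up for this example.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4


@dataclass(frozen=True)
class Entity:
    kind: str  # "person", "character", "video", ...
    name: str
    id: UUID = field(default_factory=uuid4)  # every participant is a proper entity


@dataclass
class Relationship:
    rel_type: str
    slots: dict[str, Entity]  # slot name -> participating entity


def roles_in(video, relationships):
    """List (person, role) pairs for everyone who played a role in `video`."""
    return [
        (rel.slots["person"].name, rel.slots["role"].name)
        for rel in relationships
        if rel.rel_type == "played" and rel.slots["video"] == video
    ]


radcliffe = Entity("person", "Daniel Radcliffe")
weird_al = Entity("character", "Weird Al")
weird = Entity("video", "Weird")
butler = Entity("person", "Austin Butler")
elvis_role = Entity("character", "Elvis")
elvis = Entity("video", "Elvis")

rels = [
    Relationship("played", {"person": radcliffe, "role": weird_al, "video": weird}),
    Relationship("played", {"person": butler, "role": elvis_role, "video": elvis}),
]
```

Because the role is its own entity rather than free text, the same character could be linked across several videos or merged with a real person where that makes sense.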

6 Likes

I actually had a similar idea bouncing around, tho I was thinking it’d simply be a way to link a third database and beyond together. like you’d link a MusicBrainz artist, BookBrainz author, and VideoBrainz actor to a single MetaBrainz (or UniversalBrainz) person (or vice versa).

actually, it could be as simple as automatically creating a UniversalBrainz entity for everything in both projects and merging them if there’s interdatabase links added? (could be a bit dangerous without proper handling)
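
a rough sketch of the "merge when a link is added" idea, using a small union-find so that linked entities end up under one shared person. the `PersonLinks` class and the ID strings are completely made up:

```python
class PersonLinks:
    """Groups entities from different databases that are linked together."""

    def __init__(self):
        self.parent = {}

    def find(self, entity):
        # every entity starts out as its own group
        self.parent.setdefault(entity, entity)
        while self.parent[entity] != entity:
            # path halving keeps later lookups short
            self.parent[entity] = self.parent[self.parent[entity]]
            entity = self.parent[entity]
        return entity

    def link(self, a, b):
        """Record an interdatabase link, merging the two groups."""
        self.parent[self.find(a)] = self.find(b)

    def same_person(self, a, b):
        return self.find(a) == self.find(b)


links = PersonLinks()
# hypothetical IDs, one per database
links.link("musicbrainz:artist/example", "bookbrainz:author/example")
links.link("bookbrainz:author/example", "videobrainz:actor/example")
```

the dangerous part is that this has no undo: one wrong link silently merges two people, which is why it'd need proper handling (review, or at least a way to split again).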

I know Rate Your Music keeps one list of artists for their three projects, actually (music, film, and video games)

2 Likes

So because I’m a glutton for punishment I came up with the idea that I could possibly start to “capture” metadata in JSON files which could in theory be fed into some kind of big project later.

Using my general knowledge of home media releases, and my experiences of what we were trying to capture with Filmogs and then expanding that to what we might want to achieve to meet similar levels of metadata standards like that of MusicBrainz, I have come up with the following prototype…


So because I am obviously a masochist I decided to try and base my first example around a Disney DVD. OK it’s not the worst (I could have chosen a TV DVD boxset for example) but it’s also not the easiest, but maybe a good starting point.

You can try and look at the README.md at the top of that repo, which was a “thoughts on the back of a postcard” brain dump of what I would expect the schema to sort of look like. As you can tell by looking at the JSON not everything ended up being where I originally thought.

The JSON is structured into two main “globs” of data. The first is what I call “packaging attributes”: this is all about the release, from the title that’s printed, to the packaging it comes in, to any identifiers, etc. This data is really easy to generate when you’ve got an example in front of you. Even doing this “by hand” doesn’t take a whole lot of time.

The second “glob” of data is what I am calling “title-detail”. This is an area that I want to be as universal as possible across the different medium types, which might mean its structure could get very messy, very quickly. This is effectively what in MusicBrainz land we call the “track list”.

I want this section to logically display, in order, the segments of the audiovisual content. For analogue media this should be REALLY simple, as they’re very linear in the way they present the content. For digital media, like DVD, this is already starting to show problems.

As many of you probably know, DVDs allow you to go to any part of the content quickly via titles and chapters. This also means that when you have a multilingual release (like many DVDs are) you can present a different sequence of clips to the viewer.

In this example, at the beginning of the DVD you are greeted with a menu that asks which language you want to proceed in: English or Spanish. The selection automatically sets the audio track and changes the language of any future menus you encounter. That’s not exactly impossible to document, but it can make things look a bit odd when trying to order it all.

I think it might be better to have multiple title-detail objects, one for each “path” a viewer could choose (in this example it is just 2, but I know from memory some DVDs can have up to 8).

Then it came to what do we want to record about the title, and turns out that (at least with DVD) each title has its own chapters, can change aspect ratio, different audio tracks available, different angles (although not on this example) and different subtitles options too.

As you can see with this example, there is a mixture of content in both 1.66:1 and 4:3 ratios; again, this is fairly typical for a lot of DVDs.

Along with that, it is also entirely possible for a DVD to contain video content in colour, black and white, or sepia.
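
As a rough illustration of the “one title-detail object per path” idea, here is a sketch in Python that builds such a structure and serialises it to JSON. This is not the actual schema from the repo: apart from “packaging attributes” and “title-detail”, every key, title name, and value below is an assumption made up for the example.

```python
import json


def make_title(name, aspect_ratio, audio, subtitles, chapters, colour="colour"):
    """Build one title entry; all field names here are hypothetical."""
    return {
        "name": name,
        "aspect_ratio": aspect_ratio,
        "colour": colour,  # "colour", "black and white" or "sepia"
        "audio_tracks": audio,
        "subtitles": subtitles,
        "chapters": chapters,
    }


release = {
    "packaging attributes": {"title": "Example DVD", "format": "DVD"},
    # one title-detail object per "path" chosen at the language menu
    "title-detail": [
        {
            "path": "English",
            "titles": [
                make_title("Pre-roll trailer", "4:3", ["English"], [], 1),
                make_title("Main feature", "1.66:1", ["English"], ["English"], 12),
            ],
        },
        {
            "path": "Spanish",
            "titles": [
                make_title("Main feature", "1.66:1", ["Spanish"], ["Spanish"], 12),
            ],
        },
    ],
}


def aspect_ratios(release):
    """Collect every aspect ratio that appears anywhere on the release."""
    return sorted(
        {t["aspect_ratio"] for path in release["title-detail"] for t in path["titles"]}
    )


encoded = json.dumps(release, indent=2)
```

Keeping each path as its own ordered list avoids forcing the English and Spanish sequences into one combined order, at the cost of repeating titles that appear on both paths.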

Maybe over the next few days I might see what I could do to get things rendering into some kind of static HTML file to get a better view of the data within.

Problems off the top of my head:

  • Painstakingly slow to do this by hand. I’m not even thinking about building tools to automatically “map” out a DVD structure; just building the structure by hand takes ages. Ultimately I think I would need to build some kind of web-app based editor to create these JSON data files. As mentioned, the thought of trying this with a multi-disc TV boxset makes me want to cry!
  • Need to work out how to handle additions to the structure, it wasn’t until after I started I realised I needed a better way to lay out the audio tracks and then realised I missed out Aspect Ratio entirely

Anyway, thought I’d share my progress. Happy to hear feedback, I’m an absolute novice on this whole thing but with the way I consume a lot of physical media (certainly video media) I wanted a way that I could be capturing information and keeping it safe somewhere until I can put it into something mainly structured.

I’m no expert on any of this, to be honest I only read how to write JSON files about 4 hours ago but this is something my brain keeps wanting to think about so I gotta scratch the itch! Inb4 this turns into me running an actual database that people can contribute to :flushed:

(p.s. I 100% know that people probably want a theoretical FilmBrainz to do this and that and cook them breakfast etc. but at the moment I thought I’d start with an area that I feel people could get their teeth into)

6 Likes

I’m tempted to say that I would just forget about trying to include a “tracklist” for a medium where that might not be particularly logical - from what you’re saying it is by far the highest level of effort for low return (imo, since the “order” probably doesn’t really matter to a lot of users compared to unordered data like “includes this work/s”?)

But I also know that “our” crowd loves a data challenge. Maybe you’ve found the carrot that FilmBrainz hardcore editors can argue over for decades, keeping them around :stuck_out_tongue_winking_eye:

1 Like

Yeah I’m not sure about it either yet; there is a level of value to it, as it means you can identify an entry by what pre-roll it might have but as mentioned it can get difficult to map out when you’ve got multiple languages to support.

I’ll keep prodding away at this to see how it works with other titles.

multiple languages, commentary tracks, subtitles… there’s a lot that goes into a DVD

as an editor, I believe there is value in having a list of everything on a release, but in making a list, I suppose you do imply order where there might not be any

1 Like