MovieBrainz?

ijabz · July 15, 2016, 5:56pm

Well, I’m glad it’s useful to you; I was just saying it didn’t seem the most useful offshoot we could have had. I myself also have quite a lot of books (about 3000), but I’ve never had a problem knowing whether I already have a book. Much more of a problem is actually finding a book on a shelf that I know I do have, and BookBrainz doesn’t help me with that.

LordSputnik · July 15, 2016, 6:50pm

I’ve actually already dabbled in MovieBrainz prior to joining BookBrainz - I did a very minimal amount of work on a schema definition in https://bitbucket.org/LordSputnik/vbschema .

My current plan is to get BB out of alpha and then generalise the code to an extent where it’s fairly easy to make new, specialised *Brainz sites - VideoBrainz, PhotoBrainz (helpful for visual machine learning?) and ElectroBrainz (electronic components, manufacturers, suppliers) would be three of my favourites.

If/when this happens, creating VideoBrainz will just be a case of deciding the entities, properties and relationships, and updating some display templates.
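As a rough illustration of what "deciding the entities, properties and relationships" could look like in a generalised codebase, here is a minimal sketch. It is not taken from the BookBrainz code; the class names (`EntityType`, `RelationshipType`, `SiteSchema`) and the example entities and properties are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class EntityType:
    """A kind of thing the site stores, with its property names."""
    name: str
    properties: tuple[str, ...]


@dataclass(frozen=True)
class RelationshipType:
    """A labelled, directed link between two entity types."""
    label: str
    source: str
    target: str


@dataclass
class SiteSchema:
    """Hypothetical per-site definition: entities plus relationships."""
    site: str
    entities: dict[str, EntityType] = field(default_factory=dict)
    relationships: list[RelationshipType] = field(default_factory=list)

    def add_entity(self, name: str, *properties: str) -> None:
        self.entities[name] = EntityType(name, tuple(properties))

    def add_relationship(self, label: str, source: str, target: str) -> None:
        # Both ends must be entity types that were declared earlier.
        for end in (source, target):
            if end not in self.entities:
                raise KeyError(f"unknown entity type: {end}")
        self.relationships.append(RelationshipType(label, source, target))


# Example entities and properties are assumptions, not an agreed schema.
video = SiteSchema("VideoBrainz")
video.add_entity("Video", "title", "release_date")
video.add_entity("Person", "name")
video.add_relationship("directed", "Person", "Video")
```

With a definition like this, a new site would mostly differ in the calls at the bottom, which matches the idea that the remaining work is choosing the model and adjusting display templates.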

Hawke · July 15, 2016, 7:01pm

It would certainly be technically possible to create an ePub (etc.) tagger. Or perhaps a plugin for Calibre to retrieve/use data from BookBrainz. (Currently Calibre is pretty awful in this respect.)

thwaller · July 15, 2016, 7:17pm

@LordSputnik

The link returns: “You do not have access to this repository.”

jesus2099 · July 15, 2016, 7:19pm

For the same reasons that I use MB: to avoid buying the same edition twice (it happens), and to discover stuff related to the things I know and like.

LordSputnik · July 15, 2016, 7:34pm

Fixed now, was private. I defined some entities in there and some pseudo-RDF tables.

thwaller · July 16, 2016, 2:31am

@LordSputnik - how do people collaborate on topics like this? I may not have a ton of time, but I can contribute to what you have already started.

Freso · July 16, 2016, 5:44am

HollywoodBrainz would be worse than MovieBrainz. A minority of video produced is actually from California, or even the US, even if it may not seem like it from the Western English speaking world.

I think I’m leaning towards VideoBrainz, which additionally uses a 2-letter abbreviation we do not have in use just yet.

A lot of relationships pertinent to music videos are not really directly relevant to MusicBrainz though: non-musician actors and extras, dancers, choreographers, video technicians/engineers (lighting, video editing, …), directors, script writers, scenographers and prop makers, … - basically a lot of roles that are 2nd nature to movie, tv series, and other video production.

You can always link them up. You can currently link to MusicBrainz entities from BookBrainz entities, and to BookBrainz entities from MusicBrainz entities. (Just like we can link to IMDb, Wikidata, and a bunch of other sites/databases.)

Freso · July 16, 2016, 5:50am

Made by @stanislas for Google Code-in 2015.

LordSputnik · July 19, 2016, 7:37pm

Note that Stanislas’ Calibre plugin is currently broken due to us closing down the old web service. However, it worked well when the WS was around and functionality should be restored as the new API is built over the next couple of months.

dukeja · July 25, 2016, 5:00pm

I was reluctant to jump in on this conversation since I don’t see how I can contribute to this effort in any substantive way; but it’s something I would very much like to see happen.

I’ve been a user of My Movies (http://www.mymovies.dk/) for many years and their database does seem to be fairly high quality by my estimation. It has one obvious flaw, however, in that it isn’t open and drawing upon their data might be problematic. I point to them just to add them to this conversation - they have done some good work that the Brainz community might want to keep in mind. Personally I’d like to move away from them - but for my purposes they’re the best database out there.

I like much of what the Brainz community has done from a data modeling perspective much better; and I like the Brainz community’s approach to contributions better as well.

Regarding some other things discussed here…

I definitely see the need for relationships between the new “VideoBrainz” database and other existing services. A few examples off the top of my head:

- Soundtracks (composed for the Movie - John Williams anyone?; or popular music used - “The Graduate”?? or just about anything from Stanley Kubrick)
- There would be significant sharing of Artists in Videos and Music
- Movies are often based on Books - and sometimes books are based on, or extend the story of, Movies
- What about the idea of “Plays”? Under books???

dukeja · July 26, 2016, 6:07pm

So if you’re interested in leading a project, definitely drop by the MetaBrainz IRC meetings and talk about it.

Ok - I can’t stop thinking about this. I totally understand that to really make this happen, someone has to stand up and lead a project to get it done. I really don’t think that’s me - but no one else seems to be standing up so…

Is IRC really the way to check out what the next steps are? I’m really reluctant to foist my noobish self upon all the cool kids who hang out on IRC.

thwaller · July 26, 2016, 6:20pm

@dukeja I am willing and able to help, but I am unable to give what I believe is enough time to be the full-time lead. My assumption could be wrong, but I am just as willing to work together if you are.

dukeja · July 26, 2016, 6:58pm

Frankly, I’m in the same boat. But I want to do something. A big part of this is that I don’t know what it takes to get a project up and going and to keep it going in MetaBrainz. What I’d really like to do is start moving the idea forward, specifically:

- Define what exactly this new project would provide. Ideas are still all over the map and a consensus needs to be reached. I have many ideas in this area - but I’m waiting to share them when I better understand the proper way and form for sharing them.
- What are the use cases? I know there are many. As @jesus2099 said before: “Yeah? Well… that’s just… like, your opinion, man… ”. People use MusicBrainz for a wide variety of purposes - there is no single use case. There certainly are dominant ones, and ones the community chooses not to support, but that doesn’t make the use cases invalid; just a matter of priorities. One of the things I love about the MusicBrainz data modeling is the way it manages to satisfy a large number of different use cases. It seems pretty clear that this new service would serve a similar variety of use cases.
- What is the system boundary? In other words, what is the responsibility of this new service, and what is not? What other systems will use this new service now and in the future? Are there established or perhaps competing interfaces/standards/conventions that we would be wise to adopt?
- What are the intended communities?
Then:
- Create a name. The name should capture the purpose. But we haven’t established the purpose yet; so let’s not put the cart before the horse. Having said that - I do like VideoBrainz.
- Work up the schema. As an Architect, I can’t help but think of where this project might be going; and my great dream is to have one consolidated database where all the *Brainz projects are a part. I know - I’m dreaming. But even if that is a pipe-dream, I’d like to leave the door open and make the schema reasonably compatible.

Like I said - I can’t stop thinking about it. So I’ll wrap this little note up before it becomes too large.

thwaller · July 26, 2016, 7:23pm

To start, I do like “VideoBrainz”.

My intention when I started this post was a need, a desire actually, to replace TMDB and TVDB, specifically TVDB. There would also be a need for an open API, not like the IMDB one, but more like the TMDB one.

The intent is similar to MusicBrainz, just as it is to TMDB and TVDB. A database of movies and tv series episodes. It can be used as a reference, a ‘tagging’ item for personal digital copies, etc.

There are many discussions from others about TVDB that I can share. They do outline the problem; they are a bit harsh, but they are to the point, with facts. I found these after a Google search for the problems I am / was having, and it turns out I am not the only one with my opinions on the matter.

Zastai · July 26, 2016, 7:28pm

I would say that the most important thing is to really work on the schema, trying to keep a very broad view of things you may want to support. I’m sure this will draw heavily on musicbrainz (you have artists, works, etc).

A big part of that would be defining scope. “Films and TV” is fine; but would that include every moving image posted on youtube/facebook/…? If not, what notability/length/… requirements would there be? Would adult content be allowed, or should that be set aside for a potential future PornBrainz? If disallowed, that also needs to have a clear guideline to determine what is and isn’t adult. And so on.

It would certainly be good to have people from multiple cultural backgrounds looking at those things, to avoid an excessively western viewpoint.

Naming is always hard, and will depend on the scope choices made; but I do like VideoBrainz.

Once the schema is there, the rest can flow from there, given enough motivated developers (who can agree on an implementation language; MB is perl/js/java, BB is python iirc (a language i hate with a passion); so I guess VB should be in asp/vb/vbscript ).

The web service could be dual xml/json similar to what mb has, or just json. I would personally like it if attribute/type names were chosen such that serialization is as trivial as possible in as many languages as possible (i.e. no hyphens, no names that might also be keywords, etc).
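To make the naming point concrete, here is a small sketch of a check that could be run over proposed attribute/type names. It uses Python’s own identifier rules and keyword list, plus a short sample of reserved words from other languages; that sample (`OTHER_RESERVED`) and the function name are assumptions for illustration, not a complete or agreed list.

```python
import keyword
import re

# Hypothetical, incomplete sample of words reserved in other common languages.
OTHER_RESERVED = {"class", "type", "default", "function", "package", "new", "delete"}

# Letters, digits and underscores only, not starting with a digit.
_SAFE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_serialization_friendly(name: str) -> bool:
    """Return True if `name` maps directly onto a field name in most languages."""
    if not _SAFE.fullmatch(name):
        return False  # hyphens, spaces, leading digits, etc.
    lowered = name.lower()
    return not keyword.iskeyword(lowered) and lowered not in OTHER_RESERVED


for candidate in ("release_date", "release-group", "class", "title"):
    print(candidate, is_serialization_friendly(candidate))
```

A hyphenated name like `release-group` fails the check because it cannot be used as a plain field name without a mapping layer, which is exactly the friction being described.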

dukeja · July 26, 2016, 8:35pm

I disagree with this; although not by much. Developing the schema is the first technical thing to be done. What I’m really after is to define an Ontology. A schema is a technical means by which portions of the Ontology can be recorded. But even before we can create an Ontology, we need to understand the use cases. Because a schema or ontology is fundamentally about how to represent all things in a domain and their relationships to one another. But how you break down a domain depends on how you want to use that schema/ontology. What do we want to do? People seem to think that is obvious. Maybe I’m dense, but it doesn’t seem that obvious to me. At least not in the finer details.

So, in an attempt to define the various use cases people would want from this service, here is the list of things I, personally, am interested in. I’d really like to hear from others on what they’re interested in:

Content types
- Movies
  - Different versions (i.e. “Director’s cut”, alternate endings/beginnings, etc.)
  - Previews
  - Featurettes
  - About the movie documentaries
- TV
  - Series & Episodes + specials
  - Live broadcasts
  - Commercials
- Online videos (e.g. YouTube)
- Music Videos
- Recorded live performances (theater, or music)
Uses
- Collection Management. What I mean by this is that I “own” a large collection of content - primarily Movies - on DVD or Blu-Ray. Some from online sources. I’d like to track what I own, what I plan to own, etc.
- Media playing. I’d like to watch the content I own. I greatly enjoy my music playing experience now that it has been enhanced with the excellent metadata provided by MusicBrainz. I’d like the same thing from VideoBrainz.
- Browsing. I like doing casual research related to the videos I enjoy. I enjoy reading about the Artists I like (or dislike, sometimes); finding more about the history of something; finding out what else someone has made. This may be simply for curiosity or for the purpose of finding out what I want to purchase next.

Those are my main uses. What are some other things people want out of a service like this? Can we enumerate them?
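As a starting point for enumerating things, the content types listed above could be written down as a simple two-level taxonomy. This is only a sketch of that list as data; the structure, the shortened labels and the `flatten` helper are assumptions, not a proposed schema.

```python
# Top-level content types mapped to their sub-types, taken from the list above.
CONTENT_TYPES = {
    "Movie": ["Version", "Preview", "Featurette", "About-the-movie documentary"],
    "TV": ["Series", "Episode", "Special", "Live broadcast", "Commercial"],
    "Online video": [],
    "Music video": [],
    "Recorded live performance": [],
}


def flatten(types):
    """Return (parent, child) pairs; top-level types have parent None."""
    pairs = []
    for parent, children in types.items():
        pairs.append((None, parent))
        pairs.extend((parent, child) for child in children)
    return pairs


for parent, name in flatten(CONTENT_TYPES):
    print(f"{parent or '(top level)'} -> {name}")
```

Writing the list as data makes it easy to see where the open questions are, for example whether a music video is its own top-level type or a sub-type of online video.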

thwaller · July 26, 2016, 8:35pm

To add my opinion on the above… it would not include “every moving image posted on youtube/facebook”. Adult content would be fine, with an option to label it as such. If there are content concerns, one could always require an account to view and have an option to block adult content on account preferences. Adult content would be as a released film, not including any random clip. It is also worth noting that there is a difference between adult content and porn. How to distinguish there would need to be determined.

For the data, I think we can look at something similar to TVDB and TMDB. The design of their data is good; I just prefer how the *Brainz projects do the implementation. Looking at the data fields of TVDB, TMDB and IMDB (and others, like TVRage) would give us the fields we need.

There will be the issue of independent films, short films, etc. I see that as similar to bootleg releases in MusicBrainz. I would handle it like this example: MB will accept a YouTube video as a release. Looking at such a release, it is a complete recording of a song. So, should a short film be on YouTube, that would be fine. But a clip on YouTube (outside of trailers) would not be included. Likewise, if someone tried to add a 30-second portion of a recording as an MB release, I am sure all would say that is not a release (there may be exceptions for promo-type stuff).

dukeja · July 26, 2016, 8:56pm

I really see that particular case as one for the MetaBrainz organization. I haven’t looked at the terms of service of MetaBrainz and related services; but I suspect that there are some provisions regarding content that is illegal. Some content in this area would most definitely be illegal. Other than that, I don’t see VideoBrainz as adding any additional restriction on the content.

Theoretically, yes. It’s a matter of what we define in the ontology. Would a 30 second clip count as a Movie? I highly doubt it. But would it be in the scope of VideoBrainz? Sure, why not? - if there are enough folks with such material to provide the metadata on 30 second clips of movies to justify the effort.

Depends on the categories of “Videos” that we decide upon. That depends on what the definition of “Movie” or “TV Episode” is and so forth. This is along the same lines as the many never-ending discussions that occur on MusicBrainz regarding the categorization topics there.

I’m all for it. Do you know of anyone to recruit? I’m not sure how much cultural differences will impact the schema, however. How have cultural differences impacted the MusicBrainz schema? It doesn’t seem to me to have had much of an impact on the schema beyond accommodating localizations and various location data.

Seems we’re all in agreement here. VideoBrainz seems to be the consensus.

[quote=“Zastai, post:36, topic:19958”]
Once the schema is there, the rest can flow from there, given enough motivated developers (who can agree on an implementation language; MB is perl/js/java, BB is python iirc (a language i hate with a passion); so I guess VB should be in asp/vb/vbscript ).

The web service could be dual xml/json similar to what mb has, or just json. I would personally like it if attribute/type names were chosen such that serialization is as trivial as possible in as many languages as possible (i.e. no hyphens, no names that might also be keywords, etc).
[/quote]

Well, this is diving down into the technical details a bit soon for my tastes. I think the decision will largely be left to the volunteers who actually do the work. Whether I like a certain language or framework or not doesn’t amount to anything if I can’t get anyone to actually do the work. And if I actually am doing the work, I should definitely have a large say in the technology that is used. MetaBrainz itself would, of course, have a huge say. They would be hosting it and would have larger lifecycle concerns that should influence the decisions. Having said that, with regards to the API, we should consider what our potential clients/partners are and try to do things in a way that would make their job easier.

dukeja · July 26, 2016, 9:04pm

I need to look into TVDB and TMDB more. I’ve never used them and am completely unfamiliar with them. For the most part I like the schema that My Movies uses. As an exercise, I’ll try to write up a comparison.

Totally fine with that. Something else to consider regarding YouTube and similar sites is the idea of podcasts. YouTube channels as the primary publishing source for content is a thing. And it’s a growing thing.

I’m curious. @thwaller: can you elaborate on how you use TVDB and TMDB? What do you specifically do? What do you like about it and what do you dislike about it?