MovieBrainz?

I mostly agree with this. When you look at the “rules” in MusicBrainz, meaning what the majority of auto-editors enforce, publicly available (private) bootlegs, even ones available online, sometimes do not qualify as a release. I don't want to call anyone out, so I will just say I have examples of this being strictly enforced. I have come to learn that this does not matter to me either way. I was more pro private data when I first started; now I am more against private data, even mildly public data…

The mindset I have come to have, based on and molded mostly by auto-editors in MusicBrainz, is to look at the release and ask a few things. First, is, or was, this release available to the public in a manner that would give it distribution? Second, would adding this release benefit anyone aside from myself? And lastly, is the data I have to add complete enough that, even if no one is ever able to add anything to this release, my addition fairly portrays a release that can be identified or be of some use?

The last question is more open. If all I have is a track list and recording times with nothing else to offer, to me, that is not a release as it is useless. There is no cover, label, specific release info, source, etc. How would anyone ever accurately match this to their content?

I hope I have explained what I have been taught here. Although I have more or less signed onto this logic, I am not dismissive of the idea of going against it. I just wanted to toss this logic out there for all to consider in addition to what @LordSputnik stated above. It is my belief now that a private video (someone's kid playing, a graduation, people at a shooting range for fun, etc.) is no different than a home-made compilation of music.

1 Like

Thank you. I do think these issues could be resolved, but it would take a lot of time and effort, would introduce concepts that are currently foreign to the Brainz community, and would probably impact the performance of certain queries. I think that to work properly, private metadata would need to appear as if it is not there at all unless you explicitly choose to include it. So far as reviews and editors are concerned, it’s certainly true that the community cannot verify the accuracy of private data. But it can enforce certain standards of quality such as completeness, internal consistency, spelling and grammar, quality of images provided, and, to a limited extent, proper attribution.

Another approach would be to address the “Collection Management” use case directly. By that I mean that there would be completely separate and simplified tables to handle private data. This data would only be visible to the account that created them and would be provided purely as a service to that user. Cloud storage for personal media metadata, if you will. But that definitely goes counter to the established purpose of the Brainz community. As an aside, I really want to refer to the community as the MediaBrainz community.

So I guess I’ll have to find some other avenue to satisfy the Collection Management use case. I fully expect to maintain the idea of personal collections as MusicBrainz does. But that would still be inadequate because it will lack certain critical features. Together with the Media Player use case, these two use cases are very important to me. I care as much or more about my private media as I do about the public data. I want to collect extensive information about my private items, just as much as I want to about the public items. I want that data to be on the internet so I can share it with those few people who care (my extended family and friends), and with myself - I love the cloud paradigm. But currently I have few options, and all are inadequate in several ways. Sad Panda :disappointed:

Just wanted to chime in here and say I really liked this idea; its limitations make sense as well. (Perhaps this is a subject for another topic?)

My initial reaction to this is that I don’t really like it. My preferred approach is to have the pure data model recorded in the database - so there is no difference in that aspect. But the ONLY way to work with the data, apart from certain administrative/maintenance functions, would be through the web service. EVERYTHING else works through that. All the business logic associated with the Ontology would be enforced in the web service. I’m far more worried about duplicating the business logic and enforcement of policies in the clients than I am about duplication of the schema. As I understand what you have set up, critical business logic will need to be implemented in the client-side JavaScript. That will inevitably lead to duplication of the business logic in other client libraries or in the clients themselves.

Yeah - rate limits and such are absolutely critical. And yes, we’ll need to support various mechanisms of the OAuth 2.0 protocol including a token/secret approach. I also think that much of the interface will work without authentication. Essentially, unauthenticated connections would be read-only. No edits, no comments, no ratings, no usage data submissions.
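
To make the "unauthenticated means read-only" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Request` class, the `is_allowed` function and the token set are hypothetical names, not part of any existing MetaBrainz service, and a real deployment would validate OAuth 2.0 tokens properly rather than look them up in a set.

```python
from dataclasses import dataclass
from typing import Optional, Set

# HTTP methods treated as read-only in this sketch (an assumption).
READ_METHODS = {"GET", "HEAD"}


@dataclass
class Request:
    """Hypothetical, stripped-down view of an incoming web service call."""
    method: str
    path: str
    token: Optional[str] = None  # e.g. an OAuth 2.0 bearer token, if supplied


def is_allowed(request: Request, valid_tokens: Set[str]) -> bool:
    """Allow reads for everyone; require a known token for anything that writes."""
    if request.method.upper() in READ_METHODS:
        return True
    return request.token is not None and request.token in valid_tokens
```

Edits, comments, ratings and usage data submissions would all go through write methods here, so an anonymous client can browse but not change anything.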

Speaking of OAuth 2.0. How is it deployed/integrated in existing MetaBrainz servers and services? Is this deployment/integration evolving? What is the end goal?

1 Like

The mention of free-form relationships so anyone can just add a role doesn’t sit well with me. Even leaving aside the quality problems (misspellings, case differences, …), this is very English-centric. In order for the data to be localizable properly, I think relationships need to be curated properly.

1 Like

Hmmm. I’m trying to follow this thought in the context of this thread and I don’t see it discussed in the context of VideoBrainz much. Are you bringing in topics that have occurred elsewhere into this discussion - because I’m confused. However, your comment does bring in several topics worthy of discussion. I see issues of editorial rights, schema design approaches, and localization entering in with your comment.

On Editorial rights. I don’t see why every account should have equal access to the database. I believe our purpose is to create a free and open database of video metadata of the highest quality, accuracy, and breadth. If that is the case, we have several problems we need to confront.

First, we need to encourage people to become contributors. The more contributors we have, the more data we will accumulate. The videos I’m most interested in will differ widely from those other people care about. And by pulling from as broad a base as possible we will have a database that spans as many interests and cultures as possible. However, that comes with its own problem. Not all contributors do good work.

And that brings me to the second problem: we need to encourage high quality contributions. We cannot rely entirely on the editorial process to keep the data of high quality. As Oliver Charles pointed out years ago in this blog post, submissions are outpacing the ability of reviewers to look everything over. While the new editing system is designed to make matters better, I highly doubt it will eliminate the problem. We will need a hierarchy of accounts with increasing levels of access to the database. What those tiers are, what rights go with them, and how people move up and down the tiers is not something that I’m ready to discuss. But I think the existence of different levels of access and editorial rights is clearly needed.

Schema Design Approaches. Part of the ontology will be various taxonomies that categorize things. Some of these will occur in relationships. For example, suppose we have a “Movie” entity and a “Person” entity. We may then have relationships between those two kinds of entities. A specific Movie will have a whole host of Persons who contributed to that Movie in one form or another: the Cast and Crew, to employ the terms typically used. So we’ll have Cast relationships, which will also include what “Role” that Actor performed. We’ll also have Crew relationships, which will describe the Job that Person had in the production of the Movie. Will we have a semi-fixed taxonomy of jobs? If we do, how is that taxonomy created and maintained? Will it evolve over time? I think that this list will exist, that it will change over time, and that the process for changing it will be similar to changing any other part of the schema - in other words, very difficult and with significant impact. Changing any taxonomy that classifies items implies that all data created using the older version of the list may need to be changed to use the new categorization system.

One approach that we can have to managing such taxonomies is to represent the taxonomy in the database itself. Not as part of the database definition; but in tables of its own. But having those represented in the database does not mean that they are freely editable.
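
As a rough sketch of what "the taxonomy in tables of its own" could look like, here is a small Python example using the standard `sqlite3` module. The table and column names (`person`, `movie`, `job`, `crew_credit`, `parent_id`) and the "Performer" parent job are assumptions made up for illustration; this is not a proposed VideoBrainz schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE movie  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- The job taxonomy is ordinary rows, so extending it needs no schema change.
CREATE TABLE job (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE,
    parent_id INTEGER REFERENCES job(id)
);
CREATE TABLE crew_credit (
    movie_id  INTEGER NOT NULL REFERENCES movie(id),
    person_id INTEGER NOT NULL REFERENCES person(id),
    job_id    INTEGER NOT NULL REFERENCES job(id)
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one movie, one person and two jobs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO person VALUES (1, 'Joe Dancer')")
    conn.execute("INSERT INTO movie VALUES (1, 'My Dance Video')")
    conn.execute("INSERT INTO job VALUES (1, 'Performer', NULL)")
    conn.execute("INSERT INTO job VALUES (2, 'Dancer', 1)")
    conn.execute("INSERT INTO crew_credit VALUES (1, 1, 2)")
    conn.commit()
    return conn


def credits_for(conn: sqlite3.Connection, movie_id: int):
    """Return (person name, job name) pairs for one movie."""
    rows = conn.execute(
        "SELECT p.name, j.name FROM crew_credit c "
        "JOIN person p ON p.id = c.person_id "
        "JOIN job j ON j.id = c.job_id "
        "WHERE c.movie_id = ? ORDER BY p.name",
        (movie_id,),
    )
    return rows.fetchall()
```

Who is allowed to insert into `job` is a separate question from where the rows live; the write path to that table can stay restricted even though credits referencing it are open to every editor.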

Localization. Localization needs to be a first-tier capability. What I mean by that is that localization of all data should be incorporated in all of our designs from the very beginning. The discussion on instruments with disambiguation comments has been very instructive. As in the case of instruments, job titles in VideoBrainz will need translations. We are in a good position to ask the MusicBrainz folks: “If you could design the schema from scratch now - what would you do differently?” Maintenance of the MusicBrainz schema is significantly burdened by a large existing database and the mature collection of software built up around it. Fundamental changes to the schema represent a HUGE undertaking. VideoBrainz currently doesn’t have that burden, so it has the opportunity to make fundamental changes to the approach.
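
One way to picture localized job titles is a lookup keyed by a stable job identifier, with a fallback chain for missing translations. The sketch below is Python; the identifiers, the `JOB_LABELS` table and the `job_label` function are hypothetical, and the fallback order (full locale, then language, then English) is just one assumption among several reasonable ones.

```python
# Hypothetical data: job identifiers and translations are made up for illustration.
JOB_LABELS = {
    "director": {"en": "Director", "fr": "Réalisateur", "de": "Regisseur"},
    "dancer": {"en": "Dancer", "fr": "Danseur"},
}


def job_label(job_id: str, locale: str, fallback: str = "en") -> str:
    """Return the display name of a job for a locale, with simple fallbacks."""
    labels = JOB_LABELS[job_id]
    # Try the full locale ("fr-CA"), then its language ("fr"), then the fallback.
    for candidate in (locale, locale.split("-")[0], fallback):
        if candidate in labels:
            return labels[candidate]
    # Last resort: show the identifier itself rather than fail.
    return job_id
```

Because credits would store only the identifier, adding a translation later changes what every existing credit displays without touching the credits themselves.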

1 Like

My other reply really went on several tangents and didn’t directly address this comment. I have a couple of questions: How is the use of free-form relationships “very English-centric”? How does the structure of the schema become language specific?

What do you mean by “free-form relationships”? Where was that discussed? Can you point me to the discussion?

I assume that’s about this section:

That sounds a bit like you want to have the user just write down what the person did rather than pick from a closed set of options, which makes it hard (if not impossible) to translate it simply.

This is a bad idea IMO. MusicBrainz is trying to move away from votes and auto-editorship and towards a Wikipedia-style “just revert errors” philosophy, for multiple reasons. Among them: if you expect most people to add good (or at least not bad) information, putting roadblocks in front of them is not ideal, and it discourages additions. Those additions, when they happen, either go unnoticed or are policed too strictly (some users would rather reject any submission that isn’t perfect, which is clearly problematic because not having any data is worse than having imperfect data).

There are a few things that we do intend to keep limited (who can add new relationship types, probably areas and instruments), but the general idea is that there should be as little difference as possible between all users, and that we shouldn’t roadblock people (like we currently do with the 7-day voting phase).

2 Likes

I think you read too much into what I said. In fact, you said: “There are a few things that we do intend to keep limited”. So you agree that there are multiple levels of access? In fact, I agree with all that you said. But it’s important to recognize that, however limited the restrictions are, not all editing is equal. Some things must be controlled. I would even go so far as to give certain “moderators” the power to, very judiciously, lock certain entries or restrict certain editors. It’s a rare occurrence, but sometimes there can be “editing” wars that degrade the database, or rogue editors who seem to want to do what they want regardless of the rules and guidelines. The ability to intervene in such cases is necessary, but it should be used as a last resort. As a rule I am an optimist who believes that most of our editors seek the best interests of the community, and that some data is better than no data.

1 Like

That wasn’t my intent at all. In fact, I believe that we should have a carefully worked out taxonomy of jobs. Regarding the “Roles” - well, the editor would need to provide that. Unless we wanted to have a new entity for “Roles”. That could be interesting because some roles are recurrent; how many “Batmans” have we had? How many “Sherlock Holmes”, or whatever. Or perhaps “Person” should include fictitious persons. But I digress.

On the topic of “Jobs” for crew. There does seem to be a discernible taxonomy. But that taxonomy is itself evolving. I have an idea on how to handle the seemingly conflicting desires to control changes to taxonomies such as “Jobs” and to reduce roadblocks to editing as much as possible. My idea is that we should have a taxonomy with very tight controls on making changes to it. But that taxonomy should always include an “Other” category. When a user uses the “Other” crew job, they should also provide (or be able to provide) additional data so that those who update the Jobs taxonomy can use it as input for extending it.
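
Here is a small Python sketch of that “Other” idea. The job list, the `CrewCredit` class and both functions are hypothetical. Note one assumption that goes beyond what I wrote above: the sketch requires the extra detail whenever “Other” is used, where it could just as well be optional.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

# Hypothetical controlled taxonomy; only maintainers would change this set.
KNOWN_JOBS = {"director", "editor", "composer", "other"}


@dataclass
class CrewCredit:
    person: str
    job: str
    other_detail: Optional[str] = None  # free text, only meaningful for "other"


def validate(credit: CrewCredit) -> None:
    """Reject unknown jobs, and "other" credits that carry no description."""
    if credit.job not in KNOWN_JOBS:
        raise ValueError(f"unknown job: {credit.job!r}")
    if credit.job == "other" and not (credit.other_detail or "").strip():
        raise ValueError("'other' credits need a short description of the job")


def suggested_jobs(credits: Iterable[CrewCredit], min_count: int = 2) -> List[Tuple[str, int]]:
    """Count the free-text details on "other" credits as input for maintainers."""
    counts = Counter(
        c.other_detail.strip().lower()
        for c in credits
        if c.job == "other" and c.other_detail
    )
    return [(name, n) for name, n in counts.most_common() if n >= min_count]
```

The output of `suggested_jobs` is only a hint: the people who maintain the taxonomy would still decide whether a frequently entered description deserves its own entry.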

My reaction was based on:

I’m totally with you here. This is exactly the thing I most like about MusicBrainz. I don’t see any problem with this. All of the things you just listed are a variation of a single kind of relationship - associating a person with the video of interest. It’s a simple relationship, e.g. <“Joe Dancer”, “My Dance Video”, “Dancer”>. There will certainly be many discussions on what kinds of relationships there should be, much like there are for MusicBrainz. But in the end, these are just enumerations and not all that difficult to implement and support.

I may have misunderstood that, but it sounded like the job would be just a piece of entered data. For a role that makes sense (but it can also be subject to a need for localization, e.g. in the case of children’s films that frequently have dubs and where characters will often have different names than in the original; the Harry Potter films are an especially good example of this). But for jobs that just makes for a messy database.

Having the jobs be a database entity (like instruments and areas) is fine (so no schema change is needed for adding them). And having an Other where the UI would enforce the addition of extra information sounds good too (it avoids the “add an annotation” solution used by MusicBrainz).

1 Like

I was simply presenting a conceptual construct without addressing how “Dancer” would be represented. Perhaps I should have said <“Joe Dancer”, “My Dance Video”, DANCER>. The actual mechanism we use to represent jobs in the database wasn’t addressed in that response. I do in fact support the idea of a taxonomy of jobs that is controlled and that supports localization. In fact, my lines “There will certainly be many discussions on what kinds of relationships there should be” and “these are just enumerations” seem to support the idea that these are NOT just user data entries.

I’m in the process of making yet another media streamer/organizer/blah app myself and would love to see a central place to get good JSON metadata from an API. While the big three (themoviedb, tvdb and tvmaze) have decent data they all have their little quirks. I still don’t get what tvdb’s issue is with shows like WWE and the like. If people are providing the data why would you purge it?

Anywho, here are some of the limits I see across all the providers.

  1. Not as easy to tie cast/crew together via uuid and the like. Therefore I have a lot of “duplicate” rows containing person data. Having 600k rows isn’t a big deal but not having matching IDs makes “also in…” queries problematic.
  2. Lack of sporting data…not really interested in game stats like yards, players, etc. But who played what/when and final score seems reasonable.
  3. Lack of international titles.
  4. Anime…this relates to #3 as well.
  5. Limited localization of returned JSON data.
  6. Ignoring of “internet” based shows/streams.
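
To illustrate point 1, here is a small Python sketch of why a shared person identifier matters for “also in…” lookups. The IDs, titles and the `also_in` function are made up for illustration and do not come from any of the providers mentioned.

```python
from collections import defaultdict

# Hypothetical credits as (person_id, title) pairs sharing one ID space.
CREDITS = [
    ("p-001", "Show A"),
    ("p-001", "Show B"),
    ("p-002", "Show A"),
    ("p-003", "Show C"),
]


def also_in(credits, title):
    """For each person credited on `title`, list their other titles."""
    titles_by_person = defaultdict(set)
    for person_id, credited_title in credits:
        titles_by_person[person_id].add(credited_title)
    return {
        person_id: sorted(titles - {title})
        for person_id, titles in titles_by_person.items()
        if title in titles
    }
```

With one ID per person this is a plain grouping. Without it, the same person appears as several unrelated rows and has to be matched by name, which is where the duplicates come from.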
4 Likes

Another good source of data is Wikidata, so putting a useful API on top of it could be an option.

3 posts were merged into an existing topic: ‘brainz’ fantasy list

People who appreciate MusicBrainz and look for MovieBrainz may want to consider OMDB. Unlike IMDB, TVDB, TMDB and mymovies, it publishes its data under an open data license similar to MusicBrainz.

5 Likes

French translation is strange.
For instance, the 1 to 10 ratings are swapped: it says 1 is the BEST rating and 10 is the WORST.
It’s the opposite in English. :smiley:
I will try to fix it.

3 Likes

One more thing that MovieBrainz could provide is a better person database, if it’s shared with MB. It is really annoying to enter audio dramas because a lot of the people are primarily actors :stuck_out_tongue:

3 Likes

Has anything happened with Video/MovieBrainz? Are we still chatting or has anyone started working on it?