Data import from Bookogs

Surprise, surprise!
I have to admit, I really thought that this idea had been dropped (too many problems with the different structure).
But yesterday there was a post at metabrainz:

I’m linking it here because I think that not all of us visit this page on a regular basis.
I’m really impressed and, yes, my heart skipped a beat :smiling_face_with_three_hearts:

You and me both. I thought the idea had died a natural death. If @asymmentric can make this a reality, then he deserves a medal (or two). Here’s hoping🙏

If @asymmentric needs any explanations about the Bookogs structure then I’m happy to assist. Just don’t ask me any technical questions about how they coded the site.

Thank you @indy133 for such a warm response. Now I’m even more excited and speeding up my work.

Hi @Deleted_Editor_2265540 , I’ve been going through the Bookogs dump lately and found that Bookogs has many more sub-categories than BB. It’d be great to get suggestions on how to group related sub-categories into one, though the dilemma is that we’d lose those sub-categories in the process.

They certainly did! Far more than what was really necessary. Some of the Credit roles didn’t even make much sense to me.

You might already know this: @Madir who did the preliminary work on the Bookogs dumps posted some comments regarding the various credits and formats: Matching bookogs credit roles to bookbrainz and Matching bookogs formats to bookbrainz

I am happy to work through his list of Credits and consolidate wherever possible to a similar BB relationship role.

The other topic that needs discussion is whether there are plans to expand the current BB relationship roles, e.g. an editor role which currently doesn’t exist.

I just re-read those topics by Madir and came across this comment by @reosarevok

That would be a welcome feature on BB.

If there’s important stuff that is missed in the import, because of missing types, it would be nice to have annotation strings that can later be used by a bot, or something like that

If you end up mapping out what goes where, it might be good to note what’s fine to be dropped and what should be future-proofed @Deleted_Editor_2265540 :raised_hands:
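
To make the "annotation strings" idea concrete, here is a minimal sketch of how an import step might handle a credit whose role has no BB counterpart yet. Everything in it is an assumption for illustration: the role names, the `ROLE_MAP` table, the `map_credit` helper and the `[bookogs-credit]` annotation format are hypothetical and are not taken from the Bookogs dump or the BookBrainz schema.

```python
# Hypothetical mapping from Bookogs credit roles to BB relationship types.
# The names on both sides are illustrative only.
ROLE_MAP = {
    "Written By": "author",
    "Illustrated By": "illustrator",
    "Translated By": "translator",
}


def map_credit(role, name):
    """Return (relationship, annotation) for a single credit.

    Exactly one of the two values is None: a mapped role yields a
    relationship dict, an unmapped role yields an annotation line.
    """
    target = ROLE_MAP.get(role)
    if target is not None:
        return {"type": target, "name": name}, None
    # No matching type yet: keep the original role in a predictable,
    # machine-readable line so a bot could convert it later.
    return None, f"[bookogs-credit] role={role!r}; name={name!r}"
```

Keeping the fallback line in a fixed format is the part that matters: if a new relationship type is added later, a script could find these lines again and turn them into real relationships.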

There was a comment that anything with a frequency count less than 1000 could be dispensed with and I agree. I would consolidate all of the editorial type roles into a single role, and ditto for copyright. But then again I’m pretty ruthless.
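
As a rough sketch of that approach (not an agreed procedure), the snippet below merges related roles first and only then applies the 1000 cut-off, so that rare editorial variants are counted towards the consolidated role instead of being dropped. The `CONSOLIDATE` groupings and the role names are hypothetical; only the threshold of 1000 comes from the comment above, and the merge-then-filter order is an assumption.

```python
from collections import Counter

MIN_COUNT = 1000  # threshold mentioned in the thread

# Hypothetical groupings of detailed roles into a single consolidated role.
CONSOLIDATE = {
    "Copy Editor": "Editor",
    "Series Editor": "Editor",
    "Copyright Holder": "Copyright",
}


def consolidate(role_counts):
    """Merge related roles, then split them into kept and dropped by frequency."""
    merged = Counter()
    for role, count in role_counts.items():
        # Roles without a grouping keep their own name.
        merged[CONSOLIDATE.get(role, role)] += count
    kept = {r: c for r, c in merged.items() if c >= MIN_COUNT}
    dropped = {r: c for r, c in merged.items() if c < MIN_COUNT}
    return kept, dropped
```

With made-up counts such as `{"Editor": 5000, "Copy Editor": 400, "Series Editor": 700, "Binder": 12}`, the two editor variants would be folded into `Editor` and only `Binder` would end up in the dropped set.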

There is a fundamental question that I have never asked in the Forum: what is BB trying to achieve? There are a number of book sites on the internet, some are basic and others allow more detail.

The main point of difference that Bookogs offered was the high level of detail.

Is it worth trying to preserve the high level of detail many of the Bookogs submissions contained or simply go for the basics? Will BB ever utilise the data?

Just a quick look at MB might answer your question:

Please don’t forget that this is just the beginning of the journey…

And don’t ask me why I chose ABBA as an example…

I apologize as this thread has gone slightly off topic but it does have a bearing on the Bookogs data import.

I realize it is early days and there is not an atom of malice in my comments. What I’m trying to determine is whether anyone has considered the future direction of BB.

From a sales perspective, there are a number of established competitors in the market and there needs to be a feature that makes BB stand out.

The point of difference could be the level of detail permissible. If that is the direction then the Bookogs data should be preserved in its entirety.

P.S. that is a far superior submission of ABBA - Greatest Hits (Vogue 528047) than the one on Discogs! Why did you choose ABBA as an example?

MetaBrainz is famously uninterested in this aspect - but obviously attracting more people helps the altruistic mission.

I would say we tend to aim for the highest level of detail that is practical, and achievable. In the past this has been to the detriment of the UI etc, but quite frankly there’s no reason why that needs to be the case.

Ideally we would be able to add all the granular relationships from Bookogs, but address why you thought it was crappy over there :thinking:

I think a good way to do it is to have types and sub-types, so if anyone’s not sure they can just pick the parent, and scripts can be used to pick what level of detail is wanted. Not sure if this would address your issues with having heaps of under-used relationships…
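
A minimal sketch of the types and sub-types idea, assuming each role simply points at its parent. The `PARENT` table and the role names are hypothetical and do not reflect how BB or MB actually store relationship types.

```python
# Hypothetical hierarchy: each role maps to its parent (None = top level).
PARENT = {
    "Editor": None,
    "Copy Editor": "Editor",
    "Series Editor": "Editor",
    "Illustrator": None,
    "Cover Illustrator": "Illustrator",
}


def ancestors(role):
    """Return the chain from the role up to its top-level parent."""
    chain = []
    while role is not None:
        chain.append(role)
        role = PARENT.get(role)  # unknown roles are treated as top level
    return chain


def at_level(role, depth):
    """Generalize a role to the requested depth (0 = top-level parent)."""
    chain = ancestors(role)[::-1]  # top-level parent first
    return chain[min(depth, len(chain) - 1)]
```

An editor who is unsure would pick `Editor`, while a script that only wants coarse data could call `at_level("Copy Editor", 0)` to get the parent back from a more specific entry.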

I know that sort of language is complete anathema to some people, but the site has to be relevant, otherwise it will just languish despite all the great work done by @indy133. Long may he live.

Crappy is a subjective term💩 I thought some of the roles were a bit arcane (which could be an indication why some were rarely used) and searching for a particular role was sometimes tedious. Wading through the editor roles required a cut lunch (Australian term meaning a light meal put in a container). However, there were constant requests from users for more and more.

You should see my wishlist for MusicBrainz attributes :joy:

Wading through roles/searching being tedious does sound like a UI problem, that maybe could be fixed by a search box/filter box that narrows down results as you type, and maybe a paste/repeat relationships function?
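
For illustration only, the narrowing logic behind such a filter box could look like the sketch below. `filter_roles` is a hypothetical helper, not part of any MetaBrainz code, and the role names in the usage note are made up.

```python
def filter_roles(roles, query):
    """Return roles matching the query, prefix matches first."""
    q = query.casefold().strip()
    if not q:
        return sorted(roles)
    # Roles that start with the typed text are the most likely targets.
    prefix = [r for r in roles if r.casefold().startswith(q)]
    # Then roles that merely contain it somewhere else.
    contains = [
        r for r in roles
        if q in r.casefold() and not r.casefold().startswith(q)
    ]
    return sorted(prefix) + sorted(contains)
```

Matching anywhere in the name, not just at the start, means that typing "ed" would still surface "Copy Editor" and "Series Editor" for someone who does not know the exact title.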

Now, I’m not saying this would all be implemented any time soon, but MusicBrainz philosophy has definitely been to enter information as granularly as possible, and not let current UI, end user needs etc, get in the way. Data first, all the way.

Bookogs had all of those features. The problem was that there were, for example, so many editor roles that unless you knew the exact title and could type it, you would have to negotiate a menu of 61 possibilities. I’m guessing some people didn’t bother and simply chose Editor regardless of what the submission article credited.

Add to that, language misunderstandings and the situation got messy. I know French and Italian users frequently used “Editor” for the role of “Publisher” on Bookogs.

Is this a problem? It’s what I expect users to do in MB, and if someone knows better they can improve the specificity later.

Language issues happen on MB as well… particularly different names for cover art I’ve noticed. A lot of French users set stuff as ‘liner’ for some reason? Apart from improving the translations and then hoping people use the site in their language I’m not sure if there’s a solution : (

Forget the result and just focus on the procedure. The point I’m making is the sheer volume of alternatives made the system difficult to negotiate.

P.S. I shouldn’t have gone off on a tangent.

I don’t think it’s a tangent, I think it’s relevant to the direction of BB whether the problem can be solved with UI changes or not*. And it relates to how the Bookogs import should be approached!

I don’t think I’m exaggerating when I say that you and Indy133 will have a big influence on how BB handles this stuff.

*still not sure, sounds like only partially

These problems can’t be solved, we have to live with it. The German role “Herausgeber” can be translated as publisher AND as editor. So you have to choose. Sometimes it’s obvious, sometimes it’s difficult. This applies to all languages, of course.

It was tangential to the argument I was making. I’m a Ramblin’ Man in more ways than one.

That also applies to French and Italian (no doubt countless other languages).

Ah, I don’t see arguments, I only see solutions :stuck_out_tongue_closed_eyes:

edit: a business card quote if I’ve ever heard one!
