Definition of works

Tags: dataimporting, works, definition

#1

The current definition of works causes a problem when importing (or exporting) data.

Quoted from bookbrainz.org/help:

Translations will both be a new Work and a new Edition for it.

Elsewhere (VIAF, Open Library, Wikidata…) translated works stay the same work.

See, e.g., Editing Open Library: What is the difference between a “work” and an “edition”?:

Work and edition are both bibliographic terms for referring to a book. A “work” is the top-level of a book’s Open Library record. An “edition” is each different version of that book over time. So, Huckleberry Finn is a work, but a Spanish translation of it published in 1934 is one edition of that work. We aspire to have one work record for every book and then, listed under it, many edition records of that work. The “work” data should reflect the original language, if that is known. To connect editions and translations to the work, you need to put the title of the “work” in the upper Title box in the editing page for the edition. The title of the edition or translation then goes into the Title box under “This edition.”


#2

FRBR defines expressions as a level between works and manifestations (= editions). Expressions seem to be what BB works are at the moment.

So Anna Karenina by Leo Tolstoy is a work; the English translation (or a specific translation) is the same work, but a different expression. A manifestation is an edition (with an ISBN). An item is a physical book, which is not covered by BB.
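To make the levels concrete, here is a minimal sketch of the work / expression / manifestation hierarchy as plain Python dataclasses. The class and field names are my own assumptions for illustration, not the BB schema or any library standard, and the translator name is a placeholder.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Manifestation:
    """An edition, typically identified by an ISBN."""
    title: str
    isbn: Optional[str] = None


@dataclass
class Expression:
    """One realization of a work: the original text or a specific translation."""
    language: str
    translator: Optional[str] = None
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """The abstract work, independent of language."""
    title: str
    author: str
    expressions: List[Expression] = field(default_factory=list)


def editions_of(work: Work) -> List[Manifestation]:
    """Collect every edition of a work across all of its expressions."""
    return [m for e in work.expressions for m in e.manifestations]


# Hypothetical data: the translator name is a placeholder, the ISBN is left unset.
anna_karenina = Work(
    title="Anna Karenina",
    author="Leo Tolstoy",
    expressions=[
        Expression(language="ru"),
        Expression(
            language="en",
            translator="Translator A",
            manifestations=[Manifestation(title="Anna Karenina")],
        ),
    ],
)
```

With this shape, `editions_of(anna_karenina)` returns the editions of every language version under one work, which is the grouping libraries expect.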

IMHO it would be better to change the BB definition of works to the standard bibliographic definition. I don’t know how useful expressions are. They seem to be used nowhere.


#3

I would agree with this if we separated the expressions (translations) from them, but I’m not sure how much that extra complexity will help. In any case, I would definitely want to see different translations separately, and have, for example, two different translations of The Tempest be clearly separate works/expressions/whatever we decide to use.

Since expressions could be added later just by converting the works with the appropriate relations into expressions, this doesn’t seem very problematic at the moment. In the short term, I’d just stick to works, and decide whether the FRBR-work level IDs should be added to all relevant works, or just the original.


#4

The point of FRBR works (or library works) is to group together all editions of, e.g., The Tempest: all English-language editions and all translations. At the moment, the English-language editions and the translations would be completely different works in BB. The only thing they have in common is the author. (!)

The BB works would be:

  • The Tempest (original text)
  • The Tempest (Estonian translation by person A)
  • The Tempest (Estonian translation by person B)

So there should be a way to group them together. That’s what libraries use the term work for.

The FRBR expressions (BB works) are not very useful IMHO. It would be better to use the translator and language fields in the edition record. (If there is no translator field, it should be added.)

IMHO the release group is also not useful for books. I think it’s from MB and would be used to group together hardcover, softcover and ebook editions into one group. But it would be much better to use the publisher field in the edition record for that.

There was a GSoC project to import data from Open Library into BB. I think this import would be a bit catastrophic, because OL uses a different definition of work than BB. So there would be a mix of both definitions of works in the BB database after the import.

tl;dr BB should do what everyone else has agreed on for many years: use the library definitions of works and editions, and skip the release groups.


#5

It’s not quite true that they’d only have the author in common - that’s what relationships are for. You relate Estonian Translation A to Original Text with “ETA is a translation of OT”. As such, and as long as the relationship is stored, we have a clear way to indicate the info that also allows us to eventually, if desired, implement expressions as a separate level and automatically transform all works linked with “X is a translation of an Original Text” into different expressions of the same work.
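As a rough sketch of that transformation: if the “is a translation of” relationships are stored, grouping BB works under their original is a simple lookup. The IDs and data structures below are hypothetical and only illustrate the idea; they are not how BB stores relationships.

```python
from collections import defaultdict

# Hypothetical BB works, keyed by a made-up ID.
works = {
    "w1": "The Tempest (original text)",
    "w2": "The Tempest (Estonian translation by person A)",
    "w3": "The Tempest (Estonian translation by person B)",
}

# Hypothetical relationships: (translation, original) pairs.
translation_of = [("w2", "w1"), ("w3", "w1")]


def group_as_expressions(works, translation_of):
    """Group every work under the original it ultimately translates."""
    original_for = dict(translation_of)

    def root(work_id):
        # Follow "translation of" links upward; `seen` guards against cycles.
        seen = set()
        while work_id in original_for and work_id not in seen:
            seen.add(work_id)
            work_id = original_for[work_id]
        return work_id

    groups = defaultdict(list)
    for work_id in works:
        groups[root(work_id)].append(work_id)
    return dict(groups)


grouped = group_as_expressions(works, translation_of)
# Each key is a FRBR-style work; each value lists its expressions (BB works).
```

Here all three Tempest entries end up under `w1`, so the grouping asked for above falls out of the existing relationships without changing what a BB work is.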

A translator is not a characteristic of an edition, it’s a characteristic of an expression (or a translation work, if we store expressions as works). It makes no sense to re-enter it for every single edition of the translation, when we could just store it at a higher level. Entering it at the edition level also would block us from grouping the translations in any useful way - by having a work or expression you can easily find all the editions of a particular translation.
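A small sketch of why the higher level helps, assuming hypothetical record shapes (none of these names come from the BB schema): the translator is stored once on the translation work, and every edition just points at it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TranslationWork:
    id: str
    title: str
    language: str
    translator: Optional[str]  # stored once, at the work/expression level


@dataclass(frozen=True)
class Edition:
    id: str
    work_id: str  # link to the translation work
    publisher: str
    fmt: str


# Hypothetical data; publisher and translator names are placeholders.
works: Dict[str, TranslationWork] = {
    "w2": TranslationWork("w2", "The Tempest", "et", "person A"),
    "w3": TranslationWork("w3", "The Tempest", "et", "person B"),
}
editions: List[Edition] = [
    Edition("e1", "w2", "Publisher X", "hardcover"),
    Edition("e2", "w2", "Publisher Y", "ebook"),
    Edition("e3", "w3", "Publisher X", "softcover"),
]


def editions_of_translation(work_id: str, editions: List[Edition]) -> List[Edition]:
    """All editions of one particular translation."""
    return [e for e in editions if e.work_id == work_id]


def translator_of(edition: Edition, works: Dict[str, TranslationWork]) -> Optional[str]:
    """Look the translator up through the work instead of re-entering it per edition."""
    return works[edition.work_id].translator
```

Finding every edition of person A's translation is then one filter on `work_id`, and a correction to the translator name only has to be made in one place.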

I mostly agree that release groups are not as useful for books as they are for music, and if we were to drop one thing, this is the one I’d be most OK with dropping. That is especially true since it has moved from a fairly low-level idea (“group together hardcover, softcover and ebook editions”, as you said) to something wider (group all editions even if they have different introductions and whatnot) that starts feeling kind of like “Work2”.

I agree that the different definitions of work make the mapping from OL trickier. That said, OL’s translations and works are enough of a mess as it is that it might not make much of a difference whether we try to map for it or not. The main issue here is to make sure that, when importing the OL results into BB itself (which should always require human input), the BB user checks that the work / edition being linked is correct according to BB definitions.
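One way an import step could prepare that check, sketched under heavy assumptions: the dictionary layout below is invented for illustration and is not OL's real schema or the GSoC importer's logic. It splits one OL-style work into BB-style work candidates by (language, translator) and flags each one for human review.

```python
def propose_bb_works(ol_work):
    """Split one OL-style work into BB-style work candidates.

    Editions sharing the same (language, translator) pair are assumed to be
    the same translation; every candidate is flagged for human review.
    """
    candidates = {}
    for edition in ol_work["editions"]:
        key = (edition.get("language"), edition.get("translator"))
        candidate = candidates.setdefault(
            key,
            {
                "title": ol_work["title"],
                "language": key[0],
                "translator": key[1],
                "editions": [],
                "needs_review": True,  # never import without a BB user checking
            },
        )
        candidate["editions"].append(edition["title"])
    return list(candidates.values())


# Hypothetical input; titles and translator names are placeholders.
ol_tempest = {
    "title": "The Tempest",
    "editions": [
        {"title": "The Tempest", "language": "en"},
        {"title": "Edition 1 of translation A", "language": "et", "translator": "person A"},
        {"title": "Edition 2 of translation A", "language": "et", "translator": "person A"},
        {"title": "Edition of translation B", "language": "et", "translator": "person B"},
    ],
}

proposals = propose_bb_works(ol_tempest)
```

Given how messy OL's translation data is, missing or wrong language and translator fields would put editions in the wrong candidate, which is exactly why the review flag stays on.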

The problem is that a) not everyone is using the same library definitions and b) the most common ones are not particularly good at the work levels (because most libraries pretty much ignore the work level to begin with). I agree we should look at what others are doing, and I agree FRBR is actually pretty ok. But nobody is using FRBR properly to begin with (with translations, the general decision seems to be “we’ll completely ignore them” - as you said, almost nobody is using expressions).