What would be an import format?

If I want to donate data, in what format should I prepare it?

First off, thanks for wanting to contribute!

As long as the data is public domain (CC0 license), we can accept it in any format. CSV or JSON would probably be preferred though, since they’re relatively simple to read in Python, which is likely what we’d use to process it at this point.
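To illustrate why CSV and JSON are easy to handle, here is a minimal sketch using only the Python standard library. The sample data and the column names (`title`, `author`, `year`) are made up for illustration and are not a required schema.

```python
import csv
import io
import json

# Hypothetical sample data; the field names are illustrative only.
CSV_TEXT = "title,author,year\nAn Example Title,First Author,1973\n"
JSON_TEXT = '[{"title": "An Example Title", "author": "First Author", "year": "1973"}]'


def read_csv_records(text):
    """Return one dict per CSV row, keyed by the names in the header line."""
    return list(csv.DictReader(io.StringIO(text)))


def read_json_records(text):
    """Return the list of record dicts stored in a JSON array."""
    return json.loads(text)


csv_records = read_csv_records(CSV_TEXT)
json_records = read_json_records(JSON_TEXT)
```

Either way the result is a list of plain dictionaries, so the later processing steps would not need to care which of the two formats the data arrived in.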

If it’s not explicitly public domain, we’d have to work out with Rob whether we’d be able to use it. Although factual data can’t be copyrighted, there are odd laws to do with copyright of whole databases that we need to avoid being stung by.

It could also be Dublin Core XML if you like; I am currently writing a parser for it.
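As a rough idea of what reading Dublin Core XML could look like (a sketch, not the actual parser), the following uses `xml.etree.ElementTree` from the standard library. It assumes the file declares the `dc` prefix as the Dublin Core namespace `http://purl.org/dc/elements/1.1/`; the wrapper element names `records` and `record` and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the Dublin Core element set.
DC_URI = "http://purl.org/dc/elements/1.1/"

# Hypothetical sample; the <records>/<record> wrapper names are made up.
SAMPLE = """\
<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record>
    <dc:title>An Example Title</dc:title>
    <dc:creator>First Author</dc:creator>
    <dc:creator>Second Author</dc:creator>
    <dc:date>1973</dc:date>
    <dc:language>de</dc:language>
  </record>
</records>
"""


def parse_records(xml_text):
    """Return one dict per <record>, mapping each Dublin Core
    element name to the list of its values (elements may repeat)."""
    root = ET.fromstring(xml_text)
    prefix = "{" + DC_URI + "}"
    records = []
    for record in root.findall("record"):
        fields = {}
        for element in record:
            # ElementTree spells namespaced tags as "{namespace-uri}localname".
            if element.tag.startswith(prefix):
                name = element.tag[len(prefix):]
                fields.setdefault(name, []).append((element.text or "").strip())
        records.append(fields)
    return records


parsed = parse_records(SAMPLE)
```

Values are kept as lists because Dublin Core elements such as `dc:creator` can occur more than once per record. Elements outside the Dublin Core namespace are ignored in this sketch.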

Hello!

We’re a small special library in Witten, Germany, with somewhere between 2000 and 3000 books. We would like to import our data into BookBrainz.

Our catalogue data is well-formed XML and linked to Wikidata (authors, editors, publishers…) and OpenLibrary (works, editions, covers; authors and editors where there is no Wikidata qid). Here is an example entry:

  <medium>
    <dc:creator href="https://de.wikipedia.org/wiki/J%C3%BCrgen_Habermas" wikipedia="de:Jürgen Habermas" wikidata="Q76357">Jürgen Habermas</dc:creator>
    <dc:title>Erkenntnis und Interesse</dc:title>
    <subtitle>Mit einem neuen Nachwort</subtitle>
    <dc:publisher href="https://de.wikipedia.org/wiki/Suhrkamp_Verlag" wikipedia="de:Suhrkamp Verlag" wikidata="Q301609" website="https://www.suhrkamp.de/">Suhrkamp Verlag</dc:publisher>
    <placeofpub href="https://de.wikipedia.org/wiki/Frankfurt_am_Main" wikipedia="de:Frankfurt am Main" wikidata="Q1794">Frankfurt am Main</placeofpub>
    <dc:identifier>invalid-263</dc:identifier>
    <dc:date>1973</dc:date>
    <edition>1</edition>
    <dc:relation reltype="isbn">3-518-07601-9</dc:relation>
    <dc:relation reltype="openlibrary">OL1414402W</dc:relation>
    <dc:relation reltype="openlibrary">OL5500214M</dc:relation>
    <cover href="https://covers.openlibrary.org/b/id/8231950-M.jpg" openlibrary="8231950"/>
    <pages>419</pages>
    <about href="https://de.wikipedia.org/wiki/Erkenntnis_und_Interesse" wikipedia="de:Erkenntnis und Interesse" wikidata="Q1355022"/>
    <series position="1">Suhrkamp-Taschenbücher Wissenschaft</series>
    <tag href="https://de.wikipedia.org/wiki/Philosophie" wikipedia="de:Philosophie" wikidata="Q5891">Philosophie</tag>
    <mediatype>Buch</mediatype>
    <dc:language>de</dc:language>
    <created>2018-02-14</created>
    <updated>2018-08-24</updated>
  </medium>

Is there a way we can import our data? It would be great if we could do it ourselves. XSLT and shell programming are no problem. Other things like Python or node.js would be more difficult for us.

It would also be great if we could reimport our data not just once but regularly, e.g. monthly, without producing duplicates.
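One possible way to make a repeated import avoid duplicates, sketched in Python under stated assumptions: each entry keeps a stable, unique `dc:identifier`, its `updated` date changes whenever the entry changes, and the importer remembers what it saw on the previous run. The function name `plan_import` and the second identifier `invalid-264` are hypothetical. This only tracks what the same source imported before; it does not match entries against records that already exist in BookBrainz.

```python
def plan_import(records, previous_state):
    """Decide per record whether to create, update, or skip it.

    records: iterable of dicts with "identifier" and "updated" keys,
             where "updated" is an ISO date string (YYYY-MM-DD)
    previous_state: dict mapping identifier -> "updated" value that
                    was imported on the last run
    Returns (plan, new_state); new_state is what to remember next time.
    """
    plan = {"create": [], "update": [], "skip": []}
    new_state = dict(previous_state)
    for record in records:
        key = record["identifier"]
        if key not in previous_state:
            plan["create"].append(key)
            new_state[key] = record["updated"]
        elif record["updated"] > previous_state[key]:
            # ISO dates sort correctly when compared as strings.
            plan["update"].append(key)
            new_state[key] = record["updated"]
        else:
            plan["skip"].append(key)
    return plan, new_state


# State left behind by an earlier run, then a later export.
state = {"invalid-263": "2018-02-14"}
export = [
    {"identifier": "invalid-263", "updated": "2018-08-24"},
    {"identifier": "invalid-264", "updated": "2018-03-01"},
]

first_plan, state = plan_import(export, state)
second_plan, state = plan_import(export, state)
```

Here the first run would update `invalid-263` and create `invalid-264`, while running the same export again would skip both. The state dictionary could be stored between runs, for example as a JSON file.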

Hi GLBW!

Thanks for looking into this, I for one would love to see your collection added to BookBrainz.

Unfortunately we are not currently ready to import records.
The project has been in redevelopment and we do not currently have an import tool or an API that you could use to import.
However, it is very high on our roadmap, and we are aiming to tackle it in the second half of 2019.

I don’t have specifics to offer about the import process, but over time we will support different record formats.
We will also have an HTTP API that you will be able to use directly to import records.

I will get in contact once we start working on imports so that we can clarify the process and try it out with a subset of your collection.
