About the “Import other open databases” project idea

Legend11 · March 7, 2023, 9:16pm

Hi, I'm Mrigesh Thakur (GitHub: Legend101Zz). I’m currently a 2nd-year student pursuing a degree in Mathematics and Scientific Computing at NIT Hamirpur. I am a tech enthusiast with a passion for developing applications with Node.js.

I came across the “Import other open databases” idea and am very keen on working on the problem it proposes.

I have been trying to get a better understanding of the BookBrainz project. I have tested it as a user, by revising books and adding authors, and from the development side, by setting it up locally and reading through the codebase.

I have been doing some research on this project idea and also got some valuable insights from the earlier discussion of the issue (Import data from bookogs and comicogs).

I have thought of two approaches to the problem and would like to know how appropriate and relevant they are, and whether I am thinking along the right lines.

I also want a generalised solution that would make it possible to import data from any existing database into BookBrainz (and not only the records mentioned).

Using a JSON file: The major issue that I found, which was also discussed in the above-mentioned thread, is dealing with the relations in the database, matching the naming of the source data to that of the BookBrainz schema, and preventing any redundancy of data.

One idea was to create a JSON file containing all the data from the source records. Then, as we read through the JSON file to store data in our current database, we map each field to its corresponding field in the BB database, create new entries from the mapped values (while checking for duplicates), and mark the newly created entries with an “imported” tag.
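As a rough illustration of this idea, the mapping step might look like the TypeScript sketch below. Everything in it is an assumption made for the example: the record shape, the `FieldMap` type, the `importRecords` function and the title-plus-author duplicate key are hypothetical, and the real BookBrainz schema is considerably richer.

```typescript
// Hypothetical shapes for illustration only; not the real BookBrainz schema.
type SourceRecord = Record<string, string>;

interface ImportedEntry {
  name: string;
  author: string;
  imported: true; // the "imported" tag on newly created entries
  source: string; // which external database the entry came from
}

// For each target field, the name of the matching field in one source.
interface FieldMap {
  name: string;
  author: string;
}

// Lower-case and collapse whitespace so near-identical strings compare equal.
const normalise = (s: string): string =>
  s.trim().toLowerCase().replace(/\s+/g, " ");

function importRecords(
  records: SourceRecord[],
  fieldMap: FieldMap,
  source: string,
  existingKeys: Set<string> // keys of entries already in the database
): ImportedEntry[] {
  const created: ImportedEntry[] = [];
  for (const record of records) {
    const name = record[fieldMap.name];
    const author = record[fieldMap.author];
    if (!name || !author) continue; // skip records missing a mapped field

    const key = `${normalise(name)}|${normalise(author)}`;
    if (existingKeys.has(key)) continue; // redundancy check
    existingKeys.add(key);

    created.push({ name: name.trim(), author: author.trim(), imported: true, source });
  }
  return created;
}
```

Supporting another source would then only need a different `FieldMap`, for example `{ name: "title", author: "creator" }`, which is one way to keep the importer general.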

Making a temporary database: Another idea was to create a temporary collection that stores both our current database and the source records as sub-collections, set the field values that we want to match on, and then run a query to get the new entries for our database. The temporary database would be deleted once the task is completed.
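A minimal sketch of that second idea, using in-memory maps as a stand-in for the temporary collection (an assumption for the example; a real import would use an actual database). The `Row` shape, the `key` match field and the `findNewEntries` function are hypothetical.

```typescript
// Hypothetical row shape: `key` is the field value chosen for matching.
interface Row {
  key: string;
  title: string;
}

function findNewEntries(existing: Row[], incoming: Row[]): Row[] {
  // Temporary staging area holding both sides as "sub-collections".
  const staging = {
    existing: new Map<string, Row>(),
    incoming: new Map<string, Row>(),
  };
  for (const row of existing) staging.existing.set(row.key, row);
  // Later duplicates within the incoming data overwrite earlier ones.
  for (const row of incoming) staging.incoming.set(row.key, row);

  // The "query": incoming rows whose key has no match on the existing side.
  const fresh = Array.from(staging.incoming.values()).filter(
    (row) => !staging.existing.has(row.key)
  );

  // Drop the temporary data once the task is completed.
  staging.existing.clear();
  staging.incoming.clear();
  return fresh;
}
```

In a relational database the same query would usually be written as an anti-join between a staging table and the main table.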

So this was my line of thinking. I would like your views on this. I would also like to know whether time is a factor we have to consider while building it (it is a one-time import, so I did not give that factor much weight), how I could optimise these approaches, and whether I should try both approaches on small databases for the sake of testing.

pbryan · March 7, 2023, 11:03pm

Hi Mrigesh, welcome to the BookBrainz community. I replied to your messages in IRC, but you had left.

I would suggest we need some facility to hold “queued” import records from external sources, which an editor can then copy to pre-populate an entity within the user interface. The editor can then revise the entry and submit it. Upon submission, the queued import record would be marked as completed (maybe automatically, maybe explicitly by the editor). Something like that.
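To make the flow concrete, here is a minimal TypeScript sketch of such a queue. It is only an illustration under stated assumptions: the `ImportQueue` class, its method names and the two-state status are hypothetical, and the queue lives in memory rather than in the database.

```typescript
// Hypothetical status values; a real design might need more states.
type QueueStatus = "pending" | "completed";

interface QueuedImport {
  id: number;
  source: string; // external source the record came from
  data: Record<string, string>;
  status: QueueStatus;
}

class ImportQueue {
  private items = new Map<number, QueuedImport>();
  private nextId = 1;

  // Hold a record from an external source until an editor reviews it.
  enqueue(source: string, data: Record<string, string>): QueuedImport {
    const item: QueuedImport = {
      id: this.nextId++,
      source,
      data: { ...data },
      status: "pending",
    };
    this.items.set(item.id, item);
    return item;
  }

  // Records still waiting for an editor.
  pending(): QueuedImport[] {
    return Array.from(this.items.values()).filter((i) => i.status === "pending");
  }

  // Copy of the data for pre-populating the entity form.
  // The editor revises the copy; the queued record itself is unchanged.
  prefill(id: number): Record<string, string> {
    return { ...this.get(id).data };
  }

  // Called on submission, automatically or explicitly by the editor.
  markCompleted(id: number): void {
    this.get(id).status = "completed";
  }

  private get(id: number): QueuedImport {
    const item = this.items.get(id);
    if (!item) throw new Error(`No queued import with id ${id}`);
    return item;
  }
}
```

An editor session would call `prefill` to open the form, revise the values, submit the entity through the normal editing path, and then call `markCompleted`.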