GSoC 2024 - BookBrainz: New Calibre plugin

Basic Information: -

I’m Abdelhamid Ahmed Robaa, a junior computer science student at Sadat University, Egypt.

GitHub-Profile

Libera.Chat IRC: robaa

Email: - abdelhamidrobaa.mail@gmail.com

Phone: - +201030607541

Time zone: Eastern European Standard Time (GMT +2)

New Calibre Plugin

Introduction

Problem definition: As stated on the ideas page, Calibre, an established open-source e-book library manager, lacks a functional BookBrainz plugin. The previous plugin, CaliBBre, was abandoned several years ago; its codebase is 8-9 years old and incompatible with the latest versions of Calibre. As a result, Calibre users cannot draw on BookBrainz, a valuable resource for managing e-book collections.

Current solutions:

  1. Manual Metadata Editing: Users can manually edit metadata within Calibre for each e-book, including title, author, and other details. However, this process is time-consuming and prone to errors, especially for large e-book collections.

  2. Installing the CaliBBre plugin manually: However, this plugin does not work with the latest release of Calibre.

Solution: The solution can be described as a plugin that can do the following:

  1. Sending a Search Request to BookBrainz Website: -

    • Use the requests library in Python to send a search request to the BookBrainz website.

    • Construct the URL for the search query using the user’s input or predefined search parameters.

    • Send the request to the BookBrainz website and retrieve the HTML response.

  2. Parsing HTML Content to Extract JSON Results: -

    • Utilize the BeautifulSoup library to parse the HTML content received from the search request.

    • Identify the tag(s) containing the JSON data. Typically, JSON data is embedded within tags with a specific identifier like id=‘props’.

    • Extract the text content from these tags and parse the JSON data using the json module.

  3. Displaying Results on the Plugin UI:

    • Update the plugin UI with this extracted information. This involves populating a QListView or QTableView with the search results.

    • Implement user interaction features, such as allowing the user to select an item from the search results and perform further actions on it, such as:

      • Updating the metadata of a book using the update_metadata() function provided by Calibre.
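Steps 1 and 2 above can be sketched roughly as follows. This is a minimal illustration, not the final plugin code: the search URL and its q/type parameters, the "initialResults" key, and the sample HTML are assumptions to be checked against the live site. The sketch also swaps in the standard-library HTMLParser for BeautifulSoup and leaves out the actual network request, so that it runs without third-party packages.

```python
import json
from html.parser import HTMLParser
from urllib.parse import urlencode

# Assumed endpoint and parameter names; verify against the BookBrainz site.
SEARCH_URL = "https://bookbrainz.org/search"


def build_search_url(query, entity_type="edition"):
    """Build the search URL for a query string and an entity type."""
    return SEARCH_URL + "?" + urlencode({"q": query, "type": entity_type})


class PropsExtractor(HTMLParser):
    """Collect the text inside the first tag whose id attribute matches."""

    def __init__(self, target_id="props"):
        super().__init__()
        self.target_id = target_id
        self._inside = False
        self._tag = None
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # Only start collecting if nothing has been collected yet.
        if not self.chunks and dict(attrs).get("id") == self.target_id:
            self._inside = True
            self._tag = tag

    def handle_endtag(self, tag):
        if self._inside and tag == self._tag:
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_props(html):
    """Return the JSON embedded in the id='props' tag, or None if absent."""
    parser = PropsExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    return json.loads(text) if text.strip() else None


# Hypothetical page fragment standing in for a real search response.
sample_html = (
    '<html><body><div id="target"></div>'
    '<script id="props" type="application/json">'
    '{"initialResults": [{"name": "Dune", "type": "Edition"}]}'
    "</script></body></html>"
)

print(build_search_url("frank herbert", "author"))
print(extract_props(sample_html)["initialResults"][0]["name"])
```

In the plugin itself, the HTML passed to extract_props() would come from the search request, and the same extraction could be written with BeautifulSoup as described in step 2.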

Project goals (Deliverables)

Five main tasks need to be done in the plugin, namely:

  • Interactive User Interface: Implement a user-friendly interface that enables intuitive interaction, such as selecting and browsing through search results.
  • Search Functionality: Implement search functionality within the plugin, allowing users to search for book editions by name and author directly from within Calibre.
  • Documentation and Support: Provide comprehensive documentation and support resources to assist users in installing, configuring, and utilizing the plugin effectively.
  • Metadata Enhancement: Integrate features to improve metadata for e-book files, leveraging Calibre’s functionality to enhance metadata accuracy and completeness.
  • Publish Plugin: Prepare the plugin for distribution by packaging it appropriately and submitting it to Calibre’s plugin repository through MobileRead or other relevant platforms like GitHub for public access and installation.

Implementation

The plugin heavily depends on the calibre module to perform various tasks, including displaying dialogs, accessing settings, and modifying content within the Calibre application window. Below are the specific processes that the plugin will execute:

Plugin Screen: -

I designed a UI for the plugin using Qt Designer. Figure 1 shows how the user will interact with the plugin: -


Figure 1

The Search UI design is inspired by the BookBrainz website’s search interface. This interface offers three main functions: -

1. Search_by_author || Search_by_book function ()

  • Sending search requests to the BookBrainz server: This involves establishing communication with the BookBrainz server to submit author or book search queries. The function sends the author’s name or the book’s name as a parameter in the search request.

  • Parsing the HTML response: Upon receiving a response from the BookBrainz server, the function parses the HTML content, as shown in figure 2.


Figure 2

  • Filtering HTML code using BeautifulSoup: To extract the JSON data needed for book details, BeautifulSoup is used to filter out extraneous HTML elements. I have created a script that scrapes the query data, shown in figures 3 and 4.


Figure 3


Figure 4

  • Rendering output with a QListView component: The last step renders the extracted book details in a QListView, a Qt component available to Calibre plugins. This gives the user a readable, interactive list of search results.

2. Export_data_csv function (): -

This Python function facilitates the export of data to a CSV (Comma-Separated Values) file format. The function takes in data that needs to be exported to a CSV file. It opens or creates a CSV file for writing data. It writes the formatted data into the CSV file, with each row representing a record and values separated by commas.
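A minimal sketch of such an export function, using Python’s built-in csv module, is shown below. The parameter names and the sample record are illustrative assumptions, not the final interface.

```python
import csv
import io


def export_data_csv(records, fieldnames, out):
    """Write a list of dicts to an open text file as CSV, one row per record.

    Keys that are not listed in fieldnames are ignored.
    Returns the number of records written.
    """
    writer = csv.DictWriter(out, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return len(records)


# Hypothetical search results; a real call would pass a file opened with
# open(path, "w", newline="", encoding="utf-8") instead of a StringIO buffer.
results = [
    {"title": "Dune, Part One", "author": "Frank Herbert", "bbid": "ignored"},
]
buffer = io.StringIO()
written = export_data_csv(results, ["title", "author"], buffer)
print(buffer.getvalue())
```

Values that contain a comma, such as the title in the example, are quoted automatically by the csv module, so the file stays valid without manual escaping.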

3. Update_books_metadata function (): -

1. How does Calibre store metadata?

  • In Calibre, metadata is stored in metadata.db, an SQLite file in the Calibre library folder on the host device. The file consists of many tables holding information about authors, books, publishers, and more. As shown in figure 5, many-to-many link tables such as books_authors_link and books_publishers_link connect the books table to the rest of the information.
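The many-to-many relationship described above can be illustrated with a small in-memory SQLite database. The schema below is a simplified stand-in and not the real metadata.db: the table and column names are assumptions that should be checked against the actual file, and only the columns needed for the example are included.

```python
import sqlite3

# Simplified stand-in for metadata.db (assumed names, reduced columns).
SCHEMA = """
CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE books_authors_link (
    id INTEGER PRIMARY KEY,
    book INTEGER NOT NULL REFERENCES books(id),
    author INTEGER NOT NULL REFERENCES authors(id),
    UNIQUE (book, author)
);
"""


def authors_of(conn, book_id):
    """Return the author names linked to one book, in alphabetical order."""
    rows = conn.execute(
        "SELECT authors.name FROM authors "
        "JOIN books_authors_link ON books_authors_link.author = authors.id "
        "WHERE books_authors_link.book = ? ORDER BY authors.name",
        (book_id,),
    )
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO books (id, title) VALUES (1, 'Good Omens')")
conn.executemany(
    "INSERT INTO authors (id, name) VALUES (?, ?)",
    [(1, "Terry Pratchett"), (2, "Neil Gaiman")],
)
# One row per (book, author) pair links the two tables.
conn.executemany(
    "INSERT INTO books_authors_link (book, author) VALUES (?, ?)",
    [(1, 1), (1, 2)],
)
print(authors_of(conn, 1))
```

The plugin itself should go through Calibre’s own API rather than write to the database file directly; the query only shows how the link table connects a book to its authors.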


Figure 5

2. How will update_books_metadata() be implemented?

  • This function relies on the calibre.customize module, which offers two main plugin base classes, MetadataReaderPlugin and MetadataWriterPlugin. These classes provide the methods needed to read and update book metadata in Calibre. Calibre plugins can modify book metadata both in the metadata database (metadata.db) and in the metadata embedded in the book file itself, for example an EPUB.

  • For example, MetadataWriterPlugin has a method called set_metadata(stream, mi, type), where stream is a file-like object for the book file, mi is a metadata object, and type is the file type (book format).

Debugging and Testing: -

Writing tests is a crucial part of the software development lifecycle. Most software products should have an automated testing process whose mission is to ensure the delivery of correct functionality, and the BookBrainz plugin is no exception. Most components in the plugin will be accompanied by test cases that assert the component works properly. calibre-debug makes running a test script (test.py) easier:
calibre-debug -e test.py
This command runs my test.py script inside Calibre’s environment to make sure that the plugin is working correctly. See the calibre-debug docs.
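As a rough idea of what test.py could contain, here is a minimal sketch built on the standard unittest module. The join_authors() helper is hypothetical and only stands in for a real plugin function; the actual tests would import the plugin’s own code instead.

```python
import unittest


def join_authors(names):
    """Hypothetical helper: join author names into one display string."""
    cleaned = [name.strip() for name in names if name and name.strip()]
    return " & ".join(cleaned)


class JoinAuthorsTest(unittest.TestCase):
    def test_joins_multiple_authors(self):
        self.assertEqual(
            join_authors(["Terry Pratchett", "Neil Gaiman"]),
            "Terry Pratchett & Neil Gaiman",
        )

    def test_skips_blank_names(self):
        self.assertEqual(join_authors(["  Frank Herbert ", "", None]), "Frank Herbert")


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(JoinAuthorsTest)
    return unittest.TextTestRunner(verbosity=2).run(suite)


if __name__ == "__main__":
    run_tests()
```

Wrapping the runner in run_tests() avoids calling sys.exit(), so the script can also be imported and reused from other test modules.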

Documentation: -

Like testing, documentation is an important part of any software product. The plan for documentation includes explaining the data retrieval mechanisms and the functions developed, with references to the Calibre docs for each component used. For users, if the situation demands it, there will be a step-by-step guide (or tutorial) explaining how to use the plugin.

UI/UX: -

UI/UX is equally important as the other sections of the project. Before the coding work commences, a suitable amount of time should be assigned to the design of UI/UX. There are several software products out there that are used for UI design and prototyping, such as:

  • Qt Designer (Recommended)
  • Figma
  • Mockflow

Figure 1 offers a Qt Designer demo sketch for the plugin interface.

Timeline: -

Notes:

  • “Release Early / Release Often” is the philosophy followed.
  • A timeline is a flexible object, meaning that it can change according to the amount of work done and the amount demanded at any time.

Community Bonding Period

  • Socialize and interact with other community members (students and mentors)
  • Learn more about BookBrainz as an organization with a mission and vision
  • Familiarize myself further with the architecture and codebase
  • Clarify ambiguous portions in the proposal
  • Understand plugin development more deeply
  • Set up the plugin project
  • Start a blog (for future reports)
  • UI design

Proof of Concept (5 Weeks)

Period    

Work

Week 1-2

Work on data retrieval:

  • Implement the initial functionality in main.py to import data from the BookBrainz website.
  • Implement the Parser class to process the imported data.

Week 3-4

  • Work on ui.py to integrate BookBrainz data into the user interface (UI) of the Calibre plugin.
  • Implement Search based on author and book name.
  • Design and implement simple UI elements to display BookBrainz data, such as book details.

Week 5

  • Implement basic functionality to update metadata using the Calibre plugin API.
  • Publish the plugin
  • Gather feedback, and fix bugs.

Improvement Iterations (5 Weeks): -

Period    

Work

Week 6

  • Handle parsing errors when a query’s response is large.
  • Enable searching for editions.
  • Publish, gather feedback, and fix bugs.

Week 7

  • Handle metadata update errors for specific book formats.
  • Enable searching for authors.
  • Publish, gather feedback, and fix bugs.

Week 8

  • Implement the export-to-CSV functionality.
  • Publish, gather feedback, and fix bugs.

Week 9

  • Provide automated testing methods for most functions.
  • Write detailed documentation of every function, its internal mechanism, dependencies, and performance.

Week 10

  • Write a detailed step-by-step usage manual to serve as a reference for users on how to make the most out of the plugin.

About me

Why me?

  • Over the past weeks, I have spent a great deal of time inspecting the BookBrainz codebase and trying to resolve issues.
  • I took a deep dive into the plugin API documentation and built a sample plugin that changes a book’s cover in Calibre.
  • I joined the MetaBrainz community and became familiar with the project’s community and mentors.
  • I contributed to BookBrainz through BB-792.

Programming Skills: -

I started learning to program in grade 12, working with Arduino microcontrollers in C++. My aim was to build a robot that could find its way through a maze and solve it. The first two tries didn’t go well, but I learned from them, and in the last competition I got second place. That made me really enjoy programming, so I decided to study computer science in college, even though I am mostly self-taught. I am currently learning backend development using the Django framework.

  • Python, Django, Web Scraping, FastAPI, Django REST Framework.
  • Data Structures, Algorithms, Competitive Programming (with C++).
  • Java, OOP, Design Patterns.

Competitive Programming Profiles: -
Codeforces handle: arobaa23 - LeetCode profile: AbdelhamidRobaa

Side Projects: -

  • Personal Blog - Full-stack web application for publishing my own articles.
    • Technologies used: - Python, Django, HTML/CSS, ORM, MVC architecture, SQLite, Git.
    • GitHub Repo.
  • Egy-East - Desktop application for hosting movies and digital content for the Middle East. I was responsible for the backend, including database design and writing raw SQL queries. No ORM was used because the project’s goal was to learn both OOP concepts and SQL.
    • Technologies used: - Java, Swing, Microsoft SQL Server, Git, Object Oriented Programming.

Books I read:

  1. Introduction to Algorithms fourth edition (b96c749b-24d1-4aad-b85f-931c7f0b201d): - The first time I read this book, I felt lost. It explained algorithms using lots of math, which was tough to grasp at first. But as I kept at it, things started to click. It’s actually pretty cool to understand how algorithms work behind the scenes and figure out which one works best for different situations.
  2. The Subtle Art of Not Giving a F*ck (70cc1167-69b0-48f0-96a9-c277a2723603).
  3. Fundamentals of Database Systems (I will request that it be added to BookBrainz): - It not only helped me understand the nitty-gritty of databases, like constructing schemas and DFDs, but also provided invaluable insights into implementing these schemas with maximum efficiency.

FAQ: -

How much time do you have available, and how would you plan to use it?

I will be free this summer, which is why I am looking for an open-source contribution opportunity to improve my skills. I think working on this project full time over the summer will help in several ways: -

  1. My backend skills: I have already spent a lot of time understanding the BookBrainz schema, which is a big project to learn from.
  2. My experience with Python: my main programming language is C++. I have developed a full-stack website using Python; however, this project will be bigger, and I will write many more lines of code.
  3. Helping the reader community: Calibre is an amazing piece of software, and it would be great to bring BookBrainz services to it.

Currently, I am using my ASUS TUF gaming f15 laptop with the following specs:

  • CPU: - Intel Core i5-11400H (6 cores, 12 threads).
  • RAM: - 16 GB.
  • GPU: - Nvidia RTX 3050 (I love playing games like Valorant and LoL).
  • Storage: - 512 GB SSD.
  • Dual boot between Windows 11 and Linux (Zorin OS).