GSoC 17: Book Reviews


#1

Name: Akash Nagaraj
Nickname: grassknoted
IRC nick: akashn97
Email: grassknotedl@gmail.com
GitHub: grassknoted
LinkedIn: Akash Nagaraj
#Proposal

##Overview
CritiqueBrainz has a great, growing community of passionate music lovers, many of whom are also passionate readers. Collecting feedback on books from the community and making it available to others, just as we currently do for release groups, events and places, would be a great addition to CritiqueBrainz, especially with possible future extensions in mind (such as a recommendation system).
Since the necessary metadata is already available from BookBrainz by accessing its database directly, there is no need for an entirely new book database, and book reviews can be stored in the existing review database.
This GSoC proposal aims to extend the existing review system to a Book Review System.
The current CritiqueBrainz project structure is separated into 3 main packages: Data, Frontend and Web Services.

I propose a structure in which an additional database, the BookBrainz database, is connected to the data package to supply the necessary metadata, while reviews are saved in the existing review database. A thorough explanation follows in the implementation details. This also means the CritiqueBrainz backend must be extended to support book reviews, as shown above.

##Deliverables
The project can be considered successfully completed once the following deliverables are met:

  1. Extension of the current Review Database to support the additional load of the Book Reviews.
  2. Changes to the web interface to incorporate the Book Review System with a search based on:
    ○ Book Name
    ○ Author Name (a list of books by the author is returned)
    ○ ISBN
    ○ ISBN-13
    ○ Amazon Standard Identification Number
    These are the search parameters included initially; more will be added.
  3. Module to access the BookBrainz Database directly and get metadata required for search and reviews, etc.
  4. A WYSIWYG Markdown editor (a SimpleMDE extension) for the review text editor; review text will be stored in Markdown syntax.
  5. Connect every book review to the Review Database.

#Project plan and Implementation Details

The project consists of the following main parts:

  1. Extension of the current Review Database [CB-240]
  2. Search System
  3. Changes to the backend to support the Book Review System
  4. Changes to the Front End Web Interface
  5. Augmenting reviews to Review Database
The above are in chronological order of planned execution.

##Phase 1 (May 30 - June 30)

###Extension of the Current Review Database
CritiqueBrainz uses the PostgreSQL DBMS to save all reviews and user data, with the following schema:

Since the same database is already used for reviews of release groups, events and places, I believe we can continue using it for book reviews as well, without the need for an additional database.

Connecting to the BookBrainz Database:

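    # Assumes `engine` is a SQLAlchemy engine bound to the BookBrainz database
    # and `sql_file_path` points to the SQL script to run.
    with open(sql_file_path) as sql:
        connection = engine.connect()
        try:
            connection.execute(sql.read())
        finally:
            connection.close()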

###Search System
Implement a Search system to find a specific book to review. The following search parameters will be used:

  • Book Name will query the BookBrainz metadata for all books whose names match the entered query text exactly or contain it as a substring.

  • Author Name will query the BookBrainz metadata for all authors whose names contain the entered query text.

  • ISBN will be a separate function that accepts a 10-character ISBN (the input length will also be limited in the frontend) and raises an exception for any other string length or for any non-numeric character, except a final check character “X”, which is valid in a 10-digit ISBN.

  • ISBN-13 will be a separate function that accepts a 13-digit ISBN (the input length will also be limited in the frontend) and raises an exception for any other string length or for any non-numeric character.

  • Amazon Standard Identification Number (ASIN) finds books based on the identification number assigned by Amazon. I have chosen this parameter because Amazon lists close to every book, and the ASIN, while similar to the ISBN, is widely used today since so much book-buying happens on Amazon.
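
The two ISBN checks described above could be sketched roughly as follows. This is only an illustration, not existing CritiqueBrainz or BookBrainz code: the names `InvalidISBNError`, `validate_isbn10` and `validate_isbn13` are hypothetical, and the check-digit verification goes one step beyond the length and character checks listed above, so it could be dropped if it turns out to be unwanted.

```python
import re


class InvalidISBNError(ValueError):
    """Raised when a query string is not a well-formed ISBN (hypothetical name)."""


def validate_isbn10(isbn: str) -> str:
    """Return the ISBN-10 unchanged if it is valid, otherwise raise."""
    # Nine digits followed by a check character that is a digit or 'X'.
    if not re.fullmatch(r"[0-9]{9}[0-9X]", isbn):
        raise InvalidISBNError(f"not a 10-character ISBN: {isbn!r}")
    # Weighted sum with weights 10..1 must be divisible by 11 ('X' counts as 10).
    total = sum(
        (10 - i) * (10 if ch == "X" else int(ch)) for i, ch in enumerate(isbn)
    )
    if total % 11 != 0:
        raise InvalidISBNError(f"bad ISBN-10 check digit: {isbn!r}")
    return isbn


def validate_isbn13(isbn: str) -> str:
    """Return the ISBN-13 unchanged if it is valid, otherwise raise."""
    if not re.fullmatch(r"[0-9]{13}", isbn):
        raise InvalidISBNError(f"not a 13-digit ISBN: {isbn!r}")
    # Weights alternate 1, 3, 1, 3, ...; the total must be divisible by 10.
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(isbn))
    if total % 10 != 0:
        raise InvalidISBNError(f"bad ISBN-13 check digit: {isbn!r}")
    return isbn
```

Raising a `ValueError` subclass keeps the "raise an exception" behaviour described above while letting the web layer turn it into a friendly error message.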

If a user fills in multiple fields and no book matches all of them, no results will be returned. Every returned result will contain the entered query text.
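
To make the matching rules concrete, here is a minimal in-memory sketch. The `Book` record and the `search_books` function are hypothetical stand-ins for the real BookBrainz metadata queries, and case-insensitive matching is my own assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Book:
    """Hypothetical stand-in for a BookBrainz metadata record."""
    title: str
    author: str
    isbn: str


def search_books(
    books: List[Book],
    title: Optional[str] = None,
    author: Optional[str] = None,
    isbn: Optional[str] = None,
) -> List[Book]:
    """Return books that satisfy every field the user filled in."""

    def contains(value: str, query: Optional[str]) -> bool:
        # An empty field places no restriction; otherwise the query text
        # must appear inside the value (case-insensitive by assumption).
        return query is None or query.lower() in value.lower()

    return [
        book
        for book in books
        if contains(book.title, title)
        and contains(book.author, author)
        and (isbn is None or book.isbn == isbn)
    ]


# Illustrative records; the ISBN values are placeholders, not real ISBNs.
CATALOGUE = [
    Book("The Kite Runner", "Khaled Hosseini", "0000000000001"),
    Book("The Hitchhiker's Guide to the Galaxy", "Douglas Adams", "0000000000002"),
]
```

With this catalogue, `search_books(CATALOGUE, title="kite")` finds the first book, while `search_books(CATALOGUE, title="kite", author="Adams")` returns an empty list because no single book matches both fields.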

##Phase 2 (July 1 - July 28)


###Changes to the Backend to Support the Book Review System
The backend of CritiqueBrainz must be linked to the existing database to save the reviews, and an additional connection must be created between the BookBrainz database and the data module in CritiqueBrainz to retrieve book-related data.

The data module serves as an interface between the database and the CritiqueBrainz backend.
The following image is an accurate description of how CritiqueBrainz accesses the database:

To extend the backend to support the new Book Review system:
● Metadata from the BookBrainz database must be accessed.
● From this metadata, a corresponding book must be fetched.
● A new entity, with a new review_id must be created.
● This new entity will be associated with a corresponding book, with a BBID.
● The review is then processed, and a spam report is generated and is associated with every individual review.
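
The first four steps above could look roughly like this. Everything here is a hypothetical sketch: `BookReview`, `create_book_review` and the plain dictionary standing in for BookBrainz metadata are illustrative names, not part of the current CritiqueBrainz code, and spam reporting is left out.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BookReview:
    """Hypothetical review entity tied to a book through its BBID."""
    bbid: str
    user_id: str
    text: str
    entity_type: str = "book"
    # Every new review gets its own freshly generated review_id.
    review_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def create_book_review(
    book_metadata: Dict[str, dict], bbid: str, user_id: str, text: str
) -> BookReview:
    """Look up the book in the metadata, then create a review for it."""
    if bbid not in book_metadata:
        # No matching book in the (stand-in) BookBrainz metadata.
        raise LookupError(f"no book with BBID {bbid!r}")
    return BookReview(bbid=bbid, user_id=user_id, text=text)
```

Keeping the lookup and the review creation in one function means a review can never be created for a BBID that the metadata does not know about.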
###Changes to the Front end Web Interface
As you can see below, almost all of the front end has already been designed on the local instance of CritiqueBrainz that I have up and running. I haven’t done any of the backend integration yet, so the examples shown below use hard-coded values.

The front end of the Book Review system would include the following changes:

  1. Changes to Browse Reviews: The browse reviews tab will have the following changes:
    ● New Book tag, for books (like “Event”, or “Release Group”)
    ● Books are displayed alongside Release groups, Events and Places.

  2. Changes to Write a Review: The ‘Write a Review’ will have the following changes:
    ● New Book tab is added
    ● User can search for a book using Title, Author, ISBN or ASIN.

  3. After the Book Selection: Once a book is selected, the user is presented with the option to write a review, and the reviews previously written by other users are displayed.
    ● Cover page of the book is displayed
    ● Option to write a review on the top-right corner
    ● ISBN and BBID will also be displayed once it is integrated with the backend

  4. Writing the actual Review with WYSIWYG Editor: The SimpleMDE editor is integrated into CritiqueBrainz, and users can write reviews using the WYSIWYG (What-You-See-Is-What-You-Get) editor.
    ● Writing a review for the book with a WYSIWYG Markdown Editor
    ● Details of the book such as the Title and Author are displayed
    ● The cover page of the book is also displayed if available
    ● ISBN and BBID will also be displayed once it is integrated with the backend

  5. After Submitting a Review: Once a user submits a review, the user is notified that the review has been published, and other reviews by the same user are displayed.
    ● Publication is notified
    ● Other reviews by the same user are also displayed.


###Augmenting Reviews to the Review Database
Since I have already done most of the front end work, only the tasks of making it modular and integrating it seamlessly with the backend remain, so I should also be able to add the reviews to the database in Phase 2.
The newly submitted reviews must be added to the existing database, and connections must be established as follows:
● Between a review and the user who submitted it
● Between a review and the book (using the BBID)

The Relationship will be set up as follows:

This can be done with the existing schema and database that CritiqueBrainz uses to store reviews of release groups, places and events.
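
As a rough illustration of those two connections, here is a self-contained sketch using an in-memory SQLite database. It is deliberately simplified and not the real CritiqueBrainz schema: the table layout, the `entity_id` and `entity_type` column names, and the helper names `make_demo_db` and `add_book_review` are all assumptions for the example.

```python
import sqlite3
import uuid

# Simplified stand-in schema: one review row points at its author (user_id)
# and at the reviewed entity (entity_id holds the BBID for books).
SCHEMA = """
CREATE TABLE "user" (
    id TEXT PRIMARY KEY,
    display_name TEXT NOT NULL
);
CREATE TABLE review (
    id TEXT PRIMARY KEY,
    entity_id TEXT NOT NULL,
    entity_type TEXT NOT NULL
        CHECK (entity_type IN ('release_group', 'event', 'place', 'book')),
    user_id TEXT NOT NULL REFERENCES "user" (id),
    text TEXT NOT NULL
);
"""


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with the simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the user reference
    conn.executescript(SCHEMA)
    return conn


def add_book_review(conn: sqlite3.Connection, user_id: str, bbid: str, text: str) -> str:
    """Store a review linked to its author and to the book's BBID."""
    review_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO review (id, entity_id, entity_type, user_id, text) "
        "VALUES (?, ?, 'book', ?, ?)",
        (review_id, bbid, user_id, text),
    )
    return review_id
```

The point of the sketch is that book reviews only need a new `'book'` entity type alongside the existing ones; no separate review table is required.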
##Phase 3 (July 29 - August 29)


###Code Cleanup
Adding reviews to the database may extend into Phase 3, but it shouldn’t take more than 3-4 days to complete. Ideally, this phase begins with a complete cleanup of all the code I’ve written, eliminating redundancies and making it more modular and extensible.
###Testing
Testing is a crucial part of this project before we merge it into the master branch. Testing will be done on a weekly basis throughout, but the testing in this phase will evaluate all the code and wrap everything up before the final submission, to ensure there are no discrepancies or bugs. The front end will be tested extensively on multiple devices of varying screen sizes and resolutions to ensure a uniform experience.

###Documenting
Even though the code will be heavily commented, I believe that well-written documentation is always a necessity, and I will take a couple of days to document the entire book review system, to make it extremely portable and extendable.

##Post GSoC (August 30 - Indefinite :D)


After successfully completing this GSoC project, I would really like to help CritiqueBrainz build a recommendation system for the books and music and incorporate a Rating system, which I believe will complement the review system very well. After all, there is way too much crowd-sourced metadata to not be excited about!


##Proposed Project Timeline


May 30 - June 14
Extension of the Current Review Database

June 15 - June 21
Search System

June 21 - June 30
Intermediary Time for Testing, and finishing previous tasks

July 1 - July 14
Changes to the backend to support the Book Review System

July 15 - July 21
Changes to the Front end Web Interface

July 22 - August 2
Augmenting Reviews to the Review Database

August 3 - August 10
Code Cleanup

August 11 - August 20
Complete Testing and Mentor Review

August 21 - August 29
Documenting, and if time permits, setting up for a recommendation system

##Possible extensions


If everything goes as planned and I am done with my GSoC project earlier than expected, I could start working on the following ideas (in order of preference):
● Implementing Book/Music Recommendations: Although there is a lot of software out there that already does this, the amount of open metadata on our hands would give us an edge over the others. I plan on working on a very basic version of a per-user recommendation system.
● Documentation: CB and BB could use more dedicated documentation to make things easier for new users.
● Implementing a rating system for CritiqueBrainz: Since I would have a good idea of the entire code structure, I believe I could start working on a 5-star rating system that would be converted to a 0-100 scale in the backend (like MB ratings).
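
For the rating idea, the conversion itself is small. A minimal sketch, assuming a linear mapping of 20 points per star and whole-star ratings only (both are my assumptions, and `stars_to_score` is a hypothetical name):

```python
def stars_to_score(stars: int) -> int:
    """Convert a 1-5 star rating to the 0-100 scale (20 points per star)."""
    if not 1 <= stars <= 5:
        raise ValueError(f"stars must be between 1 and 5, got {stars!r}")
    return stars * 20
```

Under this mapping a 3-star rating is stored as 60, leaving 0 free to mean "no rating".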

##About Me


Tell us about the computer(s) you have available for working on your SoC project!
I have a custom built Desktop with the following specifications:
● Intel Core i5-6500K
● 128GB SSD + 2TB HDD of Storage
● 8GB DDR4 RAM
As well as a Dell Inspiron 15 5000 series, for added portability.
Both computers are dual booted; running Windows 10 alongside Ubuntu 16.04. The Desktop is additionally running Manjaro, but I mostly stick to good-old Ubuntu.

When did you first start programming?
I first started programming back when I was 13 (I’m 19 now), in the 8th grade as a part of my curriculum. In 11th grade, although I loved programming, I decided to pursue Biology, but fate always gets to you, and after my 12th grade, I got back to Computer Science, initially as an Android Developer, and then a Full Stack Freelancer, but now I’ve turned my interests towards Data Analysis and Data Retrieval.

What type of music do you listen to? (Please list a series of MBIDs as examples.)
I’m asked this question quite frequently, and my usual reply is “Good Music”, because that, in my opinion, is the most apt description of my playlist: it is a mix of almost every genre, including Country, Rap and Melodic Death Metal.
A few of the songs I looped until I couldn’t stand them anymore are:
Queen - Killer Queen (25238788-53d1-420a-b6c0-198357722a82)
SOHN - Artifice (709255c8-42a8-439d-a0c3-ef7f033f6458)
And currently; Selena Gomez & Kygo - It Ain’t Me (74d40e00-61ed-4e47-ade7-942ed7a0dd69)

If applying for a BookBrainz project: what type of books do you read? (Please list a series of BBIDs as example. (And feel free to also list music you listen to!)
Although this isn’t a BookBrainz project, it’s closely related to books, and being a voracious reader, I can’t resist listing some of the books I’ve read.
I mainly prefer fiction, science fiction and true-story books. My favourite books in these genres are (respectively):
Kite Runner (4f3c41bf-0c62-4cf3-af1c-600a96920037)
The Hitchhiker’s Guide to the Galaxy (70fe6478-e2db-4229-95c9-1553e52a8ce0)

What aspects of the project you're applying for (e.g., MusicBrainz, AcousticBrainz, etc.) interest you the most?
As I mentioned earlier, I’m very interested in Data Analysis and Data Retrieval. In addition to that, I’m an avid reader and am always on the lookout for my next book. MusicBrainz has really helped me tag my chaotic music library, and helped me find great music I would never have heard otherwise. I personally think extending the current review system to books would be a great addition. Analysis of the reviews could also lead to book recommendations as a possible future enhancement!

Have you ever used MusicBrainz to tag your files?
Yes, I have. This is how I came across the MetaBrainz Organization!

Have you contributed to other Open Source projects? If so, which projects and can we see some of your code?
I used to be more of a personal-project kind of guy, and I’m relatively new to the Open Source Community, and I haven’t made too many contributions, although I’ll list out the few I’ve made:
● I made minor documentation changes to CritiqueBrainz.
● I have written a few reviews on CritiqueBrainz, and plan on writing many more.
● I have added documentation changes to the Picard Website as well.
● As a frequent user of SymPy, I’ve made contributions related to Chi-Square tests.
● I’ve been working on the front end for the book review system, and it is very close to completion; you can see samples of it above.

What sorts of programming projects have you done on your own time?
In my free time, I try to learn new technologies and their applications. I used to be an Android App Developer (back in 2014) and a freelance Full Stack Web Developer. Now, I work mostly on Embedded Systems projects, and I’m learning about Neural Networks and Deep Learning, which I would need for Data Retrieval and Analysis projects.
I’ve made a simple text-editor, called NotePal, for Linux, and worked on text/image compression algorithms using Huffman Coding.
A project I feel would be relevant to the proposal I’m making is: A Movie-Recommendation-Website I made (but didn’t publish) using Flask. It has a 5-star rating system and uses ID3 algorithms to recommend movies to a user based on their previous likes and dislikes.

How much time do you have available, and how would you plan to use it?
I have most of my summer free, and I plan to work on the GSoC project every day. I plan to work 6-8 hours a day on weekdays, and around 4-5 hours on weekends, for the entirety of the project duration.

Do you plan to have a job or study during the summer in conjunction with Summer of Code?
I have almost my entire summer available, except during the day on weekends and during very occasional boxing matches (I represent my state at boxing tournaments). On weekends, I teach children and help out at a local orphanage. To make up for my absence during the day on weekends, I’d work at night on weekends.
Although I haven’t spoken up or contributed a lot so far, I have spent a considerable amount of time getting familiar with the CritiqueBrainz code, and I would start coding as early as possible.


#2

I would really like to show the work I’ve done so far, but it only allows 1 picture/post.
So, here is the proposal in PDF format: http://docdro.id/RF0hwUu
I’m very sorry for the extremely late proposal submission. My work computer was stolen, and this is what I managed to put together over the last few days.
Please provide your feedback.


#3

This is one important aspect that is missing from the ticket about adding book reviews. We currently don’t have a way to connect to the BookBrainz database. Their API is also quite unstable, as far as I understand.

Not sure what can be done about all that, sadly. @LordSputnik, @LeftmostCat?


#4

You should be able to post more now.


#5

I agree with Gentlecat that directly accessing the BookBrainz database would be difficult, but we could probably connect to it from the CB server. But since you’re going from the Google cloud servers to the Hetzner servers the rest of MeB is running on, data access could be quite slow.

Another option is to just use the regular dumps of the BB database to mirror the database on the CB server, which could then be accessed very easily. This could be done with a script in a cron job. However, a downside would be that the latest data wouldn’t show up until the databases were re-synchronized, and as the BB database grows in size, this won’t be a sustainable solution. We would want to address that on the BB side though by adding replication, in the same way that MB does currently.


#6

Thank you! :slight_smile:

Thank you for your response! :slight_smile:
I feel regular, frequent dumps would be the best way to go for now. Since we would be accessing the BB Database primarily during searches, speed would be an important factor, and going through the CB server would slow it down significantly.

If the database synchronization were made fairly frequent, it would be the best way to access the BB Database.
Perhaps in the near future, BB could offer a more stable API, and the Book Reviews could use that instead of the database dumps.