GSOC 2021: Series Entity-Bookbrainz: Shivam

Personal information

Nickname: Shivam
IRC nick: ShivamAwasthi
GitHub: the-good-boy

Proposal

Project Overview

BookBrainz is missing a way to represent series. This project will implement a series entity for BookBrainz. A series entity will be a group of related entities of the same type (work, edition, edition-group), which may or may not be ordered.

Features

• Similar to other entities, a user can enter the Name, Language, Disambiguation, Alias, Identifiers for the series entity.
• The relationship editor will allow the user to connect the current series entity to any previously defined entity with a relationship having a link phrase ‘TargetEntity is a part of CurrentSeries’.
• The relationship editor will contain one more field where the user can enter a number to indicate the position of target entity in the series.
• If a user does not enter the position, then the series will be considered unordered.
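
The ordering rule in the last bullet could be sketched roughly as follows. This is only an illustration: the helper name orderSeriesItems and the {bbid, position} item shape are assumptions, as is the choice to treat a series as ordered only when every item has a position.

```javascript
// Hypothetical helper: the name and the {bbid, position} item shape are
// assumptions made for illustration only.
function orderSeriesItems(items) {
	// Assumption: the series counts as ordered only if every item has a position.
	const isOrdered = items.length > 0 &&
		items.every((item) => Number.isInteger(item.position));

	if (!isOrdered) {
		// Unordered series: keep the items in the order they were entered.
		return {isOrdered, items: [...items]};
	}

	// Ordered series: sort a copy by position, lowest first.
	const sorted = [...items].sort((a, b) => a.position - b.position);
	return {isOrdered, items: sorted};
}

const ordered = orderSeriesItems([
	{bbid: 'work-b', position: 2},
	{bbid: 'work-a', position: 1}
]);
const unordered = orderSeriesItems([
	{bbid: 'work-b', position: null},
	{bbid: 'work-a', position: 1}
]);
```

Here ordered.items starts with work-a, while unordered.items keeps the entry order because one position is missing.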

Database Changes

The proposed schema for the Series entity will be similar to other entities as each Series entity will have a Name, Sort Name, Language, Disambiguation, Alias, Identifier, RelationshipSet, Annotation.

One noticeable change in the Database is that now each relationship can have some attributes associated with it. For example, in case of series, we can add a position to each relationship which can determine the order of each entity in that series.

After a brief discussion, I also considered including an ‘Ordering Type’ as an additional property of the Series entity (inspired by the MusicBrainz editor). However, I felt that this would be somewhat redundant, as the user can still leave the modifier field empty even if they select manual ordering.

A rough diagram for the proposed Series entity schema is shown:

Tentative SQL for the proposed schema changes will be as follows:

ALTER TYPE bookbrainz.entity_type ADD VALUE 'Series';

CREATE TABLE bookbrainz.series_header (
    bbid UUID PRIMARY KEY,
    master_revision_id INT,
    FOREIGN KEY (bbid) REFERENCES bookbrainz.entity (bbid)
);
CREATE TABLE bookbrainz.series_data (
	id SERIAL PRIMARY KEY,
	alias_set_id INT REFERENCES bookbrainz.alias_set (id),
	identifier_set_id INT REFERENCES bookbrainz.identifier_set (id),
	relationship_set_id INT REFERENCES bookbrainz.relationship_set (id),
	annotation_id INT REFERENCES bookbrainz.annotation (id),
	disambiguation_id INT REFERENCES bookbrainz.disambiguation (id),
	language_set_id INT,
	entity_type bookbrainz.entity_type NOT NULL
);
CREATE TABLE bookbrainz.series_revision (
	id INT REFERENCES bookbrainz.revision (id),
	bbid UUID,
	data_id INT REFERENCES bookbrainz.series_data(id),
	is_merge BOOLEAN NOT NULL DEFAULT FALSE,
	PRIMARY KEY (
		id, bbid
	)
);

ALTER TABLE bookbrainz.series_revision ADD FOREIGN KEY (bbid) REFERENCES bookbrainz.series_header (bbid);
ALTER TABLE bookbrainz.series_header ADD FOREIGN KEY (master_revision_id, bbid) REFERENCES bookbrainz.series_revision (id, bbid);

There will also be a change to the relationship table:

CREATE TYPE bookbrainz.attribute_type AS ENUM (
	'Position',
    'PageNo',
    'Date'
);

ALTER TABLE bookbrainz.relationship ADD COLUMN attribute_type bookbrainz.attribute_type DEFAULT NULL;

Each new type of attribute can have its own table as follows:

CREATE TABLE bookbrainz.relationship_attribute_position (
    id SERIAL,
    rel_id INT REFERENCES bookbrainz.relationship (id),
    position INT DEFAULT NULL,
    PRIMARY KEY (
        id,
        rel_id
    )
);

I will also create a Series View by joining all these tables.

ORM Changes

Models for series, series_header, series_data, series_revision will be created. Most of these are similar to the models of other entities.

Series data:

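// Helper path assumed to match the other data models in the ORM.
import {camelToSnake, snakeToCamel} from '../../util';

export default function seriesData(bookshelf) {
	const SeriesData = bookshelf.Model.extend({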
		aliasSet() {
			return this.belongsTo('AliasSet', 'alias_set_id');
		},
		annotation() {
			return this.belongsTo('Annotation', 'annotation_id');
		},
		disambiguation() {
			return this.belongsTo('Disambiguation', 'disambiguation_id');
		},
		format: camelToSnake,
		idAttribute: 'id',
		identifierSet() {
			return this.belongsTo('IdentifierSet', 'identifier_set_id');
		},
		languageSet() {
			return this.belongsTo('LanguageSet', 'language_set_id');
		},
		parse: snakeToCamel,
		relationshipSet() {
			return this.belongsTo('RelationshipSet', 'relationship_set_id');
		},
		tableName: 'bookbrainz.series_data',
		
	});

	return bookshelf.model('SeriesData', SeriesData);
}

Series:

export default function series(bookshelf) {
	const SeriesData = bookshelf.model('SeriesData');

	const Series = SeriesData.extend({
		defaultAlias() {
			return this.belongsTo('Alias', 'default_alias_id');
		},
		idAttribute: 'bbid',
		initialize() {
			this.on('fetching', (model, col, options) => {
				if (!model.get('revisionId')) {
					options.query.where({master: true});
				}
			});

			this.on('updating', (model, attrs, options) => {
				options.query.where({master: true});
			});
		},
		revision() {
			return this.belongsTo('SeriesRevision', 'revision_id');
		},
		tableName: 'bookbrainz.series'
	});

	return bookshelf.model('Series', Series);
}

Series Header:


export default function seriesHeader(bookshelf) {
	const SeriesHeader = bookshelf.Model.extend({
		format: camelToSnake,
		idAttribute: 'bbid',
		parse: snakeToCamel,
		tableName: 'bookbrainz.series_header'
	});

	return bookshelf.model('SeriesHeader', SeriesHeader);
}

Models for different attributes will also be created.
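
As a rough sketch, a model for the position attribute could mirror the SeriesHeader model above. The model name RelationshipAttributePosition, the relationship() relation and the registered 'Relationship' model name are assumptions; the table and column names come from the tentative SQL. It is written as a plain function here, whereas in the ORM it would be a default export like the other models.

```javascript
// Sketch only: model and relation names are assumptions, not existing ORM code.
function relationshipAttributePosition(bookshelf) {
	const RelationshipAttributePosition = bookshelf.Model.extend({
		idAttribute: 'id',
		// Each attribute row belongs to exactly one relationship (rel_id column).
		relationship() {
			return this.belongsTo('Relationship', 'rel_id');
		},
		tableName: 'bookbrainz.relationship_attribute_position'
	});

	// Register the model under its name so other code can look it up.
	return bookshelf.model(
		'RelationshipAttributePosition', RelationshipAttributePosition
	);
}
```

Models for other attribute types (for example page number or date) would follow the same pattern with their own table names.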

Server Side Changes

I will create a function to initialise the series routes.

Location: src/server/routes.js

function initSeriesRoutes(app){
    app.use('/series', seriesRouter);
}

function initRoutes(app){
    initSeriesRoutes(app);
}

The actual routes for the series entity:

Most of the code here will be similar to other entities as we will be using the same saving mechanism as other entities. Some of the example code is shown here:
Location: src/server/routes/entity/series.js

function transformNewForm(data) {
	const aliases = entityRoutes.constructAliases(
		data.aliasEditor, data.nameSection
	);

	const identifiers = entityRoutes.constructIdentifiers(
		data.identifierEditor
	);

	const relationships = entityRoutes.constructRelationships(
		data.relationshipSection
	);
    
	return {
		aliases,
		annotation: data.annotationSection.content,
		disambiguation: data.nameSection.disambiguation,
		identifiers,
		note: data.submissionSection.note,
		relationships,
	};
}

const createOrEditHandler = makeEntityCreateOrEditHandler(
	'series', transformNewForm, []
);

const mergeHandler = makeEntityCreateOrEditHandler(
	'series', transformNewForm, [], true
);


const router = express.Router();


router.get(
	'/create', auth.isAuthenticated, middleware.loadIdentifierTypes,
	middleware.loadLanguages, middleware.loadRelationshipTypes,
	(req, res) => {
		const {markup, props} = entityEditorMarkup(generateEntityProps(
			'series', req, res, {}
		));

		return res.send(target({
			markup,
			props: escapeProps(props),
			script: '/js/entity-editor.js',
			title: props.heading
		}));
	}
);

router.post('/create/handler', auth.isAuthenticatedForHandler,
	createOrEditHandler);

router.param(
	'bbid',
	middleware.makeEntityLoader(
		'Series',
		[],
		'Series not found'
	)
);



router.get('/:bbid/delete', auth.isAuthenticated, (req, res) => {
	_setSeriesTitle(res);
	entityRoutes.displayDeleteEntity(req, res);
});

router.post(
	'/:bbid/delete/handler', auth.isAuthenticatedForHandler,
	(req, res) => {
		const {orm} = req.app.locals;
		const {SeriesHeader, SeriesRevision} = orm;
		return entityRoutes.handleDelete(
			orm, req, res, SeriesHeader, SeriesRevision
		);
	}
);

Frontend Changes

In client/component/layout, I will add the link for Add Series.

I will also add an icon to ENTITY_TYPE_ICONS after some discussion.

I will also create a SeriesPage and SeriesTable component for displaying the Series entity.

Entity Editor:

The user will select the Entity-type of the series in the Series-section of the entity editor. This would later be used to filter out the entities. I think we will probably have to place the Series-section before the relationship-section, unlike the other entities where the relationship-section comes first. I think I will be able to sort this out once I actually get started on the project.

A major portion of this project will involve leveraging the current entity-editor and the relationship-modal to add attributes to relationships. For example: In the case of Series Entity, the relationship modal should allow the user to enter the position of each relationship.

The relationship-editor modal is expected to look something like this:

The exact strategy for doing this has to be discussed well before implementing, so that it can be scaled nicely for other entities. I haven’t finalised the exact strategy that I will use to tackle this, but will give an overview of one such method:

Currently, our relationship-editor makes use of state which looks something like this:

state = {
    relationship,
    relationshipType,
    targetEntity
}

We can add some more keys to this state object like this:

state = {
    relationship,
    relationshipType,
    targetEntity,
    attributeType,
    attributes
}

I will make use of some helper function to get attribute fields for a particular attributeType using some mapping.

const attributeFieldElement = getAttributeField(attributeType)

Here, attributeFieldElement is an element which will display the attribute-fields for different relationship-types in the Relationship Modal.
attributes is an object which stores the attribute value entered by the user. (For example: for Series, it will contain the position of the relationship).
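
One possible shape for the mapping behind getAttributeField() is sketched below. To keep the example free of React, it returns a plain field description rather than an element; the ATTRIBUTE_FIELDS table and its keys (inputType, label, name) are assumptions, while the attribute type names are taken from the proposed enum.

```javascript
// Hypothetical mapping from attribute type to the input it needs in the modal.
const ATTRIBUTE_FIELDS = new Map([
	['Date', {inputType: 'date', label: 'Date', name: 'date'}],
	['PageNo', {inputType: 'number', label: 'Page number', name: 'pageNo'}],
	['Position', {inputType: 'number', label: 'Position in series', name: 'position'}]
]);

function getAttributeField(attributeType) {
	// Unknown or missing attribute types get no extra field.
	return ATTRIBUTE_FIELDS.get(attributeType) || null;
}

const positionField = getAttributeField('Position');
const noField = getAttributeField(null);
```

The modal could then render one input per description, or nothing when the result is null.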

Now we will make some changes to our eventHandler functions:

handleEntityChange = (value: EntitySearchResult) => {
	this.setState({
		relationship: null,
		relationshipType: null,
		targetEntity: value,
		attributeType: null,
		attributes: null
	});
};

handleRelationshipTypeChange = (value: _Relationship) => {
	this.setState({
		relationship: value,
		relationshipType: value.relationshipType,
		attributeType: getAttributeType(value.relationshipType)
	});
};

Here, getAttributeType() will return the corresponding attribute type for a particular relationshipType by using a mapping mechanism. If there is no attribute associated with this type, it will return null and hence, no extra field will be there in the relationship modal.
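
A minimal sketch of such a mapping is shown here. The relationship type ID is made up for the example (the real value would come from the relationship types stored in the database), and keying the lookup on relationshipType.id is an assumption.

```javascript
// Hypothetical ID for the "is part of series" relationship type.
const SERIES_MEMBERSHIP_TYPE_ID = 70;

// Relationship types that carry an attribute, keyed by relationship type ID.
const ATTRIBUTE_TYPE_BY_RELATIONSHIP_TYPE = new Map([
	[SERIES_MEMBERSHIP_TYPE_ID, 'Position']
]);

function getAttributeType(relationshipType) {
	if (!relationshipType) {
		return null;
	}
	// Types without an entry have no attribute, so no extra field is shown.
	return ATTRIBUTE_TYPE_BY_RELATIONSHIP_TYPE.get(relationshipType.id) || null;
}

const seriesAttributeType = getAttributeType({id: SERIES_MEMBERSHIP_TYPE_ID});
const otherAttributeType = getAttributeType({id: 1});
```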

Then, we can define a function renderAttributeFields(): In this function we will make use of our attributeFieldElement to render the input fields for the particular relationship-type.
Input to these fields will then be handled by an eventHandler and its values will be added to the attributes object.
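
The state update behind such a handler might look like the sketch below. The function name withAttribute is hypothetical; in the component, its result would be passed to this.setState().

```javascript
// Sketch of the state update for a hypothetical attribute input handler.
function withAttribute(state, name, value) {
	return {
		...state,
		// Copy the old attributes (if any) so the previous state is not mutated.
		attributes: {...state.attributes, [name]: value}
	};
}

const before = {attributeType: 'Position', attributes: null};
const after = withAttribute(before, 'position', 3);
```

Spreading a null attributes value is safe, so the first input after a reset simply creates the object.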

Now when we add this relationship by using our action-creator, each relationship row in relationships object will also have a key named attributes and attributeType.

In the backend, when we transform the form data in transformNewForm(), we can do something like this:

function transformNewForm(data) {

	const attributes = entityRoutes.constructAttributes(
		data.relationshipSection
	);
}

In our entityRoutes file we can create a function like this:

import _ from 'lodash';

// Build one attribute record per relationship row submitted by the editor.
export function constructAttributes(relationshipSection) {
	return _.map(
		relationshipSection.relationships,
		({rowID, attributeType, attributes}) => ({
			relationship_id: rowID,
			attributes,
			attributeType
		})
	);
}

Now, we can save this attribute in its corresponding model by again doing some sort of mapping:

const AttributeModel = getAttributeModel(attribute.attributeType)

The getAttributeModel() function will map each attributeType to its corresponding model.

Basically, the name of the model for a particular attributeType will be RelationshipAttribute${attributeType}.
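
A sketch of that lookup, under the assumption that the ORM object exposes the attribute models by name in the same way it exposes SeriesHeader and SeriesRevision. Passing orm as an explicit argument is a choice made for this example only.

```javascript
// Sketch: `orm` is assumed to expose the attribute models by name.
function getAttributeModel(orm, attributeType) {
	// Naming convention from the proposal: RelationshipAttribute<AttributeType>.
	const modelName = `RelationshipAttribute${attributeType}`;
	const AttributeModel = orm[modelName];
	if (!AttributeModel) {
		throw new Error(`No model registered for attribute type "${attributeType}"`);
	}
	return AttributeModel;
}

// Stand-in for the real ORM object, for illustration only.
const fakeOrm = {RelationshipAttributePosition: {name: 'position-model'}};
const PositionModel = getAttributeModel(fakeOrm, 'Position');
```

Failing loudly on an unknown attribute type makes a missing model show up at save time rather than as silently dropped data.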

This is a very high level overview of one approach, and I would like to improve it by hearing the suggestions of the mentors.

TODO:
Solidify an approach for the relationship-attributes by discussions and suggestions.

TIMELINE

I will admit that I don't have much experience with BookshelfJS, but I will familiarize myself with it by the time we start working on the project. I also have very little experience writing tests with Chai and Mocha, so I am going to work on familiarizing myself with them in the next few weeks.

Pre-Community Bonding Period:
I would like to give a more solid shape to the approach for tackling the relationship-attributes.

Community Bonding Period:
I would like to spend this time exploring BookshelfJS. I would also like to set up our database schema changes and start writing models for the ORM.

Week 1 and 2:
I would like to write the ORM models and tests in the first two weeks. All tests will be written alongside the models.

Week 3 and 4:
After setting up our database, I would like to start working on our relationship-editor.

Week 5:
I would like to start writing our server routes in Week 5.

Week 6:
I would like to set up our entity-editor. By week 6, we should be able to submit the form.

Week 7 and 8:
I would like to finish writing our server routes and complete the entity-saving mechanism. The merge and delete functionality will also be done in this period.

Week 9 and 10:
Finally, I would like to set up all the display pages for the entity.

STRETCH GOAL
If I have time, I would like to work on adding more functionality to the modifier field by personalising it for different entities. More specifically, I would like to work on the issue ‘[BB-289]: Allow specifying order (and page numbers?) of works in editions’. I think this would be a great use-case for this modifier field.

Detailed information about yourself

  • Tell us about the computer(s) you have available for working on your SoC project!
    I have a Legion Y530 laptop with i5 processor and 8gb RAM.

  • When did you first start programming?
    I started programming when I was in 6th standard in school. We started with JAVA.

  • What type of music do you listen to? (Please list a series of MBIDs as examples.)
    I love Twenty One Pilots, Pink Floyd, Led Zeppelin.

  • What aspects of the project you’re applying for (e.g., MusicBrainz, AcousticBrainz, etc.) interest you the most?
    Like I said, I love reading.

  • Have you ever used MusicBrainz to tag your files?
    No :frowning:

  • Have you contributed to other Open Source projects? If so, which projects and can we see some of your code?
    I am relatively new to open source.

    • If you have not contributed to open source projects, do you have other code we can look at?
      Yeah sure. I learnt Express, Node, React and Redux by doing some hands-on projects. One of them is here: GitHub - the-good-boy/DevConnector
  • What sorts of programming projects have you done on your own time?
    I practice competitive programming in my spare time.

  • How much time do you have available, and how would you plan to use it?
    I will be able to give about 5 hrs/day during the course of the project. I will try to put in some extra hours if the necessity arises.

  • Do you plan to have a job or study during the summer in conjunction with Summer of Code?
    I will have my summer vacations during the time, but yes, I will also do some study in conjunction with SoC.


@mr_monkey, this is the initial draft for my proposal. I would love to hear your review and suggestions to improve this proposal.

You’re off to a good start!

Overall I think you have a good understanding of the project and where it all fits in the codebase.

A few initial remarks:

  • We’ll need an entity_type column in the series_data table like we do for example in the user_collection table, in order to be able to fetch and restrict the type of entities in a series
  • The modifier idea needs to be expanded to allow for different use cases. You are right when you say it can be used for other relationships, but we will need different types of attributes (that’s what I would call those), not just an int.
    For example ordering for series and the ticket BB-289 you mention, but for other relationship types we will want a start and end date (Author X was employed by Publisher Y from date_start to date_end), just to give an example.
    So I think we will need a more complex setup with more schema changes: tables for extra relationship attributes (one table per type e.g. relationship_order, relationship_date, etc.), a column in the relationship table to point to an attribute row, and a way to know which relationship attribute table to fetch from depending on the relationship type (if relationship type = “entity belongs to series”, then fetch rel. attribute from relationship_order table).
    I know this is a very short outline, and I’m happy to discuss the structure of this in more detail. And implementing other rel. attributes is a good stretch goal.
  • Looking at the timeline, I think the work you carved for week 1 will take more time than that, while the work for weeks 7&8 will probably be done in a week (both deleting and merging are already set up for all entities; I don’t think you’ll find a lot of special cases for either)
    I would also be more comfortable if the tests were planned all throughout as you code each component, rather than being left to the end of the project. In my experience there’s never enough time for tests otherwise :slight_smile:

Alright. I guess I can handle this in the series-section of the entity-editor.

I understand what you mean. Depending on our use-cases, we can make tables like relationship_order, relationship_page, relationship_data_start, etc. I guess we can come up with a mechanism to point to the right table for different relationship types. I am excited to hear and discuss more implementation details of this (if you already have an approach in mind).

I understand that the database changes will be relatively more complex than what I initially had in mind. I will make the required changes soon. Thank you!


@mr_monkey I have made some changes in the proposal and would like to hear your thoughts on the same. :slight_smile:

Thanks for those modifications @ShivamAwasthi !

At first glance the new relationship attributes look pretty good. I think ultimately relationship_attribute_position and relationship_attribute_page could be combined into relationship_attribute_numeric, since they are both just integers. But you’ve got the general idea of it and that’s good as an example.