Rendering graphs in AcousticBrainz

#1

While working on the statistics and data description idea, I ran into an issue with rendering a large number of data points.
Currently we store the points in a table and feed them to a client-side plotting library (Highcharts).
But for a large number of data points (for drawing box/whisker plots), this would use a lot of bandwidth. A possible solution could be to create scripts that save SVG images of these graphs, and then query these images instead of the data points, i.e. server-side rendering of graphs.

Should we consider using this approach?

#2

Thanks for raising this issue.
It looks like Highcharts supports many different types of graphs, including box/whisker: https://www.highcharts.com/docs/chart-and-series-types/box-plot-series. From the documentation on this page it doesn’t look to me like the amount of data that is sent to the client will be very large, so I don’t think that this is a huge problem.
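
To illustrate why the payload can stay small: a box plot series only needs five numbers per box (low, lower quartile, median, upper quartile, high), so the raw points can be reduced on the server before anything is sent. Below is a minimal sketch of that reduction. The function names and the payload layout are hypothetical, and it assumes whiskers at the minimum and maximum rather than any outlier rule.

```python
import json
import statistics


def five_number_summary(values):
    """Reduce raw values to [low, q1, median, q3, high] for one box.

    Assumption: whiskers are placed at the min and max of the data.
    """
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return [ordered[0], q1, median, q3, ordered[-1]]


def boxplot_payload(groups):
    """Build a small JSON-serialisable payload from {name: raw values}.

    The "categories"/"data" layout is a hypothetical shape, not an
    existing AcousticBrainz API response.
    """
    return {
        "categories": list(groups),
        "data": [five_number_summary(v) for v in groups.values()],
    }


if __name__ == "__main__":
    groups = {"a": range(1, 10), "b": range(0, 10001)}
    # The payload size depends on the number of boxes, not on the
    # number of raw data points behind each box.
    print(json.dumps(boxplot_payload(groups)))
```

With this kind of reduction, a box built from ten thousand points costs the same five numbers on the wire as a box built from ten.
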
However, I think that there’s an additional consideration similar to your question, and that is the appearance of the graphs. I think that it’s important that we have good looking graphs, and I don’t know if Highcharts allows a high level of customisation. As an example of graph customisation, look at the demo data graphs that we did for AcousticBrainz:

We used some of the points from this data visualisation workshop in order to make the graphs look better.

I think that it would be a good idea to check if Highcharts allows us to perform this kind of customisation of the graphs. If it doesn’t, then I think that switching to a different graphing tool would be a good idea.

Using SVG is a good idea, but PNG may not be much larger in filesize either. It looks like SVG works in all browsers, so this could be a great way to get some really high quality, good looking graphs: https://caniuse.com/#feat=svg

Here are some additional options for how to serve graphs:

  1. Use a JS-based graphing tool (Highcharts, d3.js/plotly.js).
  2. Use a Python graphing tool to generate files and serve them (this has problems if the files are deleted for some reason).
  3. Use a Python tool (matplotlib, seaborn) to generate image files directly at an API endpoint. This could allow you to have a relatively generic endpoint and add some parameters to allow you to set labels/sizes/data if you wanted. You would have to check how quickly you could generate this image. It should be almost as fast as just returning data to a JavaScript tool.
  4. Do the above, but cache the result of the API call somewhere (redis?) so that most calls to the API endpoint return quickly.

Each item builds on the previous one. I think that 4. is a good option, but it’s the most work. I’d consider it only if the JS tools don’t provide enough flexibility in the way that we want to generate the graphs.
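
As a rough sketch of how options 3 and 4 could fit together: render the image from the request parameters, and store the result under a key derived from those parameters so that repeated requests skip the rendering step. Everything here is an assumption for illustration. The dict stands in for redis, the hand-written SVG renderer stands in for a matplotlib/seaborn call, and the names `cache_key`, `render_box_svg` and `get_graph` are hypothetical. Cache invalidation when the underlying data changes is not handled.

```python
import hashlib
import json

# In-memory stand-in for an external cache such as redis (assumption).
_cache = {}


def cache_key(params):
    """Derive a stable key from request parameters, ignoring key order."""
    canonical = json.dumps(params, sort_keys=True)
    return "graph:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def render_box_svg(summary, width=300, height=60):
    """Draw one horizontal box plot as SVG from [low, q1, median, q3, high].

    Placeholder renderer: a real endpoint would more likely call a
    plotting library here.
    """
    low, q1, median, q3, high = summary
    span = (high - low) or 1  # avoid division by zero for constant data

    def x(value):
        # Map a data value to a horizontal pixel position with 10px margins.
        return round(10 + (value - low) / span * (width - 20), 1)

    mid = height / 2
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f'<line x1="{x(low)}" y1="{mid}" x2="{x(high)}" y2="{mid}" stroke="black"/>'
        f'<rect x="{x(q1)}" y="10" width="{round(x(q3) - x(q1), 1)}" '
        f'height="{height - 20}" fill="white" stroke="black"/>'
        f'<line x1="{x(median)}" y1="10" x2="{x(median)}" y2="{height - 10}" stroke="black"/>'
        "</svg>"
    )


def get_graph(params, render=render_box_svg):
    """Return the SVG for params, rendering only on a cache miss."""
    key = cache_key(params)
    if key not in _cache:
        _cache[key] = render(params["summary"])
    return _cache[key]
```

An API endpoint would call `get_graph` with the parsed request parameters and return the string with an `image/svg+xml` content type. Only the first request for a given set of parameters pays the rendering cost.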
