I’m attempting to work with the recordings JSON data dump but am running into a problem.
I’ve downloaded the 20240925 data file (recording.tar.xz) from this page and extracted it using 7-Zip. That results in a ~1.1 GB data file along with the README and other accompanying files.
Using the pandas Python package to read the data file, however, I’m only getting 125,693 lines (i.e. recordings), whereas this page leads me to think I should be getting around 33 million.
Is anyone able to provide suggestions on what I might be doing wrong - or confirm that the JSON data dump does contain all the records it’s supposed to?
Happy to post more details including the Python code I’m using if that’s helpful.
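As a sanity check that does not depend on pandas, the lines in the extracted file could be counted directly. The sketch below is only an illustration, not the code referred to above: it assumes the data file is in JSON Lines form (one JSON object per line), and the function name `count_json_lines` and the command-line path argument are hypothetical.

```python
import json
import sys


def count_json_lines(lines):
    """Return (non_empty_lines, lines_that_parse_as_json) for an iterable of strings."""
    total = 0
    valid = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        total += 1
        try:
            json.loads(line)
            valid += 1
        except json.JSONDecodeError:
            pass  # counted in total, but not as a valid record
    return total, valid


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python count_lines.py <path to the extracted data file>
    # The file is streamed line by line, so it is never loaded into memory at once.
    with open(sys.argv[1], "r", encoding="utf-8") as fh:
        total, valid = count_json_lines(fh)
    print(f"non-empty lines: {total}, valid JSON records: {valid}")
```

If the count from this check is also far below the expected figure, the extraction step is the more likely culprit. If it is close to the expected figure, the way the file is being read with pandas is worth a closer look.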