TL;DR ==> Skip the Introduction and scroll to “The Actual Tests”
One of the cooler (to me) features of my PopRock app is that it combines related artists for some of the charts–for example, when the same artist was in multiple bands, the way Ronnie James Dio was in Dio, Elf, Black Sabbath, Rainbow, and so on. That was fun to work on and most triumphant when I finally got it to work.
MusicBrainz and LastFM, however, have provided me with a somewhat similar challenge. In my first post about using the MusicBrainz and LastFM APIs, I mentioned that Alice Cooper has two MBIDs–one for the band (1964-1975 according to MusicBrainz) and another for the lead singer cum solo artist (or is it “solo artist née lead singer”?). I’ve stored those two MBIDs, along with several others, in a Python list that I loop through to get data about multiple artists. I wasn’t yet ready to store this data in my database, so I was saving each artist’s daily data in a JSON file (which Python makes so freaking easy, by the way!) like so:
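import json
import time

# artistName and artist come from the loop over the MBID list
artistNameFor_file_name = artistName.replace(' ', '')
dateFor_file_name = time.strftime("%m-%d-%y")
artistJSON = json.dumps(artist, indent=4)
with open('data/' + artistNameFor_file_name + '_' + dateFor_file_name + '.json', 'w') as f:
    f.write(artistJSON)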
It works totally awesome — even for artists with unusual characters in their name such as Mötley Crüe. It doesn’t work so well for artists with exactly the same name such as Alice Cooper and Alice Cooper. In the latter case, data for the second Alice Cooper overwrites the file and data for the first Alice Cooper.
I could use the ‘a’ (instead of the ‘w’) argument to append, but getting it to merge the data the way I want* isn’t really worth the time and effort it would take to write that script. What I’ll do, instead, is write the PHP for putting both sets of data where they belong in my MySQL database, which I have to do anyway. What I noticed while writing that previous post was LastFM isn’t a pretentious wiener like the guys who work for Championship Vinyl in High Fidelity (and MusicBrainz contributors). LastFM uses only one of those MBIDs for Alice Cooper. I just have to test both MBIDs again to see which one; then I can write … wait … I could add some … other string from an artist’s data to the filename … OR I could include the time in addition to the date … and that might be helpful for … something else as well. But, as I said, I need this PHP written as well.
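A minimal sketch of that filename idea, assuming the "other string" is the artist's MBID and that a time-of-day stamp is added next to the date. The function name `snapshot_filename` and the MBIDs are placeholders made up for illustration, not values from my script or from MusicBrainz.

```python
import time


def snapshot_filename(artist_name, mbid, when=None):
    """Build a JSON filename that stays unique even when two artists share a name."""
    # Default to "now"; a struct_time can be passed in to make the result repeatable.
    when = time.localtime() if when is None else when
    name_part = artist_name.replace(' ', '')
    # Assumption: the first 8 characters of the MBID are enough to tell the
    # entries apart. Use the whole MBID for a hard guarantee.
    mbid_part = mbid[:8]
    stamp = time.strftime('%m-%d-%y_%H%M%S', when)
    return 'data/' + name_part + '_' + mbid_part + '_' + stamp + '.json'


# Placeholder MBIDs, NOT the real MusicBrainz values.
group_file = snapshot_filename('Alice Cooper', '11111111-aaaa-bbbb-cccc-000000000001')
person_file = snapshot_filename('Alice Cooper', '22222222-aaaa-bbbb-cccc-000000000002')
```

With the MBID fragment in the name, the two Alice Cooper files no longer collide, and the time stamp keeps several runs on the same day apart.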
Back to the primary topic:
I need a script (and maybe a … are they called “lookup” tables?) that knows to put data from both Alice Cooper MBID … crap … no, wait … because these are hard, quantifiable (yes, I know I’m mis-using that word … be thankful I didn’t say “mis-abusing”) numbers I can add them! That’s the whole purpose behind getting LastFM data in the first place!
I am hoping Joan Jett is equally easy. For some reason, MusicBrainz contributors have her under Joan Jett and Joan Jett and the Blackhearts. I have no idea why. She’s never done anything without the Blackhearts. This is not a Tom Petty and Tom Petty & the Heartbreakers situation … aw, man! … I also need to deal with Tom Petty.
Spotify has one artist id for all Alice Cooper albums, and one for Joan Jett as well. They have different, separate ids for the two Tom Petty entities, which I’m okay with (and I’m sure Spotify is relieved). There are other issues with Spotify, some odd-but-whatever and some plain inaccurate, but … that’s a whole different, well, issue.
The Actual Tests
TEST #1 – Part 1
First, I’ll demonstrate the “problem” by showing the response to a request to MusicBrainz for the Alice Cooper group’s Release Groups (a list of albums containing all the different releases–like other countries, etc.–of that album).
OMG! I just noticed there is a “type” property with a “Group” value! Woo-hoo! That is going to make everything sooooo easy.
I notice, with a smile, this MBID has a “type” of “person” key/value pair.
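Here is a rough sketch of how that “type” value could be used in code. It assumes the data comes from the MusicBrainz `ws/2` artist lookup with `inc=release-groups` and `fmt=json`; this post doesn't show the exact request, so treat the URL as one plausible way to get it. The helper names and the two trimmed dictionaries are hand-written stand-ins, not real responses.

```python
from urllib.parse import urlencode


def artist_lookup_url(mbid):
    """URL for a MusicBrainz artist lookup that also lists release groups."""
    query = urlencode({'inc': 'release-groups', 'fmt': 'json'})
    return 'https://musicbrainz.org/ws/2/artist/' + mbid + '?' + query


def is_group(artist):
    """True when a decoded artist response is typed as a band."""
    # "type" can be missing or null, so fall back to an empty string.
    return (artist.get('type') or '').lower() == 'group'


# Trimmed, hand-written stand-ins for the two decoded responses.
band = {'name': 'Alice Cooper', 'type': 'Group'}
singer = {'name': 'Alice Cooper', 'type': 'Person'}

for entry in (band, singer):
    kind = 'band' if is_group(entry) else 'not a band'
    print(entry['name'], '->', kind)
```

Two artists with the same name can then be told apart without hard-coding which MBID is which.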
I have to note that the Alice Cooper discography Wikipedia page combines solo artist and group lists for studio albums, live albums, and compilations but separates group and solo artist singles. Alice Cooper is just a mess. 😉
Important part: Each MBID has 25 albums for a total of 50. Yes, I read all of that and know each MBID has a different list of albums.
TEST #1 – Part 2
Artist Info from LastFM using the person vs band MBIDs
The LastFM response using the person MBID gets me artist info for the same MBID.
The LastFM response using the group MBID redirects me to the artist info for the person MBID and the data is the same — note the listeners and playcount values.
Moral: I only need to use one MBID for getting info from LastFM — the person MBID.
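One way to spot that redirect in a script, sketched under the assumption that the decoded LastFM artist info carries the resolved MBID at `response['artist']['mbid']`. The function names, MBIDs, and numbers below are invented placeholders.

```python
def resolved_mbid(response):
    """MBID that LastFM actually answered with (may differ from the one requested)."""
    return response['artist'].get('mbid', '')


def was_redirected(requested_mbid, response):
    """True when LastFM answered for a different MBID than the one requested."""
    return resolved_mbid(response) != requested_mbid


# Invented placeholder values, not real LastFM data.
PERSON_MBID = 'person-mbid-placeholder'
GROUP_MBID = 'group-mbid-placeholder'
response = {
    'artist': {
        'name': 'Alice Cooper',
        'mbid': PERSON_MBID,
        'stats': {'listeners': '1000', 'playcount': '5000'},
    }
}

# Asking with the group MBID comes back as the person entry.
print(was_redirected(GROUP_MBID, response))
print(was_redirected(PERSON_MBID, response))
```

Any MBID that gets redirected can be dropped from the fetch list, since its stats are already covered by the entry it redirects to.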
Let’s try Joan Jett.
I’m amused they have hometowns (“begin_area”) for both Alice (Detroit) and Joan (Wynnewood) but for the Alice Cooper group (Phoenix) and JJ & the Blackhearts (Los Angeles) they have the city in which the band was formed.
Joan Jett & Company isn’t quite as simple with LastFM.
Joan Jett the person is, apparently, similar to neither the artist who inspired everything about her (Suzi Quatro) nor her contemporary, Pat Benatar. Okay.
JJ & the BHs are, unsurprisingly, similar to her old band The Runaways and, well, Lita Ford (lead guitarist for The Runaways).
Neither “solo” nor with the BHs (and those should be, in truth, the same MBID) is she similar to Evil Stig (a band for which JJ sang 100% of the lead vocals, not just some as she did with the Runaways).
Most importantly and irritatingly, the MBIDs have different values for listeners and playcount. I can still simply add them but this requires me to fetch each MBID in LastFM separately (for the time being, while I’m using my Python script to put the responses into JSON files).
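A small sketch of the "simply add them" step, assuming each saved response keeps its counts as strings under `response['artist']['stats']`. The function name and the numbers are made up for illustration.

```python
def combined_stats(responses):
    """Sum listeners and playcount across several decoded LastFM responses."""
    totals = {'listeners': 0, 'playcount': 0}
    for response in responses:
        stats = response['artist']['stats']
        for key in totals:
            # Assumption: the counts are strings in the JSON, so convert first.
            totals[key] += int(stats[key])
    return totals


# Made-up numbers standing in for the two Joan Jett responses.
solo = {'artist': {'name': 'Joan Jett', 'stats': {'listeners': '100', 'playcount': '1000'}}}
band = {'artist': {'name': 'Joan Jett and the Blackhearts',
                   'stats': {'listeners': '40', 'playcount': '250'}}}

print(combined_stats([solo, band]))
```

Playcounts add cleanly. Summed listeners can count the same person twice if they listen to both entries, so that total is best read as an upper bound.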
Update: I’ve made a little lookup-ish table.
I know I don’t need the name columns, but it helps me. This is for artists that are identical in the real world to make sure data from LastFM goes where it should — Alice Cooper has all Alice Cooper stats, albums, and tracks and Joan Jett gets all her stuff.
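To make the idea concrete, here is a sketch of such a table using Python's built-in `sqlite3` as a stand-in for MySQL. The table name, column names, and MBID strings are all hypothetical, and picking the person entry as the canonical one is an assumption.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE same_artist ('
    ' alias_mbid TEXT PRIMARY KEY,'
    ' alias_name TEXT,'
    ' canonical_mbid TEXT NOT NULL,'
    ' canonical_name TEXT)'
)
# Placeholder MBIDs; the name columns are only there for human readers.
conn.executemany(
    'INSERT INTO same_artist VALUES (?, ?, ?, ?)',
    [
        ('mbid-ac-group', 'Alice Cooper', 'mbid-ac-person', 'Alice Cooper'),
        ('mbid-jj-blackhearts', 'Joan Jett and the Blackhearts',
         'mbid-jj-person', 'Joan Jett'),
    ],
)


def canonical_mbid(conn, mbid):
    """Return the MBID that data for this artist should be filed under."""
    row = conn.execute(
        'SELECT canonical_mbid FROM same_artist WHERE alias_mbid = ?', (mbid,)
    ).fetchone()
    # MBIDs that are not listed as an alias are already canonical.
    return row[0] if row else mbid


print(canonical_mbid(conn, 'mbid-jj-blackhearts'))
print(canonical_mbid(conn, 'mbid-some-other-artist'))
```

Every insert of stats, albums, or tracks would pass its MBID through `canonical_mbid` first, so both entries end up under one artist.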
Speaking of Joan Jett, I am about to–any second now–make a lookup-ish table for related artists so I don’t need my “manual” group charts below.
On a related note, I’ll make a “related albums” table for artists who were only on some of a band’s albums — for example, any Dio-related query knows to get only those Rainbow and/or Black Sabbath albums for which he was the lead singer.
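A minimal sketch of how a “related albums” lookup might behave, with a plain dictionary standing in for the eventual database table. The structure and the names `RELATED_ALBUMS` and `albums_for` are hypothetical; the album titles are the Rainbow and Black Sabbath studio albums Dio sang on.

```python
# (person, band) -> albums by that band on which the person appears.
RELATED_ALBUMS = {
    ('Ronnie James Dio', 'Rainbow'): {
        "Ritchie Blackmore's Rainbow", 'Rising', "Long Live Rock 'n' Roll",
    },
    ('Ronnie James Dio', 'Black Sabbath'): {
        'Heaven and Hell', 'Mob Rules', 'Dehumanizer',
    },
}


def albums_for(person, band, band_albums):
    """Keep only the band's albums that the person is recorded as being on."""
    allowed = RELATED_ALBUMS.get((person, band))
    if allowed is None:
        # Assumption: no entry means no restriction was recorded for this pair.
        return list(band_albums)
    return [album for album in band_albums if album in allowed]


sabbath = ['Paranoid', 'Heaven and Hell', 'Mob Rules']
print(albums_for('Ronnie James Dio', 'Black Sabbath', sabbath))
```

Treating a missing pair as "no restriction" is a design choice. Returning an empty list instead would be the stricter option.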
An issue that will still remain is albums for which an artist is only on one or some of the songs:
- My albums table won’t accept an album unless its artistID exists in the artists table, which means neither The Rocky Horror Show nor The Rocky Horror Picture Show gets inserted: they’re cast albums with various singers, so Meat Loaf’s artistID won’t match the artistID for those albums, and “Various” or whatever isn’t in the artists table.
- Ted Nugent’s Free for All album doesn’t display for Meat Loaf queries despite Loaf singing some of the songs, because he isn’t the “artist” for the album. Hmm … I should look and see if he shows up in any of the properties for that album in any of my data sources … my goodness but this app is becoming wicked cool wicked fast!
*Once I increase my MongoDB and CouchDB knowledge, perhaps it won’t be such a time-consuming task.