I find it fascinating when numbers seem to say one thing but raise a question that only further investigation can answer. Often, statistics are delivered in the simplest way, out of context, and framed to support what the presenter wants to say, not what the data says.
For example, in the last election cycle, a list of questions was being passed from one group of pollsters to another. I know this because the first time I heard the questions, I complained that they were poorly written and that the resulting data would be meaningless. Several weeks later, another group called me, and when I told the young lady I’d like to restate what I’d told the earlier pollster, she said it was impossible I’d heard the questions before because they’d just started calling people that day. I have to paraphrase from memory and, because I’m naturally honest, I won’t be able to make it sound as horrible as it actually was … it was something like, “Do you think Obamacare should be repealed?” or “What are your feelings about Obamacare?” … both with equally bad Likert-scale answers. I told the pollsters it was obvious their poll was written so that no matter who they called, their “data” would say most people hated the Affordable Care Act (also notice they called it “Obamacare”). The truth is nobody likes it, but there’s no way whoever presented the data was going to mention how many people thought it was “not enough” instead of “too much,” and the pollsters didn’t bother asking.
Anyway … on the other hand, sometimes data is good but could be great in the hands of a super genius data scientist like myself.
I collect data from the last.fm API — Listeners and Playcounts for artists, albums, and tracks. While the data is yummy straight from the oven, there’s a lot of mixing and kneading I can do before baking it that makes it yummier.
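For anyone who wants to try this at home, here is a minimal sketch of pulling Listeners and Playcount for one artist. It is not my actual collection code. It assumes the `artist.getinfo` method returns JSON with the counts as strings under `artist.stats`, which is my understanding of the last.fm response, and the function names and the sample payload are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed API root for last.fm web services.
API_ROOT = "https://ws.audioscrobbler.com/2.0/"


def build_artist_url(artist, api_key):
    """Build the request URL for one artist's info (hypothetical helper)."""
    params = {
        "method": "artist.getinfo",
        "artist": artist,
        "api_key": api_key,
        "format": "json",
    }
    return API_ROOT + "?" + urlencode(params)


def parse_artist_stats(payload):
    """Pull the two numbers we care about out of a decoded JSON response.

    Assumes the counts arrive as strings under payload["artist"]["stats"].
    """
    artist = payload["artist"]
    stats = artist["stats"]
    return {
        "name": artist["name"],
        "listeners": int(stats["listeners"]),
        "playcount": int(stats["playcount"]),
    }


def fetch_artist_stats(artist, api_key):
    """Call the API and return parsed stats (needs a real API key)."""
    with urlopen(build_artist_url(artist, api_key)) as response:
        return parse_artist_stats(json.load(response))


# Placeholder payload with invented numbers, shaped like the assumed response.
SAMPLE_PAYLOAD = {
    "artist": {
        "name": "Some Artist",
        "stats": {"listeners": "1000", "playcount": "25000"},
    }
}
```

Keeping the parsing separate from the network call means the parsing can be checked against a saved response without hitting the API every time.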
Figure 01 shows Radiohead has over 7 million more listeners than the artist in second place, Queen. The gap between second and third place (The Cure) is even bigger, at approximately 11 million. First of all, I can’t believe Radiohead is so popular. Second of all, when I first saw their number of listeners, I immediately asked myself whether that translated into an equivalent landslide of plays.
Maybe it’s cool to like Radiohead so they have a lot of listeners, but nobody actually listens to their music. You know, the guy with vinyl editions of their albums on his wall but if he heard a Radiohead song on the, you know, radio, he wouldn’t recognize it because he’s never really listened to them.
Sorted by playcount instead of listeners, the top six and the bottom six stayed exactly the same, as did Meat Loaf and the Amboy Dukes. The rest didn’t change much, but the big winner in the set of artists I track was Saxon, gaining 4 slots. (Because these artists are merely the union of bands I’m curious about and the Rock and Roll Hall of Fame Class of 2019, that still doesn’t tell you much beyond “Huh, that’s interesting,” if even that.)
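Counting slots gained and lost by hand gets old fast, so here is a rough sketch of how the movement between the two rankings could be computed. The artist names and numbers are placeholders, not my last.fm data, and `rank_changes` is a name I made up for this example.

```python
# Placeholder stats: invented numbers for three made-up artists.
SAMPLE_STATS = {
    "Artist A": {"listeners": 100, "playcount": 500},
    "Artist B": {"listeners": 90, "playcount": 2000},
    "Artist C": {"listeners": 80, "playcount": 100},
}


def rank_changes(stats):
    """Return slots gained (positive) or lost (negative) per artist
    when the ranking switches from listeners to playcount."""
    by_listeners = sorted(stats, key=lambda a: stats[a]["listeners"], reverse=True)
    by_playcount = sorted(stats, key=lambda a: stats[a]["playcount"], reverse=True)
    playcount_rank = {name: i for i, name in enumerate(by_playcount)}
    # A smaller index is a better rank, so old index minus new index
    # is positive when the artist moved up.
    return {
        name: old_rank - playcount_rank[name]
        for old_rank, name in enumerate(by_listeners)
    }


changes = rank_changes(SAMPLE_STATS)
```

With the placeholder numbers, Artist B climbs one slot past Artist A and Artist C stays put.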
But wait, there’s more.
Playcount might be equally meaningless if (and this is the outcome I was hoping for) the listeners and playcounts were really close, meaning a half-billion people tried the casserole but nobody went back for seconds.
I can already tell that Radiohead fans listen to a motherlode of Radiohead music because, as it turns out, their playcount is almost a half-billion. So I wrote a little Python script to compute the ratio of plays to listeners for each artist, which I could then use for sorting.
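The script itself is nothing fancy. Something along these lines would do it, as a sketch rather than the exact code I ran. The numbers are placeholders, and I’m assuming the ratio of interest is playcount divided by listeners, in other words average plays per listener.

```python
# Placeholder stats: invented numbers for three made-up artists.
SAMPLE_STATS = {
    "Artist A": {"listeners": 100, "playcount": 500},
    "Artist B": {"listeners": 90, "playcount": 2000},
    "Artist C": {"listeners": 80, "playcount": 100},
}


def plays_per_listener(stats):
    """Return (artist, ratio) pairs sorted from highest ratio to lowest."""
    rows = []
    for name, counts in stats.items():
        if counts["listeners"] == 0:
            continue  # skip artists with no listeners to avoid dividing by zero
        rows.append((name, counts["playcount"] / counts["listeners"]))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


ranked = plays_per_listener(SAMPLE_STATS)
for name, ratio in ranked:
    print(f"{name}: {ratio:.2f} plays per listener")
```

A ratio near 1 would be the casserole scenario: lots of people tried it once and never came back.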
The top five still hold their places. Ozzy drops two places but that’s nowhere near as interesting as Saxon moving up 12 slots, Anvil … freaking Anvil … moving up 11 … Journey losing ten places and … and … Ronnie & the Prophets gaining nine because who can get enough of Dio singing doo-wop?
Let’s talk about Evil Stig (who gained 11 slots) because I have some algorithms-in-progress to take care of this artist, and others like it.
Once the app is all done, it’ll auto-magically group artists like Dio, Ronnie & the Prophets, Ronnie & the Redcaps, Heaven & Hell, Elf, and the Electric Elves because Ronnie James Dio is in all of them (there are also some songs credited to Ronnie James Dio himself). Why not also add Black Sabbath and Rainbow, you ask? Because he was only in those bands for some of their albums. The algorithm part of the solution is relatively easy compared to the tedious, manual table-building and table-populating part. Ugh.
RJD (blue overlay in Figure 04) would move up at least a couple of slots with his 6-8 bands, and Joan Jett would move up at least six with her three.
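To give a feel for the grouping idea, here is a minimal sketch, assuming the hand-built table boils down to a mapping from each band name to the person it should roll up under. This is not the finished algorithm. The `ARTIST_GROUPS` table and `merge_groups` function are hypothetical, and the numbers are invented placeholders.

```python
from collections import defaultdict

# Hypothetical hand-populated table: band name -> name to roll up under.
# Black Sabbath and Rainbow are left out on purpose, as explained above.
ARTIST_GROUPS = {
    "Dio": "Ronnie James Dio",
    "Ronnie & the Prophets": "Ronnie James Dio",
    "Ronnie & the Redcaps": "Ronnie James Dio",
    "Heaven & Hell": "Ronnie James Dio",
    "Elf": "Ronnie James Dio",
    "The Electric Elves": "Ronnie James Dio",
}

# Placeholder stats with invented numbers.
SAMPLE_STATS = {
    "Dio": {"listeners": 10, "playcount": 100},
    "Elf": {"listeners": 5, "playcount": 20},
    "Black Sabbath": {"listeners": 50, "playcount": 500},
}


def merge_groups(stats, groups):
    """Sum listeners and playcounts for artists that share a group."""
    merged = defaultdict(lambda: {"listeners": 0, "playcount": 0})
    for name, counts in stats.items():
        key = groups.get(name, name)  # ungrouped artists keep their own name
        merged[key]["listeners"] += counts["listeners"]
        merged[key]["playcount"] += counts["playcount"]
    return dict(merged)


merged_stats = merge_groups(SAMPLE_STATS, ARTIST_GROUPS)
```

One caveat with this sketch: playcounts add up cleanly, but summed listeners will double-count anyone who listens to more than one of the grouped bands, so the merged listener number is an upper bound at best.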