Ted Nugent Is A Sweetheart

I know because I’ve met him. He was kind, cool, gentle, generous and awesome. I’ve known and am related to others who have met him and they feel the same. Maybe it’s because we’re all from Michigan. People are just nicer and more laid back there.

Also, as I’ve been reminded the last couple days while listening to some records I haven’t played in a long, long time, he rocks like nobody’s business when he’s on his game.

In honor of his upcoming August 20 show here in the Tampa Bay area, here are some stats from my little web app. I have one set of PHP scripts that daily and weekly grab data from the Spotify Web API. I track popularity scores for artists, albums, and tracks as well as followers for artists. Followers for artists are 99% boring because they just continue upward, so that line chart on the left side of Figure 01 represents what most of them look like. Boring. I have another set of scripts written in Python that grab data from Last.fm using the MBIDs from MusicBrainz.org.

With Last.fm, I can only grab one thing at a time, which sucks hardcore. With Spotify, sometimes you’re limited to 20 or 50 things at a time but at least you can get 20-50 things at a time! The Python scripts take for-ev-er. All of that data then goes into various tables in a MySQL database.
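The difference between the two is mostly a matter of batching. Here is a minimal sketch in Python, not the actual PopRock code: `chunked` and the sample IDs are hypothetical, and the batch size of 50 is simply the upper limit mentioned above.

```python
from typing import Iterator, List, Sequence


def chunked(ids: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


# Hypothetical IDs standing in for real Spotify artist IDs.
artist_ids = [f"artist{n}" for n in range(326)]

# One request per batch (Spotify-style) versus one request per item (Last.fm-style).
batched_requests = sum(1 for _ in chunked(artist_ids, 50))
single_requests = len(artist_ids)
```

For 326 artists that works out to 7 batched requests versus 326 single ones, which goes a long way toward explaining why the one-at-a-time scripts feel so slow.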

The app’s … can I call it an app if it’s a web app? Or is it just a regular old site albeit a data-driven site? I call it PopRock because it started tracking only popularity on Spotify. The home page is a list of the artists I track — a total of 326 as of this moment.

Many were chosen because they were either nominated for induction to the Rock and Roll Hall of Fame, possibly inducted to the Rock and Roll Hall of Fame, or there are countless articles about the fact that they are continually snubbed or overlooked by the Rock Hall. I was curious about whether their popularity was affected by such announcements.

Figure 00: PopRock’s home page with artists sorted by Last.fm playcount. The default order is A-Z but users can sort by Popularity, Followers, Listeners, and Playcount. You’d be surprised how different the table appears when you change it.

Others were chosen because I’m interested in them, there is or was a movie coming out about them, or they might die soon and I am so pissed that I didn’t have David Bowie or Prince in when they died. I can’t believe I didn’t already have them in given how much I love each of them–especially Bowie.

All of the artists in Figure 00 are in Spotify and some also get data from Last.fm. I really mean only some — it’s just pure luck or coincidence that sorting the table by playcount to see which artists are “near” Ted Nugent brought nothing but artists for which I get data from both places. Originally, I was just going to screen grab five above and five below Nugent but I stretched it since going another 1 or 2 brought in both Joan Jett and her one-time bandmates, Evil Stig.

There is a growing list of artists with data only from Last.fm and I am in the process of making a page just for them. Integrating both data sets into every “feature” has proven not only sometimes difficult but it tends to break stuff for days — the damage sometimes compounded because I may have missed it and kept coding before I noticed it.

Writing the AJAX for the sorting was challenging enough without … changing what data is displayed and then changing my mind again.

Figure 01: The main or home page for this artist. A work in progress.

Here are some closeups. Neither of the Spotify-based line graphs gives any real, useful information, but they were great for learning JavaScript, jQuery, and D3 as well as PHP and Python. There’s still a lot of work to do as I want to make almost everything on the site interactive, which will increase its usefulness a ton as well as the “fun factor.”

Figure 02: I really don’t like those default “pill boxes” in Bootstrap.

Artist and album cover art images come mostly from Spotify. The few that are Last.fm-only come from some art archive something or other that rarely works, so if I have an album from MB/LastFM that isn’t also in Spotify, I just hunt it down with a Google Image search because life is too short.

Nugent began with The Amboy Dukes, a band that was equally cool yet sounded nothing like his eventual solo work. Here are their stats.

Figure 03: And I need to do something about that dynamic title, don’t I?

Normally, the Spotify Followers graphs all look like Nugent’s below, no matter what the range is in the Y axis so I was surprised to see that jump for the Amboy Dukes in mid-April. Also, odd jumps like that usually happen Spotify-wide so I find this one particularly interesting because it didn’t appear anywhere else. What could possibly influence the Amboy Dukes?

Figure 04: Spotify Followers for Ted Nugent (left) and the Amboy Dukes (right) made with D3.js

I still haven’t quite decided on a solution for messes like this next one.

Figure 05: Spotify popularity for many of Nugent’s albums. Also made with D3.js

My greed for data is never satisfied so I grab data for every edition of every album. So, even for artists with relatively few albums, it gets far too wide, especially if I ever decide I want to make this mobile-friendly. If you look closely, you can see the last album is cut off, and I have the SVG width at something ridiculous like 2400px. I am considering the following (no pun intended):

  • Putting the SVG in a “scrollable” DIV
  • Making the chart vertical
  • Just purging many of the “duplicate” albums from my database
  • Hmm … just thought of this one … starting with a few and making the rest optional. The user can drag thumbnails from outside the graph and then it’s their problem if it gets too wide.

Gosh. Darn. His music is so awesome. “Workin’ Hard, Playin’ Hard” is on now. I’m listening to my Wholesome, Calming Ted Nugent Mix playlist on Spotify.

Wholesome Calming Ted Nugent Mix playlist on Spotify
Figure 06: Click here or search for this playlist on Spotify

Speaking of songs, here are his most played albums according to Last.fm and his most “popular” albums according to Spotify.

Figure 07: Even little things like comparing these two charts are what I love about data.

I love finding out what’s the same and what’s different … whether it’s people, cultures, music charts, whatever. I wish the above two sources had an Insights blog like PornHub. Sex is great and all, but PornHub’s data blog — that’s what really turns me on.

Ted’s charts don’t contain any surprises for the most part. Of course the eponymous debut is #1 on both because it has Stranglehold, followed closely by Cat Scratch Fever (because it’s a masterpiece) then Free-for-All because, I mean, damn … he got a local (then) singer named Meat Loaf for, like, $2 when he was struggling at Motown Records. You’ve got “greatest hits” compilations through both. Hunt Music and Spirit of the Wild stay up there for one reason and one reason alone: the ethereal, magical, amazing, spiritual, ass-kicking, galaxy-rocking song that is “Fred Bear”. I doubt I am alone in being only able to name that one song from either album. Intensities In 10 Cities is a good album and all but it’s probably only there because of Wayne’s World.

Ooh, hold on … “You Make Me Feel Right At Home” is on … How is it that Frank Zappa and Ted Nugent are the only two rockers who use … I don’t know if it’s a xylophone or a marimba … but it’s just perfect.

I’ve never understood the love for his debut solo album. There. I’ve said it. I’m sorry.

I want to say I’m surprised that Love Grenade and Craveman are so close to the top but I can’t speak with any authority because I’ve never bothered to listen to either of them. You know what, I’ll bet they’re high up because people listen to them on streaming … nope, nope … they’re on the LastFM chart … which means it’s more likely those albums were paid for, right?

And now, as I promised earlier … the songs … okay, now, check this out …

Figure 08: Most popular Ted Nugent songs on Spotify

Normally, on a Spotify list, there are lots of duplicates because you have true music-lovers listening to actual albums but also a lot of people listening to whatever single they’ve heard from some compilation. Figure 08, however, looks like an actual ranking while the Last.fm chart looks like what I’d expect from Spotify.

Figure 09: Most played Ted Nugent songs according to Last.fm

All of that is just … whatever … you may, like me, be more curious about the Amboy Dukes. I won’t bother showing their album rankings because a) they didn’t have many and b) you know full well what #1 is as well as #2 and why. In that spirit, I’ve taken the liberty of crossing out the obvious tracks so we can look at the other, more exciting players on the chart.

Figure 10: Most popular Amboy Dukes tracks on Spotify

How is “Missionary Mary” so low? And where the heck is “Saint Philip’s Friend”? Where, I ask! It turns up here, as I’d expect (see above) on the Last.fm list.

Figure 11: Most played Amboy Dukes songs according to Last.fm

If you ever get the chance to talk to Mr. Nugent, conversation with him is more likely to sound like “Why a Carrot is More Orange Than an Orange” than “Wang Dang Sweet Poontang.” Seriously. He’s great.

Ted Nugent was my second concert. Bon Jovi was supposed to open for him but didn’t show.

Figure 12: Me wearing my Ted Nugent concert t-shirt from the Penetrator tour. I find it interesting that the album cover didn’t use the traditional logo but the shirt did. Wait a second … the album used the same logo as the Free-for-All album — I never noticed!

Until recently (then recently, not now recently) my hair was down to my shoulders. The burnout princess I went to the concert with was less than thrilled about my haircut. Truth be told, so was I. I took the Peter Criss solo album to the barber and said I wanted my hair to look like that. It didn’t.

Here is a playlist of his setlist for that night:

Spotify playlist based on Ted Nugent Setlist from April 27, 1984 in Detroit
Figure 13: Click here or search for this playlist on Spotify


I saw him again for the 1990/1991 Whiplash Bash. The Damn Yankees were in high rotation so it was a Ted Nugent/Damn Yankees concert which is the closest I’ve ever been to a Styx concert.

One of the few memories I have of my father is him buying me Nugent by Ted Nugent at … Kmart or something … and listening to it in his car. It was one of those depressing … visits at one of his depressing apartments after my parents divorced.

I think I’ll go see him in August. That, I think, would rock.


Unexpected Data Cleaning

I’ve had to get creative with writing SQL queries to start the process of merging my Spotify data and my LastFM data. At first I thought it was going rather well if not very fast. I was definitely exercising my brain and strengthening my SQL and algorithm skills.

I was quite pleased and relieved upon realizing all the releases in a LastFM release-group–as well as all common recordings across them–shared the same Listeners and Playcount numbers.

What does all that gibberish mean? Check out my first post about Using the Last.fm and MusicBrainz APIs.

Think of an album. Any album. I’ll think of 13 by Black Sabbath because that will come up here in a minute. That album is released in multiple countries, and even in a single country there are multiple versions of an album: different covers, deluxe versions with bonus tracks, the remastered version after a few years, and so on. Each unique version is a “release” in the 13 “release-group.” What you and I would call “tracks,” MusicBrainz calls “Recordings.”

Table listing data from Last FM about the Black Sabbath album thirteen.
Figure 01: Data porn in the form of the “13” Release Group.

Anyway, all nineteen versions of 13 (Figure 01) are really all the same album with mostly all the same songs so, thankfully, LastFM gives them the same standard MBID and numbers. I’m relieved because they could have been pricks (as they’re known to be) and said more people listen to the version of “God Is Dead?” on vinyl than on the digital copy from iTunes but–again, thank God–they don’t. See what I did there? With the “thank God”? I am so clever.

Just in case you’re as aroused as I am looking at all the yummy goodness from MusicBrainz in Figure 01, here’s another screenshot.

Figure 02: Just look at it! My imagination is running wild!

At first, I thought it would be messy kinda like Spotify, so I had Python grab every MB release for which LastFM had data and store it all in a JSON file (so I could review it and plan for my other data science and data visualization needs & wants).

Figure 03: A “valid” release, for me, is one for which LastFM has data. There’s no reason to keep that data and waste time checking all of them every time.
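For readers who can’t see the screenshot, the filtering idea can be sketched roughly like this. It is a minimal illustration and not the code from Figure 03: `keep_valid_releases`, the `has_lastfm_data` callable, and the shape of the release dictionaries are all assumptions, and the sample MBIDs are made up.

```python
import json
from typing import Callable, Dict, Iterable, List


def keep_valid_releases(
    releases: Iterable[Dict[str, str]],
    has_lastfm_data: Callable[[str], bool],
) -> List[Dict[str, str]]:
    """Return only the releases whose MBID has data at Last.fm."""
    return [r for r in releases if has_lastfm_data(r["id"])]


# Made-up sample data; real MBIDs are UUIDs.
releases = [
    {"id": "mbid-1", "title": "13"},
    {"id": "mbid-2", "title": "13"},
    {"id": "mbid-3", "title": "13"},
]
known_to_lastfm = {"mbid-1", "mbid-3"}  # stand-in for real API responses

valid = keep_valid_releases(releases, lambda mbid: mbid in known_to_lastfm)

# Serialize the survivors so they can be reviewed (or written to a file) later.
as_json = json.dumps(valid, indent=2)
```

Passing the lookup in as a callable keeps the sketch runnable without a network connection; in practice it would wrap whatever Last.fm request the script already makes.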

I wanted to merge or at least somehow “link” the Popularity and Followers data from Spotify with the Listeners and Playcount data from LastFM for something resembling easy access. After deciding I only needed one set of stats per album and/or song from each release group, I told PHP to just grab the first release from each release-group for my database. 

Figure 04: That argument in the if statement came after a lot of time and frustration caused by the fact that I didn’t know some release-groups didn’t have any releases. I know.
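The guard described in that caption can be sketched as follows. This is a Python illustration of the idea rather than the PHP shown in Figure 04; `first_release` and the `"releases"` key are assumed names, and the sample release-groups are made up.

```python
from typing import Dict, List, Optional


def first_release(release_group: Dict) -> Optional[Dict]:
    """Return the first release in a release-group, or None if it has none."""
    releases = release_group.get("releases") or []
    if not releases:  # some release-groups have no releases at all
        return None
    return releases[0]


# Made-up release-groups; the "releases" key is an assumed shape.
groups: List[Dict] = [
    {"title": "13", "releases": [{"id": "mbid-1"}, {"id": "mbid-2"}]},
    {"title": "Empty Group", "releases": []},
    {"title": "Missing Key"},
]

# Keep one release per group and silently skip the empty ones.
picked = [r for g in groups if (r := first_release(g)) is not None]
```

Without the emptiness check, indexing `releases[0]` on an empty group would raise an error and stop the whole run, which matches the kind of frustration the caption describes.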

I added columns to my current albums table and created an albumsMB table with plans to merge them. To make that easier, I chose to make some temporary, redundant columns rather than convoluted JOINs and sub-queries. Then I made copies of those tables and ran some tests on those temporary copies. I have far too much data I love and am attached to — I am not going to risk losing it no matter how trivial the task is on which I’m working.

I played with two tables. The table of album info from MusicBrainz (below) and my existing table of album info from Spotify.

Figure 05: Wherever there’s a Black Sabbath MBID in my albumsMB (album info from MusicBrainz) table, I added Black Sabbath’s Spotify ID. Easy.

I wondered–a lot–whether I really needed more columns from the data like “country”, “disambiguation,” etc. but all my SQL test queries worked so well so easily, I eagerly, perhaps hastily went in the “opposite” direction. I thought it was enough to tell PHP that wherever the album title and Spotify artist ID from the two tables matched, add the album’s MBID to my existing albums table.
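In rough terms, that matching rule looks like the sketch below. It is written in Python for illustration (the real work was done with PHP and SQL), and `attach_mbids`, the column names, and the sample rows are assumptions rather than the actual schema.

```python
from typing import Dict, List, Tuple


def attach_mbids(spotify_albums: List[Dict], mb_albums: List[Dict]) -> int:
    """Copy each MBID onto the Spotify row with the same title and artist ID.

    Returns the number of Spotify rows that received an MBID.
    """
    mbid_by_key: Dict[Tuple[str, str], str] = {
        (a["title"], a["spotify_artist_id"]): a["mbid"] for a in mb_albums
    }
    matched = 0
    for album in spotify_albums:
        key = (album["title"], album["spotify_artist_id"])
        if key in mbid_by_key:
            album["mbid"] = mbid_by_key[key]
            matched += 1
    return matched


# Made-up rows; the column names are assumptions, not the real schema.
mb_albums = [
    {"title": "Paranoid", "spotify_artist_id": "sabbath", "mbid": "mbid-a"},
    {"title": "Headless Cross", "spotify_artist_id": "sabbath", "mbid": "mbid-b"},
]
spotify_albums = [
    {"title": "Paranoid", "spotify_artist_id": "sabbath"},
    {"title": "Reunion", "spotify_artist_id": "sabbath"},
]

matched = attach_mbids(spotify_albums, mb_albums)
```

An exact match on the pair is simple and fast, but it is also strict: any album that exists in only one table, or whose title differs by even a character, is left without an MBID.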

I was SO excited. I was finally going to have complete charts–with quantifiable data–for Black Sabbath that included all the Tony Martin era albums Spotify lacks. I was totally going to take screenshots and send them to Tony and he was going to be so grateful and we’d be best buds and I could move on to actual new features and stuff since my data was massaged and merged and yay!

I switched browser tabs to bask in my victory.

Figure 06: This page has things I’ll fix and temporary stuff I’ll remove once everything works.

I didn’t scroll down so I didn’t see this (Figure 06) whole thing. I didn’t notice there were actually three albums with LastFM data. I thought I’d not yet added code to the query that should populate the Listeners and Playcount columns but when I checked, I saw it was there and should be working. I don’t know which I saw first — those three rows with LastFM data or … this …

Figure 07: The albums table of Spotify album info. Only three rows received a MBID.

I was more confused than frustrated already by all the missing data in the albumMBID column when I noticed the two tables also contained very different albums.

Later: I wonder how the releases are ordered … but I can’t control the order they come in … regardless of how they are in their home database … I may have to use some surgical precision … with certain properties.

I should have anticipated that Tony Martin’s album MBIDs wouldn’t have a place to go but I also never noticed most of those Spotify albums include “(Remastered Edition)” in the title which meant most of the titles didn’t match. I wasn’t happy about that … that potentially meant some exhausting work with RegEx. I thought I might get off easy if I could change some existing columns to use FullText but I immediately thought of potential problems.
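One possible shape for that RegEx work is to normalize both titles before comparing them. The following is only a sketch under stated assumptions: `normalize_title` is a hypothetical helper, and the keyword list (remaster, deluxe, edition, version) is a guess that would need tuning against real titles.

```python
import re

# Trailing parenthetical such as "(Remastered Edition)" or "(Deluxe Edition)".
# The list of keywords is an assumption and would need tuning against real titles.
_EDITION_SUFFIX = re.compile(
    r"\s*\((?:[^()]*\b(?:remaster(?:ed)?|deluxe|edition|version)\b[^()]*)\)\s*$",
    re.IGNORECASE,
)


def normalize_title(title: str) -> str:
    """Strip a trailing edition-style parenthetical and normalize case/spacing."""
    stripped = _EDITION_SUFFIX.sub("", title)
    return " ".join(stripped.split()).casefold()


# Titles that differ only by an edition suffix now compare as equal.
same = normalize_title("Paranoid (Remastered Edition)") == normalize_title("Paranoid")
```

Parentheticals without one of the keywords are left alone, so a title ending in something like “(featuring Tony Iommi)” keeps its suffix. The normalized value is best stored in its own column so the original titles stay untouched.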

I half-heartedly started copying MBIDs and pasting them into the empty fields but then deleted them as I thought of … not only did I not want to do this manually for all my current (and future!) MB albums but what if I added columns later or found a more … accurate-ish, clean-ish solution. Which is what I started doing earlier and what I’m going to do now. Add more columns for the properties I mentioned earlier. One of the JSON files to which I referred when planning all this had no values for all the keys I thought of using. It looks like the most recent Black Sabbath data (from yesterday, FWIW) has values for most of them. I’ll need to add equivalent columns to the Spotify albums table, too.

In the back of my mind, I kept wondering why my Spotify albums list was so short. I keep adding and removing values from the “type” and “group” key in the PHP file that fetches Spotify album data. I’ll have to do that again so I can have more albums from each source so I can, hopefully, have more matches.

As I wrote this post and went to MusicBrainz to take screenshots, I saw this …

Figure 08: So THERE are the albums that Spotify had but my MB/LFM data was missing!

Later, I also noticed the “Type: Album” bit in Figure 02.

I now know I have to request “Album + Compilation” and “Album + Live” to get, for example, Reunion. But … why don’t I have Vol 4 from MusicBrainz? It can’t be possible there were no releases from the Vol 4 release-group with data at LastFM … right?

Well, time to get knee and elbow deep with even more data about each release-group and release …

Yes, I’m aroused by that. 😉

P.S. I wonder if … and hope that … MusicBrainz and/or LastFM have the CD I once found at a truck stop … it was called something like “The Essential Black Sabbath” or “Black Sabbath’s Greatest” and it was exclusively Tony Martin tracks. I’d love to find that and someday get it autographed just because it’s so … peculiar.

Update: Finding that Tony Martin Compilation

I have no idea if it is even any of those listed in Figure 08, so …

Figure 09: I love how precise MB users are — that someone made sure to add “featuring Tony Iommi” to Seventh Star.

Step #1 Click any album in their discography (Figure 09) between 87-95 except Dehumanizer. Poor Tony, man.

Figure 10: Headless Cross release group

Step #2 Click any release (Figure 10).

Figure 11: Dang, dudes! They gotta say “ex-Black Sabbath vocalist”? I wonder if Dio suffered the same fate. 

Step #3 Click Tony Martin’s name in the credits (Figure 11).

Figure 12: Tony Martin’s discography

Crap. I was hoping it would be listed right here (Figure 12) but, I suppose this makes more sense.

Step #4 Click “Show all release groups instead” (Figure 12) or anything else I need to.

Crap #2. I’m actually going to have to go through all of those compilations …

I opened each compilation 1996 and later (that I knew wasn’t specifically Ozzy or Dio) in another tab.

*Sigh* Each tab is a release group so requires at least one more click.

I checked eight of those candidates:

  • The Sabbath Stones
    Six instrumentals, seven Tony Martin tracks, and one apiece by Dio, Ian Gillan, and Glenn Hughes. No Ozzy.
  • Greatest Hits
    Ozzy & Dio
  • Ozzy is the only vocalist featured on:
    • The Ultimate Collection
    • Forever
    • Greatest Hits
    • The Collection
    • Rock Giants
  • Rock Champions is all Tony Martin!

I checked a few albums and Dio is never referred to as “ex-vocalist”.

I feel it’s also worth mentioning The Best of Black Sabbath which is an unusually respectable–albeit unbalanced–compilation as these things go. First of all–awesome cover.

  • 28 songs by Ozzy
  • 2 by Dio
  • 1 awful song by Ian Gillan
  • 1 instrumental

Not a single Tony Martin song.


How Do Rock and Roll Hall of Fame Nominations Affect Popularity On Spotify?

Shout out to: Malcolm MacLean’s book, Data Driven Documents D3.js Tips and Tricks v4 which made these really super easy.

See, this here is the endgame — the kind of thing I learned D3 and coding so that I could create. This is what makes programming fun. The following are based on data I collect using the Spotify API.

Here are the inductees in the class of 2019 for the Rock and Roll Hall of Fame.

I was in a hurry. I’ll tweak the math for legend placement later.

Here are those who were nominated but not inducted.

I was in a hurry. I’ll put the names in a container later. Maybe.

Nominations were released October 9. You can tell I wasn’t tracking a few of these artists (John Prine, Todd Rundgren, Roxy Music, Stevie Nicks, Janet Jackson) until that date because their lines don’t start until then. There seems to be a slight gain for each (maybe) but that’s total “correlation not causation” because these are Spotify’s very relative and ever-fluctuating “popularity” scores not playcounts.

Suppose the Beatles are higher than all these artists and John Lennon says something stupid like, say, “We’re more popular than Jesus.” The Beatles’ popularity goes down and, since it’s a zero-sum game with Spotify Popularity, everyone who doesn’t put their foot in their mouth gets a bump.

Which, looking at the above graphs, raises the question, “What happened at the end of June to make everyone crash and in mid-July to cause everyone’s climb back up?”

Another, more intriguing point, is early April when the MC5, Radiohead, and the Cure all spiked up and right back down.

Spotify doesn’t answer any of my questions so we’ll never know.

Even if nominations don’t affect anything significantly, we can watch and see if the induction has any significant effect.

Florida Cannibal Corpse Man

Detroit is known for many, many, countless great things, not the least of which is giving the world great music over and over again.

You’re welcome.

Florida, on the other hand, is known for … 99.999% ridiculous crap and Tom Petty.

One such pride and joy of the Tampa area is Cannibal Corpse. In honor of guitarist Patrick O’Brien’s arrest today, here’s some data viz we’ll compare to any jumps in response to his shenanigans.

Bar chart measuring Cannibal Corpse albums popularity
Popularity of Cannibal Corpse albums per the Spotify API as of Dec 9, 2018. (Click for larger)

For those of you unfamiliar with the band, here are their most popular tracks on Spotify (as of Dec. 9, 2018). “Addicted to Vaginal Skin” is the one that’s stuck with me in the twenty years since I found one of their cassettes in the garbage at college.

Album Title | Track Title | Track Popularity
Tomb Of The Mutilated | Hammer Smashed Face | 51
Torture | Scourge of Iron | 47
Evisceration Plague | Evisceration Plague | 45
Tomb Of The Mutilated | I Cum Blood | 44
Red Before Black | Code of the Slashers | 43
A Skeletal Domain | Kill or Become | 41
The Bleeding | Stripped, Raped, And Strangled | 40
Red Before Black | Only One Will Die | 40
Red Before Black | Red Before Black | 39
Kill | Make Them Suffer | 38
Tomb Of The Mutilated | Addicted to Vaginal Skin | 35
Red Before Black | Shedding My Human Skin | 35
Red Before Black | Remaimed | 33
Evisceration Plague | Priests Of Sodom | 33
The Bleeding | Fucked With A Knife | 33

Normally, it really, really bothers me that Spotify only supplies a “popularity” score — a relative number ranking all artists against each other — instead of a more quantifiable and useful “playcount” number but, in this case, it makes for some fun comparisons.

As of yesterday, Cannibal Corpse’s popularity on Spotify is 55. That’s down from 58 one year ago which was their high since I’ve been collecting data on them. It should be noted, though, it is up from their low of 50 for the last three weeks of July 2018.

As I said, by themselves those numbers are meaningless until we put them in context. While 55 isn’t as high as Insane Clown Posse’s (a Detroit export–you’re welcome) current popularity of 58 or Slayer’s 68, it is definitely higher than … well, we can only assume they are less talented artists judging by poor popularity scores such as

  • Lindsey Buckingham (51)
  • Venom (48)
  • Gwar (47)
  • Stryper (46)
  • King’s X (39)
  • Was (not Was) (34)

For further comparison, Queen (97) recently overtook Eminem (95) as the highest popularity score in my database (I don’t collect data on everyone–just those I find interesting).

Pop quiz: Who, according to Spotify, is the most genre-hopping artist of the following?

  • Cannibal Corpse
  • David Bowie
  • Bob Dylan
  • Bruce Springsteen
  • Cyndi Lauper
  • Elton John
  • Jack White

If you guessed either Cannibal Corpse or Bob Dylan, you’d be correct! They are tied at nine genres each. Elton John only crosses over into five and David Bowie into seven. What are these diverse genres Cannibal Corpse finds themselves in?

  • death metal
  • alternative metal
  • brutal death metal
  • speed metal
  • deathgrind
  • groove metal
  • metal
  • nu metal
  • technical death metal

Yeah, I know. I learned this yesterday.

My little app with the clever name Pop Rock (get it?) was built with JavaScript, D3, PHP, MySQL, and love.

Dec 19 update: Howard Altman’s story in today’s paper shows the events were far weirder than initially known. As you can see below, after one week, CC’s popularity spiked a single point for a single day. I’ve never seen a spike and drop like that happen. My guts (see what I did there?) tell me it’s unrelated, however.


Data Science Isn’t Always Sexy and Glamorous

Sometimes data science is a bunch of debugging and fact-checking.

Bit o’ trivia about me: I got into this because I wanted to start using open data and APIs instead of constantly fact-checking frequently inaccurate data from my co-workers. Now, I find I have to fact-check my data. Not sexy. Not sexy at all.

I’ve wanted to write this post for a long time but, until now, I felt it would seem like mere whining … or conspiracy ranting (you’ll see).

Gathering complete, comprehensive election results seems an impossible task. It’s almost (get ready) as if somebody … They … don’t want us to have them (imagine my voice in any shrill tone you like).

It should be easier. Much easier.

I promise I won’t complain about every little thing because I’ve complained about some of these things before — such as

  • The FEC provides Excel files for some elections but only PDFs for others
  • The FEC API doesn’t provide access to election results at all
  • The aforementioned results aren’t available on their website for months
  • Politico has results in real time and even updates them for a couple days after the election but the page is then like an operating system update or rendering video … it stays at 99% complete forever. Even now, as of December 10, 2018, Alaska shows only 99.5% precincts reporting.

Last night, I opened VS Code to continue adding features to Election Insights (the web app formerly known as prezPlayPro) when I noticed a few things were somehow broken since my triumphant post of November 23 — not only did some results change but some maps were downright broken.

Long Story Short: I now test everything using a private or Incognito window so I can be at least a wee bit more sure I’m looking at the latest code. Results that I expected to change after a fix made 2-3 weeks ago finally showed up. So, nothing was broken or causing incorrect results but I only know the “new” results are accurate because I did some digging in several piles of data to confirm … digging which should have been easier.

I had two potential problems I needed to investigate. Two questions needed answering:

  • Did Evan McMullin really beat Darrell Castle in a buttload of states?
  • Why do I have two candidates (Darrell Castle and Emidio Soltysik) affiliated with the U.S. Taxpayers party in Michigan?

In my previous post, Final 2016 Presidential Election Maps, I realized I hadn’t included Evan McMullin in my arrays of “right-leaning” candidates. Much to my surprise, adding him to those arrays didn’t change any results (or so I thought). Last night, when I saw that including him may have drastically changed the results, I realized one of two things was true: either my code was broken or my database contained mistakes.

I chose to look into Texas because when I moved my hand, that’s where my cursor landed, showing me McMullin. My results are taken from the PDF from the FEC but, FORTUNATELY, I didn’t go directly to that PDF to confirm results. I also wanted to check party affiliations which I got from Ballotpedia (whom I’ve whined about previously for other issues even before the inaccuracy I just found). Otherwise, I wouldn’t have found some of the groovy things I did.

So, first, I went to the Ballotpedia page for Michigan’s 2016 presidential election results. Much to my relief, the mistake was theirs.

I took all my party affiliations from their Results tables which, at least in this case, differs from the list above.

When I first started this project, I tried using Python’s Beautiful Soup to grab info like that in the above screenshot from Politico because they conveniently listed every state on a single page. Unfortunately, the code is filled with inconsistencies and invisible crap neither I nor Beautiful Soup could beat into submission. Also, if memory serves, candidates’ names were spelled differently on different state ballots. <– That’s infuriating fact #4,987 on the list.

So I just did some major cutting and pasting to fifty pages I saved from Ballotpedia, which sucked in its own way because you can’t right-click on their US map to open them in separate tabs; you have to click each one and, after saving the state page, click the Back button to get back to the map.

Before I noticed the Ballotpedia candidate list contained different parties than the results table, I followed the link to their data source (Michigan’s Secretary of State or, as Ballotpedia calls it, “Department of State”) but when I clicked it, I got a 404. Several of the source links at Ballotpedia have the same result but I don’t know whether I should be frustrated with Ballotpedia for having broken links or, as I’d thought previously, frustrated with those states for not keeping their results pages up. My FEC results PDF lists parties for each candidate (but not, much to my chagrin, by state). There I found Soltysik listed as Natural Law Party (which is still kinda conservative, if my recollection is correct) and Socialist Party USA (like the Beach Boys song).

Not yet noticing the mistake in the screenshot above, I set my party affiliation problem aside for the moment and went to Ballotpedia’s Texas page so I could confirm my results (from the FEC) for Castle/McMullin.


Ballotpedia doesn’t even list McMullin as a candidate in Texas but does list 51,261 write-in votes. Ever the optimist, I clicked the link for Texas Secretary of State.

And, as it turns out, McMullin wallops Castle in Texas.

Black gold! Texas tea! Comprehensive election results, that is!

Note that most of those 51k+ write-in votes are for a single candidate. I think that’s rather significant. If I were the type to post election results, I might consider including that bit of information. Of course, Ballotpedia is probably in the pocket of the Commission On Presidential Debates (who fit nicely in the pocket of Big Insurance who are run by the Illuminati).

Now I was curious if Politico limited their results like Ballotpedia. I had to go there anyway to see what party affiliation they had for Soltysik, so … after finding Soltysik was accurately listed as NLP in Michigan, I saw Politico’s Texas results were wanting as much as Ballotpedia’s.

These are Politico’s “detailed” results.

Now I was grateful I couldn’t get Beautiful Soup working to my satisfaction with Politico. I’d have missed out on a bunch of candidates!

I still have much digging to do because far too many of my candidates have “null” for party affiliation — not to mention I now know I must fact-check whatever I find. Getting data directly from each state would be best, of course, but since Ballotpedia’s links don’t go anywhere, that won’t be as easy as I’d like.

Data Journalism Roxor My Soxor

I think I’ve decided on my niche and what RoxorSoxor is to be.

Now going through Doing Journalism with Data: First Steps, Skills and Tools and the Google News Initiative’s Fundamentals course while waiting to find out if I am among the #GoogleUdacityScholars selected for Phase 2 of the GrowWithGoogle scholarship.

Loving both.

While working on an exercise in lesson 11 of the GWG Challenge course, I learned that the Tampa Bay Times used to be the St. Petersburg Times and they are the ones who started PolitiFact!

The St. Pete Times bit is important to me because when I was a young teenager, I had the opportunity of getting to know an investigative reporter and ask him over the course of several conversations about what it would take for me to get into journalism. He recommended the St. Pete Times as a great paper to read and work for. It was led for decades by one Mr. Poynter, followed by his son who left all of his ownership stock in the paper to start the Poynter Institute.

They have a history of excellent journalism and they only got better over time. Even as they almost ceased to exist, they racked up a few Pulitzers and, while being a bit slow on the online uptake, they not only did that whole PolitiFact thing but kinda showed the world what data journalism was.

They really kick ass and … might make me feel like coming to this area is my destiny and not a big, fat mistake.

Seeds of an Algorithm

I’ve been wondering how I would ever come up with an algorithm for “combining” the different scores of the same song on different albums. I think an idea was just born out of the blue while sitting here watching West Wing on Netflix and looking at Alice Cooper’s tracks data. This seed is somewhat related to how I was looking at the Bloodgood albums and tracks earlier today. The algorithm I’m currently juggling in my head is nothing like I’d been imagining the last few months.