Sometimes data science is a bunch of debugging and fact-checking.
Bit o’ trivia about me: I got into this because I wanted to start using open data and APIs instead of constantly fact-checking frequently inaccurate data from my co-workers. Now, I find I have to fact-check my data. Not sexy. Not sexy at all.
I’ve wanted to write this post for a long time but, until now, I felt it would seem like mere whining … or conspiracy ranting (you’ll see).
Gathering complete, comprehensive election results seems an impossible task. It’s almost (get ready) as if somebody … They … don’t want us to have them (imagine my voice in any shrill tone you like).
It should be easier. Much easier.
I promise I won’t complain about every little thing because I’ve complained about some of these things before — such as
- The FEC provides Excel files for some elections but only PDFs for others
- The FEC API doesn’t provide access to election results at all
- The aforementioned results aren’t available on their website for months
- Politico has results in real time and even updates them for a couple days after the election but the page is then like an operating system update or rendering video … it stays at 99% complete forever. Even now, as of December 10, 2018, Alaska shows only 99.5% precincts reporting.
Last night, I opened VS Code to continue adding features to Election Insights (the web app formerly known as prezPlayPro) when I noticed a few things were somehow broken since my triumphant post of November 23 — not only did some results change but some maps were downright broken.
Long Story Short: I now test everything using a private or Incognito window so I can be at least a wee bit more sure I’m looking at the latest code. Results that I expected to change after a fix made 2-3 weeks ago finally showed up. So, nothing was broken or causing incorrect results but I only know the “new” results are accurate because I did some digging in several piles of data to confirm … digging which should have been easier.
I had two potential problems I needed to investigate. Two questions needed answering:
- Did Evan McMullin really beat Darrell Castle in a buttload of states?
- Why do I have two candidates (Darrell Castle and Emidio Soltysik) affiliated with the U.S. Taxpayers party in Michigan?
In my previous, Final 2016 Presidential Election Maps, post, I realized I hadn’t included Evan McMullin in my arrays of “right-leaning” candidates. Much to my surprise, adding him to those arrays didn’t change any results (or so I thought). Last night, when I saw that including him may have drastically changed the results, I realized one of two things was true — either my code was broken or my database contained mistakes.
I chose to look into Texas because when I moved my hand, that’s where my cursor landed, showing me McMullin. My results are taken from the PDF from the FEC but, FORTUNATELY, I didn’t go directly to that PDF to confirm results. I also wanted to check party affiliations which I got from Ballotpedia (whom I’ve whined about previously for other issues even before the inaccuracy I just found). Otherwise, I wouldn’t have found some of the groovy things I did.
So, first, I went to the Ballotpedia page for Michigan’s 2016 presidential election results. Much to my relief, the mistake was theirs.
When I first started this project, I tried using Python‘s Beautiful Soup to grab info like that in the above screenshot from Politico because they conveniently listed every state on a single page. Unfortunately, the code is filled with inconsistencies and invisible crap neither I nor Beautiful Soup could beat into submission. Also, if memory serves, candidates’ names were spelled differently on different state ballots. <– That’s infuriating fact #4,987 on the list.
So I just did some major cutting and pasting to fifty pages I saved from Ballotpedia which sucked in it’s own way because you can’t right-click on their US map to open them in separate tabs — you have to click each one and, after saving the state page, click the Back button to get back to the map.
Before I noticed the Ballotpedia candidate list contained different parties than the results table, I followed the link to their data source (Michigan‘s Secretary of State or, as Ballotpedia calls it, “Department of State”) but when I clicked it, got a 404. Several of the source links at Ballotpedia have the same result but I don’t know whether I should be frustrated with Ballotpedia for having broken links or, as I’d thought previously, frustrated with those states for not keeping their results pages up. My FEC results PDF lists parties for each candidate (but not, much to my chagrin, by state). There I found Soltysik listed as Natural Law Party (which is still kinda conservative, if my recollection is correct) and Socialist Party USA (like the Beach Boys song).
Not yet noticing the mistake in the screenshot above, I set my party affiliation problem aside for the moment and went to Ballotpedia’s Texas page so I could confirm my results (from the FEC) for Castle/McMullin.
Ballotpedia doesn’t even list McMullin as a candidate in Texas but does list 51,261 write-in votes. Ever the optimist, I clicked the link for Texas Secretary of State.
Black gold! Texas tea! Comprehensive election results, that is!
Note that most of those 51k+ write-in votes are for a single candidate. I think that’s rather significant. If I were the type to post election results, I might consider including that bit of information. Of course, Ballotpedia is probably in the pocket of the Commission On Presidential Debates (who fit nicely in the pocket of Big Insurance who are run by the Illuminati).
Now I was curious if Politico limited their results like Ballotpedia. I had to go there anyway to see what party affiliation they had for Soltysik anway, so … after finding Soltysik was accurately listed as NLP in Michigan, I saw Politico‘s Texas results were wanting as much as Ballotpedia’s.
Now I was grateful I couldn’t get Beautiful Soup working to my satisfaction with Politico. I’d have missed out on a bunch of candidates!
I still have much digging to do because far too many of my candidates have “null” for party affiliation — not to mention I now know I must fact-check whatever I find. Getting data directly from each state would be best, of course, but since Ballotpedia’s links don’t go anywhere, that won’t be as easy as I’d like.