Data Science Isn’t Always Sexy and Glamorous

Sometimes data science is a bunch of debugging and fact-checking.

Bit o’ trivia about me: I got into this because I wanted to start using open data and APIs instead of constantly fact-checking frequently inaccurate data from my co-workers. Now, I find I have to fact-check my data. Not sexy. Not sexy at all.

I’ve wanted to write this post for a long time but, until now, I felt it would seem like mere whining … or conspiracy ranting (you’ll see).

Gathering complete, comprehensive election results seems an impossible task. It’s almost (get ready) as if They … don’t want us to have them (imagine my voice in any shrill tone you like).

It should be easier. Much easier.

I promise I won’t complain about every little thing because I’ve complained about some of these things before — such as

  • The FEC provides Excel files for some elections but only PDFs for others
  • The FEC API doesn’t provide access to election results at all
  • The aforementioned results aren’t available on their website for months
  • Politico has results in real time and even updates them for a couple days after the election but the page is then like an operating system update or rendering video … it stays at 99% complete forever. Even now, as of December 10, 2018, Alaska shows only 99.5% precincts reporting.

Last night, I opened VS Code to continue adding features to Election Insights (the web app formerly known as prezPlayPro) when I noticed a few things were somehow broken since my triumphant post of November 23 — not only did some results change but some maps were downright broken.

Long Story Short: I now test everything using a private or Incognito window so I can be at least a wee bit more sure I’m looking at the latest code. Results that I expected to change after a fix made 2-3 weeks ago finally showed up. So, nothing was broken or causing incorrect results but I only know the “new” results are accurate because I did some digging in several piles of data to confirm … digging which should have been easier.

I had two potential problems I needed to investigate. Two questions needed answering:

  • Did Evan McMullin really beat Darrell Castle in a buttload of states?
  • Why do I have two candidates (Darrell Castle and Emidio Soltysik) affiliated with the U.S. Taxpayers party in Michigan?

In my previous post, Final 2016 Presidential Election Maps, I realized I hadn’t included Evan McMullin in my arrays of “right-leaning” candidates. Much to my surprise, adding him to those arrays didn’t change any results (or so I thought). Last night, when I saw that including him may have drastically changed the results, I realized one of two things was true: either my code was broken or my database contained mistakes.

I chose to look into Texas because when I moved my hand, that’s where my cursor landed, showing me McMullin. My results are taken from the PDF from the FEC but, FORTUNATELY, I didn’t go directly to that PDF to confirm results. I also wanted to check party affiliations which I got from Ballotpedia (whom I’ve whined about previously for other issues even before the inaccuracy I just found). Otherwise, I wouldn’t have found some of the groovy things I did.

So, first, I went to the Ballotpedia page for Michigan’s 2016 presidential election results. Much to my relief, the mistake was theirs.

ballotpediaMichigan2016.png
I took all my party affiliations from their Results tables which, at least in this case, differ from the list above.

When I first started this project, I tried using Python’s Beautiful Soup to grab info like that in the above screenshot from Politico because they conveniently listed every state on a single page. Unfortunately, the code is filled with inconsistencies and invisible crap neither I nor Beautiful Soup could beat into submission. Also, if memory serves, candidates’ names were spelled differently on different state ballots. <– That’s infuriating fact #4,987 on the list.

So I just did some major cutting and pasting from fifty pages I saved from Ballotpedia, which sucked in its own way because you can’t right-click on their US map to open the state pages in separate tabs. You have to click each one and, after saving the state page, click the Back button to get back to the map.

Before I noticed the Ballotpedia candidate list contained different parties than the results table, I followed the link to their data source (Michigan’s Secretary of State or, as Ballotpedia calls it, “Department of State”) but when I clicked it, I got a 404. Several of the source links at Ballotpedia have the same result but I don’t know whether I should be frustrated with Ballotpedia for having broken links or, as I’d thought previously, frustrated with those states for not keeping their results pages up. My FEC results PDF lists parties for each candidate (but not, much to my chagrin, by state). There I found Soltysik listed as Natural Law Party (which is still kinda conservative, if my recollection is correct) and Socialist Party USA (like the Beach Boys song).

Not yet noticing the mistake in the screenshot above, I set my party affiliation problem aside for the moment and went to Ballotpedia’s Texas page so I could confirm my results (from the FEC) for Castle/McMullin.

ballotpediaTexas2016.png

Ballotpedia doesn’t even list McMullin as a candidate in Texas but does list 51,261 write-in votes. Ever the optimist, I clicked the link for Texas Secretary of State.

TexasSoS2016.png
And, as it turns out, McMullin wallops Castle in Texas.

Black gold! Texas tea! Comprehensive election results, that is!

Note that most of those 51k+ write-in votes are for a single candidate. I think that’s rather significant. If I were the type to post election results, I might consider including that bit of information. Of course, Ballotpedia is probably in the pocket of the Commission On Presidential Debates (who fit nicely in the pocket of Big Insurance who are run by the Illuminati).

Now I was curious whether Politico limited their results like Ballotpedia. I had to go there anyway to see what party affiliation they had for Soltysik, so … after finding Soltysik was accurately listed as NLP in Michigan, I saw Politico’s Texas results were wanting as much as Ballotpedia’s.

TexasPolitico2016.png
These are Politico’s “detailed” results.

Now I was grateful I couldn’t get Beautiful Soup working to my satisfaction with Politico. I’d have missed out on a bunch of candidates!

I still have much digging to do because far too many of my candidates have “null” for party affiliation — not to mention I now know I must fact-check whatever I find. Getting data directly from each state would be best, of course, but since Ballotpedia’s links don’t go anywhere, that won’t be as easy as I’d like.
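As a rough sketch of how that "null" cleanup list could be built client-side: the row shape below (`name`, `state`, `party`) and the placeholder candidates are assumptions for illustration, not my actual table or data.

```javascript
// Hypothetical rows; the field names are assumptions, not the real schema.
const candidates = [
  { name: "Candidate A", state: "AA", party: "Constitution" },
  { name: "Candidate B", state: "AA", party: null },
  { name: "Candidate C", state: "BB", party: null },
  { name: "Candidate B", state: "BB", party: "Independent" },
];

// Keep only rows whose party is missing (null, undefined, or empty string).
const missingParty = candidates.filter((c) => !c.party);

// Collapse to unique names so each candidate only gets looked up once.
const namesToCheck = [...new Set(missingParty.map((c) => c.name))];

console.log(namesToCheck);
```

Note that Candidate B lands on the list even though one state row has a party, which is the same by-state mismatch that bit me in Michigan.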


Prez Play Pro Progress

Up to now, spending this much time fixing stuff has been really frustrating because I wasn’t learning anything. I was just trying to figure out what was broken and why, and the cause wasn’t my own mistake or lack of knowledge.

This time, however, it is TOTALLY my lack of knowledge, so each time I hit a bump, I learn something. With each problem I solve, I learn how and why what I did wasn’t working and how to do it correctly!

A lot of my work super-recently has been learning and using JavaScript’s map, reduce, and filter methods. I was depending on SQL queries (my database being MySQL, if you’re curious … though I was using MariaDB before changing hosting plans) to get the exact results and data visualization I wanted but, for multiple reasons, I’d prefer getting all the results from my database and manipulating the data client-side with JavaScript.
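For anyone who hasn’t met those three methods, here is a minimal sketch of the idea. The rows, field names, and vote counts are made up for illustration; they only stand in for what a "give me everything" query might return.

```javascript
// Made-up rows standing in for a "select everything" query result.
const rows = [
  { state: "AA", candidate: "Candidate A", votes: 120 },
  { state: "AA", candidate: "Candidate B", votes: 80 },
  { state: "BB", candidate: "Candidate A", votes: 50 },
  { state: "BB", candidate: "Candidate C", votes: 150 },
];

// filter: keep one state's rows (roughly a SQL WHERE clause).
const stateAA = rows.filter((r) => r.state === "AA");

// reduce: total the votes in that state (roughly SUM).
const totalAA = stateAA.reduce((sum, r) => sum + r.votes, 0);

// map: derive a new value per row (here, each candidate's share of the state).
const sharesAA = stateAA.map((r) => ({
  candidate: r.candidate,
  share: r.votes / totalAA,
}));

console.log(totalAA, sharesAA);
```

The trade-off is one bigger download up front in exchange for not writing a new SQL query for every "what if" I want to map.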

Much of what follows is actually from an email I just sent to the American Nazi Party replying to an answer they sent to a question I asked a couple weeks ago. Say what you want about nazis, whenever I’ve asked a question, they’ve sent a polite and meaningful answer. Not all candidates/parties are responsive, let alone courteous. Anyway, I was explaining why I asked my questions and what I was doing.

So, here are my latest screenshots along with what is still in progress, etc.

If only what I consider “right” candidates run (excluding Trump because, of course, he’d win every state), then Gary Johnson wins every state. If we exclude him as well, Darrell Castle (usually Constitution Party) wins all states in which he runs except Arkansas, where Jim Hedges wins. The gray states have no conservative/right candidates.
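The "exclude the front-runner and see who wins next" game can be sketched roughly like this. The `winnersByState` helper, the `lean` tag, and all of the rows are hypothetical placeholders rather than my real code or real results.

```javascript
// Invented rows; the "lean" tag stands in for hand-built arrays of right/left candidates.
const results = [
  { state: "AA", candidate: "Candidate A", lean: "right", votes: 900 },
  { state: "AA", candidate: "Candidate B", lean: "right", votes: 300 },
  { state: "AA", candidate: "Candidate C", lean: "right", votes: 100 },
  { state: "BB", candidate: "Candidate A", lean: "right", votes: 700 },
  { state: "BB", candidate: "Candidate C", lean: "right", votes: 250 },
  { state: "CC", candidate: "Candidate D", lean: "left", votes: 400 },
];

// Hypothetical helper: the top vote-getter per state among one lean,
// after dropping everyone on the exclusion list.
function winnersByState(rows, lean, excluded = []) {
  return rows
    .filter((r) => r.lean === lean && !excluded.includes(r.candidate))
    .reduce((winners, r) => {
      const current = winners[r.state];
      if (!current || r.votes > current.votes) {
        winners[r.state] = { candidate: r.candidate, votes: r.votes };
      }
      return winners;
    }, {});
}

const allRight = winnersByState(results, "right");
const withoutA = winnersByState(results, "right", ["Candidate A"]);
const withoutAB = winnersByState(results, "right", ["Candidate A", "Candidate B"]);

console.log(allRight, withoutA, withoutAB);
```

A state with no remaining candidates (CC here) simply never shows up in the result object, which is what would make it a gray state on the map.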

Screen Shot 2018-11-17 at 10.17.24 AM.png

Whatever happened to the Natural Law Party and Reform Party? Maybe the former are all write-in candidates (Darn it! I need to get write-in data!). I know the Reform Party was only on a couple state ballots in the 2012 election — not only that but with two different candidates. I read somewhere at least one state Reform Party nominated a different candidate because, despite how conservative Andre Barnett was (and still is, I assume), they didn’t like that he was (and still is, I suppose) black.

If only “left” or even just socialist candidates run, Jill Stein wins every state except Nevada, South Dakota, and Oklahoma (off the top of my head I think it’s simply because no “left” candidates besides Clinton were on the ballot there). If we only include socialists except Jill Stein, these are the results:

Screen Shot 2018-11-17 at 10.24.07 AM.png

Michael Maturen looks like he wins the most states (I don’t have electoral college votes in the code yet) and there are five tied in second place for number of states. Again, gray means either there are no socialist candidates in those states (except Stein) or there are candidates but none received any votes (which I find difficult to believe). I need to clean up my data to make sure.
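Instead of eyeballing the map, counting states won and spotting ties could look something like the sketch below. The state-to-winner object is invented for illustration, and since electoral votes aren’t in my code yet, this counts states only.

```javascript
// Invented state -> winner lookup (placeholder states and candidates).
const stateWinners = {
  AA: "Candidate A",
  BB: "Candidate B",
  CC: "Candidate A",
  DD: "Candidate C",
  EE: "Candidate B",
  FF: "Candidate A",
  GG: "Candidate C",
};

// Count how many states each candidate wins.
const statesWon = Object.values(stateWinners).reduce((counts, name) => {
  counts[name] = (counts[name] || 0) + 1;
  return counts;
}, {});

// Group candidates by their count so ties sit side by side.
const byCount = Object.entries(statesWon).reduce((groups, [name, n]) => {
  (groups[n] = groups[n] || []).push(name);
  return groups;
}, {});

// Sort the counts from most states to fewest.
const ranking = Object.keys(byCount)
  .map(Number)
  .sort((a, b) => b - a)
  .map((n) => ({ states: n, candidates: byCount[n] }));

console.log(ranking);
```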

Once I get these all working, I’ll start adding previous elections and, hopefully, the next one won’t take two years to put together (in my own defense, most of that two years was the learning curve).

Semicolons Suck

After days and days and days of debugging and rewriting code … well, it wasn’t a missing or misplaced semicolon … but just as bad. Anyway … here is the Jill Stein landslide in the 2016 Presidential election (if the eleven socialist candidates were the only candidates):

Screen Shot 2018-10-18 at 9.51.39 PM.png

The gray states had zero votes for any socialist candidates (there may have been none; I don’t know off the top of my head).

States That Gained and Lost Electoral College Votes in the Last 30 Years

While working on a data visualization and web app project about presidential elections, I noticed that several states have either gained or lost votes (based on population) in the Electoral College with each census. Some states stayed the same throughout. None of them gained-then-lost or lost-then-gained.

States That Gained Votes Twice

  • Arizona
  • Florida
  • Georgia
  • Nevada
  • Texas

States That Lost Votes Twice

  • Illinois
  • Michigan
  • New York
  • Ohio
  • Pennsylvania

States That Stayed the Same

  • Alabama
  • Alaska
  • Arkansas
  • Hawaii
  • Idaho
  • Kansas
  • Kentucky
  • Maine
  • Maryland
  • Minnesota
  • Montana
  • Nebraska
  • New Hampshire
  • New Mexico
  • North Dakota
  • Oregon
  • Rhode Island
  • South Dakota
  • Tennessee
  • Vermont
  • Virginia
  • West Virginia
  • Wyoming

Others

  • California, Colorado, North Carolina each +1 vote in 2004
  • Connecticut, Indiana, Mississippi, Oklahoma, Wisconsin each -1 vote in 2004
  • South Carolina, Utah, Washington each +1 vote in 2012
  • Iowa, Louisiana, Massachusetts, Missouri, New Jersey each -1 vote in 2012
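
The sorting above can be done in a few lines rather than by squinting at a spreadsheet. This is only a sketch: the `classify` function and the object layout are my own illustrative choices, and the sample covers just five states, with each array holding that state’s electoral votes for the 2000, 2004, and 2012 elections.

```javascript
// Electoral votes per state for the 2000, 2004, and 2012 elections (small sample).
const electoralVotes = {
  Arizona: [8, 10, 11],
  "New York": [33, 31, 29],
  Alabama: [9, 9, 9],
  Utah: [5, 5, 6],
  Wisconsin: [11, 10, 10],
};

// Hypothetical helper: label a state by how its vote count moved between elections.
function classify(votes) {
  // Change from each election in the list to the next one.
  const deltas = votes.slice(1).map((v, i) => v - votes[i]);
  const gains = deltas.filter((d) => d > 0).length;
  const losses = deltas.filter((d) => d < 0).length;
  if (gains > 0 && losses > 0) return "mixed";
  if (gains === 2) return "gained twice";
  if (losses === 2) return "lost twice";
  if (gains === 1) return "gained once";
  if (losses === 1) return "lost once";
  return "stayed the same";
}

// Build a state -> label lookup.
const labels = Object.fromEntries(
  Object.entries(electoralVotes).map(([state, votes]) => [state, classify(votes)])
);

console.log(labels);
```

The "mixed" branch is there to catch any gained-then-lost or lost-then-gained state, which none of the states in this sample trigger.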

ecExcel