Data Science Isn’t Always Sexy and Glamorous

Sometimes data science is a bunch of debugging and fact-checking.

Bit o’ trivia about me: I got into this because I wanted to start using open data and APIs instead of constantly fact-checking frequently inaccurate data from my co-workers. Now, I find I have to fact-check my data. Not sexy. Not sexy at all.

I’ve wanted to write this post for a long time but, until now, I felt it would seem like mere whining … or conspiracy ranting (you’ll see).

Gathering complete, comprehensive election results seems an impossible task. It’s almost (get ready) as if They … don’t want us to have them (imagine my voice in any shrill tone you like).

It should be easier. Much easier.

I promise I won’t complain about every little thing because I’ve complained about some of these things before — such as

  • The FEC provides Excel files for some elections but only PDFs for others
  • The FEC API doesn’t provide access to election results at all
  • The aforementioned results aren’t available on their website for months
  • Politico has results in real time and even updates them for a couple days after the election but the page is then like an operating system update or rendering video … it stays at 99% complete forever. Even now, as of December 10, 2018, Alaska shows only 99.5% precincts reporting.

Last night, I opened VS Code to continue adding features to Election Insights (the web app formerly known as prezPlayPro) when I noticed a few things were somehow broken since my triumphant post of November 23 — not only did some results change but some maps were downright broken.

Long Story Short: I now test everything using a private or Incognito window so I can be at least a wee bit more sure I’m looking at the latest code. Results that I expected to change after a fix made 2-3 weeks ago finally showed up. So, nothing was broken or causing incorrect results but I only know the “new” results are accurate because I did some digging in several piles of data to confirm … digging which should have been easier.

I had two potential problems I needed to investigate. Two questions needed answering:

  • Did Evan McMullin really beat Darrell Castle in a buttload of states?
  • Why do I have two candidates (Darrell Castle and Emidio Soltysik) affiliated with the U.S. Taxpayers party in Michigan?

In my previous, Final 2016 Presidential Election Maps, post, I realized I hadn’t included Evan McMullin in my arrays of “right-leaning” candidates. Much to my surprise, adding him to those arrays didn’t change any results (or so I thought). Last night, when I saw that including him may have drastically changed the results, I realized one of two things was true — either my code was broken or my database contained mistakes.

I chose to look into Texas because when I moved my hand, that’s where my cursor landed, showing me McMullin. My results are taken from the PDF from the FEC but, FORTUNATELY, I didn’t go directly to that PDF to confirm results. I also wanted to check party affiliations which I got from Ballotpedia (whom I’ve whined about previously for other issues even before the inaccuracy I just found). Otherwise, I wouldn’t have found some of the groovy things I did.

So, first, I went to the Ballotpedia page for Michigan’s 2016 presidential election results. Much to my relief, the mistake was theirs.

ballotpediaMichigan2016.png
I took all my party affiliations from their Results tables which, at least in this case, differ from the list above.

When I first started this project, I tried using Python’s Beautiful Soup to grab info like that in the above screenshot, but from Politico, because they conveniently listed every state on a single page. Unfortunately, the code is filled with inconsistencies and invisible crap neither I nor Beautiful Soup could beat into submission. Also, if memory serves, candidates’ names were spelled differently on different state ballots. <– That’s infuriating fact #4,987 on the list.

So I just did some major cutting and pasting from fifty pages I saved from Ballotpedia, which sucked in its own way because you can’t right-click on their US map to open the state pages in separate tabs; you have to click each one and, after saving the state page, click the Back button to get back to the map.

Before I noticed the Ballotpedia candidate list contained different parties than the results table, I followed the link to their data source (Michigan’s Secretary of State or, as Ballotpedia calls it, “Department of State”) but when I clicked it, got a 404. Several of the source links at Ballotpedia have the same result but I don’t know whether I should be frustrated with Ballotpedia for having broken links or, as I’d thought previously, frustrated with those states for not keeping their results pages up. My FEC results PDF lists parties for each candidate (but not, much to my chagrin, by state). There I found Soltysik listed as Natural Law Party (which is still kinda conservative, if my recollection is correct) and Socialist Party USA (like the Beach Boys song).

Not yet noticing the mistake in the screenshot above, I set my party affiliation problem aside for the moment and went to Ballotpedia’s Texas page so I could confirm my results (from the FEC) for Castle/McMullin.

ballotpediaTexas2016.png

Ballotpedia doesn’t even list McMullin as a candidate in Texas but does list 51,261 write-in votes. Ever the optimist, I clicked the link for Texas Secretary of State.

TexasSoS2016.png
And, as it turns out, McMullin wallops Castle in Texas.

Black gold! Texas tea! Comprehensive election results, that is!

Note that most of those 51k+ write-in votes are for a single candidate. I think that’s rather significant. If I were the type to post election results, I might consider including that bit of information. Of course, Ballotpedia is probably in the pocket of the Commission On Presidential Debates (who fit nicely in the pocket of Big Insurance who are run by the Illuminati).

Now I was curious whether Politico limited their results like Ballotpedia. I had to go there anyway to see what party affiliation they had for Soltysik, so … after finding Soltysik was accurately listed as NLP in Michigan, I saw Politico’s Texas results were wanting as much as Ballotpedia’s.

TexasPolitico2016.png
These are Politico’s “detailed” results.

Now I was grateful I couldn’t get Beautiful Soup working to my satisfaction with Politico. I’d have missed out on a bunch of candidates!

I still have much digging to do because far too many of my candidates have “null” for party affiliation — not to mention I now know I must fact-check whatever I find. Getting data directly from each state would be best, of course, but since Ballotpedia’s links don’t go anywhere, that won’t be as easy as I’d like.
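Two of the checks described above (candidates with “null” for party, and two candidates claiming the same party in the same state) are easy to automate. Here is a minimal sketch, assuming each row looks like { state, candidate, party }. The row shape, the function names, and the placeholder data are all hypothetical; this is not the app’s actual code or schema.

```javascript
// Hypothetical rows; states, names, and parties are placeholders.
const affiliationRows = [
  { state: "AA", candidate: "Candidate A", party: "Party X" },
  { state: "AA", candidate: "Candidate B", party: "Party X" },
  { state: "AA", candidate: "Candidate C", party: null },
  { state: "BB", candidate: "Candidate A", party: "Party X" },
];

// Rows that still need a party looked up (null, undefined, or empty).
function missingParty(rows) {
  return rows.filter((r) => !r.party);
}

// State/party pairs claimed by more than one candidate,
// which usually signals a mistake in the source data.
function sharedParty(rows) {
  const groups = new Map();
  for (const r of rows) {
    if (!r.party) continue; // nothing to compare without a party
    const key = `${r.state}|${r.party}`;
    if (!groups.has(key)) groups.set(key, new Set());
    groups.get(key).add(r.candidate);
  }
  return [...groups]
    .filter(([, names]) => names.size > 1)
    .map(([key, names]) => ({ key, candidates: [...names] }));
}

console.log(missingParty(affiliationRows));
console.log(sharedParty(affiliationRows));
```

A check like this only flags suspicious rows. Deciding which source is right still takes the kind of digging described above.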

Final 2016 Election Maps

I am quite pleased with myself and I’ve learned a lot. So much more to do now that these basic options are working but here are the final maps for them (with some updated colors).

legend.png

So I could use the conventional “red state” and “blue state” for options using “All Right Wing” and “All Left Wing” candidates, respectively, I changed Jill Stein (Green) to, well, green and Donald Trump to orange because, you know, he’s orange.

Purpose: My initial idea was based on a question I asked while looking at my first ballot, voting in Michigan in 1988: “Golly, if all the socialist and communist candidates teamed up instead of having five different candidates, could their combined forces change the outcome?”

So this little app provides answers to questions like: “If all ‘Left Wing’ candidates combined their votes, would there be a President Trump?”
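That question boils down to summing one group’s votes per state and comparing the total against a single opponent. A minimal sketch, assuming result rows shaped like { state, candidate, votes }. The row shape, the function names, the state codes, and the vote counts are invented for illustration and are not real 2016 results.

```javascript
// Hypothetical rows: made-up vote counts, not real results.
const sampleRows = [
  { state: "AA", candidate: "Trump", votes: 1000 },
  { state: "AA", candidate: "Clinton", votes: 950 },
  { state: "AA", candidate: "Stein", votes: 80 },
  { state: "BB", candidate: "Trump", votes: 2000 },
  { state: "BB", candidate: "Clinton", votes: 1500 },
  { state: "BB", candidate: "Stein", votes: 100 },
];

// Assumed grouping; who counts as "Left Wing" is a judgment call.
const leftWing = ["Clinton", "Stein"];

// Sum the votes of every candidate in `names` for one state.
function combinedVotes(rows, state, names) {
  return rows
    .filter((r) => r.state === state && names.includes(r.candidate))
    .reduce((sum, r) => sum + r.votes, 0);
}

// List the states where the combined group outpolls the single opponent.
function statesFlipped(rows, names, opponent) {
  const states = [...new Set(rows.map((r) => r.state))];
  return states.filter(
    (s) => combinedVotes(rows, s, names) > combinedVotes(rows, s, [opponent])
  );
}

console.log(statesFlipped(sampleRows, leftWing, "Trump"));
```

With the invented numbers above, the combined group overtakes the opponent in AA (1,030 to 1,000) but not in BB (1,600 to 2,000).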

First, I need to define my terms and you may or may not agree with them. Some things to take into consideration: I have 31 candidates in my database. Deciding where they fall on the political spectrum would be as easy as looking at their 31 party platforms, except that many of them run in different parties in different states. Some parties have different candidates in different states. At least one candidate, off the top of my head, is in one state with a conservative party and another state with a liberal party.

You can’t really use war as a litmus test because you’ll get both Libertarians and Communists in your group. Same thing with legalization of marijuana. Despite Pro-Lifers having the American Solidarity Party as well as Libertarians for Life — unless a Pro-Life candidate is running with a party and right-to-life is part of their platform (ASP), they go in All Right. Marriage, ironically and inaccurately, is a good litmus test. For some reason, many people calling themselves conservatives think we should have laws about marriage — the Libertarians are the only ones who are consistent (on this issue as well as drugs).

All Socialists – Candidates affiliated with a party that has the words “socialist,” “communist,” or “workers” in its name, or whose party platform planks/issues include “socialized healthcare” and “free” college education. If the candidate runs with a socialist party in one state but a conservative party in another*, they’re on my socialist list.

*See _____ post for more on this (and why Soltysik appears as UST in the above screenshot)!
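The name-based half of the “All Socialists” rule can be sketched in a few lines. This is only an illustration under assumptions: the function names and the placeholder candidates are hypothetical, and the platform-based half of the rule (healthcare, college) still needs a hand-maintained list, since a party name alone won’t reveal it.

```javascript
// Keywords taken from the "All Socialists" definition above.
const SOCIALIST_WORDS = ["socialist", "communist", "workers"];

// True when a party name contains one of the keywords (case-insensitive).
function isSocialistParty(partyName) {
  const name = (partyName || "").toLowerCase();
  return SOCIALIST_WORDS.some((word) => name.includes(word));
}

// A candidate lands on the socialist list if ANY of their state
// affiliations is a socialist party, whatever they ran as elsewhere.
function isSocialistCandidate(affiliations) {
  return affiliations.some((a) => isSocialistParty(a.party));
}

// Hypothetical affiliation rows; states and party pairings are placeholders.
const candidateA = [
  { state: "AA", party: "Socialist Party USA" },
  { state: "BB", party: "Some Conservative Party" },
];
const candidateB = [
  { state: "AA", party: "Constitution" },
  { state: "BB", party: null },
];

console.log(isSocialistCandidate(candidateA));
console.log(isSocialistCandidate(candidateB));
```

Matching on the name is crude: a party with none of the three words in its name falls through and has to be added by hand.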

All Left – All of the above as well as what I call “center-Left”. In at least 99% of cases, “Pro-Choice” is a pretty reliable litmus test. Strengthening or expanding the federal government is probably even better.

All Right – Lowering taxes, eliminating taxes and/or Social Security, if they rant on about defense and terrorism (not that socialists and communists are for terrorism, they just feel the “strong defense” stance is an excuse for doing other crap they can’t abide … kinda the way “right wing” people feel about universal healthcare). See Pro-Life paragraph above — they go here by default. Reducing/eliminating national/federal agencies/laws goes here.

There is no “All Far Right” or equivalent to “All Socialists” because nazi or other white supremacist candidates seem to be, collectively, in the closet — in “public” or “political” life anyway. They may show up to parades and protests but won’t have a party on the ballot. I know this because the American Nazi Party was nice enough to answer some questions I had (in 2012 and just a couple weeks ago) about why they weren’t on ballots anymore. The closest I come to this is “All Right Except Donald Trump and Gary Johnson” but that is not to even imply those candidates are nazis or whatever — it’s just the closest this map has that may include those people. Personally — and this is just my theory — those candidates are probably more likely to slip into a mainstream party than a third party.

There are moderate, centrist candidates not included in either right or left. If they have no party (such as an independent) and their website’s description of their stances on issues aren’t very enlightening, they get left out.

Without further ado …

everyone.png
Everyone — popular votes only.

Trump and Clinton win every state in which they run. Clinton gets a relatively light blue because I reserved blue itself for “All Left” and she’s not really left–she’s just leftish of Trump.

allSocialists.png
All Socialists

Jill Stein of the Green Party, whom I consider a socialist (and for whom I voted so just let that help clarify — or collapse — your opinions of me and my opinions in this post) wins every state in which she runs.

noGreen2.png
All Socialists except Jill Stein

Here’s where the socialists lose — because they’re not on the ballot in a lot of states and, even if they’re legitimate “write in” candidates, the public is largely unaware of them — so they lose a lot of states. State laws determine who goes on the ballot and which “write in” candidates are valid when written in.

At present, those two options still use a SQL query to filter the candidates from a database with a table full o’ 2016 election results using various joins and whatnot. For multiple reasons, I use a much simpler SQL query for the rest that gets all the results and I filter them using client-side JavaScript methods such as filter, reduce, and map. Very soon, I’ll refactor those first two options to do the same.
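For anyone curious what the client-side version might look like, here is a minimal sketch of the filter, reduce, and map chain. It assumes rows shaped like { state, candidate, votes }; the row shape, the function name, the state codes, and the vote counts are hypothetical, and the app’s real code may differ.

```javascript
// Hypothetical rows, as they might arrive from the "everyone" query.
const allRows = [
  { state: "AA", candidate: "Stein", votes: 500 },
  { state: "AA", candidate: "La Riva", votes: 40 },
  { state: "AA", candidate: "Kennedy", votes: 60 },
  { state: "BB", candidate: "Stein", votes: 300 },
  { state: "BB", candidate: "La Riva", votes: 25 },
];

// Keep only the chosen candidates, then find the top vote-getter per state.
function winnersByState(rows, included) {
  const best = rows
    .filter((r) => included.includes(r.candidate))
    .reduce((acc, r) => {
      // On a tie, the row seen first keeps the state.
      if (!acc[r.state] || r.votes > acc[r.state].votes) acc[r.state] = r;
      return acc;
    }, {});
  // Flatten to a list of { state, winner } objects a map could be drawn from.
  return Object.values(best).map((r) => ({ state: r.state, winner: r.candidate }));
}

const socialists = ["Stein", "La Riva", "Kennedy"];
const exceptStein = socialists.filter((name) => name !== "Stein");

console.log(winnersByState(allRows, socialists));
console.log(winnersByState(allRows, exceptStein));
```

Each map option then becomes a different `included` list fed to the same function, instead of a separate SQL query with its own joins.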

Slowing things down a bit is that PHP script that gets all the data with the “everyone” SQL query. That is one of the reasons I’ll soon use IDB to store that data locally with the user. The other reason I’m doing that is I’m very much an offline-first disciple.

allRight_noTrump.png
All Right Except Trump [accidentally excluding McMullin]
If we only consider “right wing” candidates, Trump would, of course, win every state. Removing Trump as a candidate, Libertarian Gary Johnson wins every state, which is even more boring than “All Socialists” above, so the user can remove Johnson as well.

Update: As it turns out … adding McMullin takes away Gary Johnson’s 50-state sweep:

allRightNoTrump.png

allRight_noTrump_norJohnson.png
All Right neither Trump nor Johnson [accidentally excluding McMullin]
Darrell Castle (Constitution Party, mostly) wins every state in which he’s on the ballot … wait a second … DAMMIT! Sorry … because Evan McMullin ran as an Independent, I didn’t include him in All Right … I need to go back and change that. After I fix that, I’ll post a new screenshot for this option … I think that might also affect the “All Left vs All Right” map. It’s possible but, I think, unlikely, that it would affect Gary Johnson’s 50-state sweep above.

Update: McMullin changes the above:

allRightNoTrumpNorJohnson.png
As it turns out, McMullin makes a HUGE difference.

allRight_vs_Clinton.png
All Right [accidentally excluding McMullin] vs Clinton
Even without including Evan McMullin, Clinton loses several states when facing the combined forces of the Right Wing.

Update: McMullin doesn’t change the above

allLeft_vs_Trump.png
All Left vs Trump

The combined forces of the Left Wing take Michigan, Wisconsin, Nevada, Pennsylvania, New Hampshire, and Maine from Trump. Thanks, Hillary, your refusal to act like a decent human being lost “us” … let’s see … Michigan, Wisconsin, Nevada, Pennsylvania, New Hampshire, and Maine. Let’s not neglect to also thank the DNC for conspiring against Bernie Sanders. Way to go, dumbasses–hope you’re happy with the consequences. While I’m at it–thanks to the DNC for our involvement in the Vietnam War. If you hadn’t meddled in that election, Nixon would have won and–love him or hate him–he got us out of the mess your guys made so he probably wouldn’t have gotten us into it in the first place.

allLeft_vs_allRight.png
All Left vs All Right [accidentally excluding McMullin]
Even without McMullin’s help, the Right Wing grabs Maine, Minnesota and New Hampshire from Clinton.

So … let me fix my allRight arrays and retake some screenshots.

Update: McMullin changes the above:

allLeft_allRight.png
McMullin adds New Mexico and Colorado!

Aw, how sad for McMullin … I’m sure things would have been different had he been in the primaries. Maybe. Who can fathom Trump even in hindsight?

Soon, btw, the user will be able to do this sort of thing a little easier once the legend is not only dynamic but interactive.

Prez Play Pro Progress

Usually, when I spend this much time fixing stuff, it’s really frustrating because I’m not learning anything. I’m just trying to figure out what’s broken and why, and the cause isn’t my own mistake or lack of knowledge.

This time, however, it is TOTALLY my lack of knowledge so each time I hit a bump, I learn something. With each problem I solve, I learn how and why what I did wasn’t working and how to do it correctly!

A lot of my work super-recently has been learning and using JavaScript’s map, reduce, and filter methods. I was depending on SQL queries (my database being MySQL, if you’re curious … though I was using MariaDB before changing hosting plans) to get the exact results and data visualization I wanted but, for multiple reasons, I’d prefer getting all the results from my database and manipulating the data client-side with JavaScript.

Much of what follows is actually from an email I just sent to the American Nazi Party replying to an answer they sent to a question I asked a couple weeks ago. Say what you want about nazis, whenever I’ve asked a question, they’ve sent a polite and meaningful answer. Not all candidates/parties are responsive, let alone courteous. Anyway, I was explaining why I asked my questions and what I was doing.

So, here are my latest screenshots along with what is still in progress, etc.

If only what I consider “right” candidates run — excluding Trump because, of course, he’d win every state, then Gary Johnson wins every state. If we exclude him as well, Darrell Castle (usually Constitution Party) wins all states in which he runs except Arkansas where Jim Hedges wins. The gray states have no conservative/right candidates.

Screen Shot 2018-11-17 at 10.17.24 AM.png

Whatever happened to the Natural Law Party and Reform Party? Maybe the former are all write-in candidates (Darn it! I need to get write-in data!). I know the Reform Party was only on a couple state ballots in the 2012 election — not only that but with two different candidates. I read somewhere at least one state Reform Party nominated a different candidate because, despite how conservative Andre Barnett was (and still is, I assume), they didn’t like that he was (and still is, I suppose) black.

If only “left” or even just socialist candidates run, Jill Stein wins every state except Nevada, South Dakota, and Oklahoma (off the top of my head I think it’s simply because no “left” candidates besides Clinton were on the ballot there). If we only include socialists except Jill Stein, these are the results:

Screen Shot 2018-11-17 at 10.24.07 AM.png

Michael Maturen looks like he wins the most states (I don’t have electoral college votes in the code yet) and there are five tied in second place for number of states. Again, gray means there are either no socialist candidates in those states (except Stein) or, if there are — I need to clean up my data to make sure — there are candidates but none received any votes (which I find difficult to believe).
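Counting states won, and spotting who is tied for second, is another small reduce job. A minimal sketch, assuming a list of { state, winner } objects has already been computed. The function names and the placeholder data are hypothetical and do not reproduce the map above.

```javascript
// Hypothetical state winners; the state codes and results are invented.
const stateWinners = [
  { state: "AA", winner: "Candidate A" },
  { state: "BB", winner: "Candidate A" },
  { state: "CC", winner: "Candidate B" },
  { state: "DD", winner: "Candidate C" },
];

// Count states per winner, then sort from most states to fewest.
function statesWon(winners) {
  const counts = winners.reduce((acc, w) => {
    acc[w.winner] = (acc[w.winner] || 0) + 1;
    return acc;
  }, {});
  return Object.entries(counts)
    .map(([candidate, states]) => ({ candidate, states }))
    .sort((a, b) => b.states - a.states);
}

// Everyone sharing the second-highest state count.
function tiedForSecond(ranking) {
  const distinct = [...new Set(ranking.map((r) => r.states))];
  if (distinct.length < 2) return [];
  return ranking.filter((r) => r.states === distinct[1]).map((r) => r.candidate);
}

const ranking = statesWon(stateWinners);
console.log(ranking);
console.log(tiedForSecond(ranking));
```

Swapping the count of 1 per state for each state’s electoral votes would be the natural next step once that data is in the code.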

Once I get these all working, I’ll start adding previous elections and, hopefully, the next one won’t take two years to put together (in my own defense, most of that two years was the learning curve).

Color Coded Candidates

In keeping with the easily-recognized “red state” and “blue state” color code, I’ve made (for the most part) the right and right-ish candidates red, centrist-y candidates purple, left-ish candidates blue, and (again, for the most part) socialists green. It was tedious and awful so I’m not “fixing” anything. Deal.

Screen Shot 2018-10-31 at 4.48.21 PM.png

I ran into a couple issues, however. A couple candidates ran under different parties that are definitely not on the same side of the spectrum.

I could be wrong in my interpretation of where a party may fall on my very subjective spectrum like Chris Keniston running as a Socialist Workers Party and Veterans Party candidate — I might just need to read up more about the latter as well as Emidio Soltysik with the Socialist Party USA and U.S. Taxpayers party. I’d say that I may need to research the U.S. Taxpayers party but Darrell Castle ran under them so … I’m thinking they’re pretty conservative.

Socialist Results for 2016 Election

Finally.

Here are the results if only socialist candidates are considered — whether or not their party has the word “socialist” in the name.

Screen Shot 2018-10-31 at 12.00.11 PM.png
Gray states either had no candidate or their candidates received zero ballot* votes.

You’ve seen that one before.

*I should mention I don’t have any data for write-in candidates.

Here are the results if we don’t include Jill Stein or the Green Party.

Screen Shot 2018-10-31 at 11.59.23 AM.png
Gray states either had no candidate or their candidates received zero ballot* votes.

LEGEND (obsolete but matches above)

  • LIGHT GREEN = Alyson Kennedy
  • LIGHT BLUE = Emidio Soltysik
  • DARK BLUE = Gloria Estela La Riva
  • DARK GREEN = Chris Keniston
  • PURPLE = Monica Moorehead
  • PINK = Michael A. Maturen
  • YELLOW = Lynn S. Kahn

Please note that not all candidates are on all ballots. Most of the 31 candidates for President who were on a ballot were on few states’ ballots. Ooh — there’s an idea for another map! Showing which states each candidate was on the ballot … and I’ll include write-in status. Hmm … I wonder if OpenElections has write-in data. If they don’t, I can totally volunteer!
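The data behind a ballot-access map like that could be built by grouping rows per candidate. A minimal sketch, assuming rows shaped like { state, candidate, writeIn }, where `writeIn` is a hypothetical boolean flag. The function name and the placeholder rows are invented too, and no real ballot-access data is shown.

```javascript
// Hypothetical rows; `writeIn` marks write-in status rather than a ballot line.
const ballotRows = [
  { state: "AA", candidate: "Candidate A", writeIn: false },
  { state: "BB", candidate: "Candidate A", writeIn: true },
  { state: "AA", candidate: "Candidate B", writeIn: false },
];

// Group states per candidate, split into on-ballot and write-in lists.
function ballotAccess(rows) {
  return rows.reduce((acc, r) => {
    if (!acc[r.candidate]) acc[r.candidate] = { onBallot: [], writeIn: [] };
    acc[r.candidate][r.writeIn ? "writeIn" : "onBallot"].push(r.state);
    return acc;
  }, {});
}

console.log(ballotAccess(ballotRows));
```

Each candidate’s two lists could then drive two shades on the map, one for ballot lines and one for valid write-in status.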

Anyway, the “problem” isn’t just a lack of unity but also a lack of numbers in the respective organizations — in both individuals and states covered.

Often, a single candidate is running in multiple parties across as many states.

Sometimes, there’s lack of unity within a single party — multiple candidates across as many states.

Anyway, here are the other options that will be available soon (along with an updated legend):

choiceslegend.png

Similar maps for socialist results of the 2012 election.

These are done with JavaScript, D3, PHP, MySQL. Now starting the interactive features.

Semicolons Suck

After days and days and days of debugging and rewriting code … well, it wasn’t a missing or misplaced semicolon … but just as bad. Anyway … here is the Jill Stein landslide in the 2016 Presidential election (if the eleven socialist candidates were the only candidates):

Screen Shot 2018-10-18 at 9.51.39 PM.png

The gray states had zero votes for any (there may have been none — I don’t know off the top of my head) socialist candidates.

Schema Revision and PoliticalPorn

I got to my sillyDayJob early this morning so I could work on my SQL query because using JSON and JavaScript methods got the same weird results. I went to print out the picture of my revised schema and saw the image was more out of date than I thought so I opened it in Photoshop and updated it.

I keep wanting to just put all of this information in one table but I’m very stubborn about wanting to master SQL for properly normalized databases. While working on that, some lady came into our office and asked if I would be interested in a League of Women Voters guide to amendments on the ballot. Would I? I jumped up and shouted, “And how!” No, actually I didn’t but that’s how I felt. I then really wanted to be all Mad Men and shout, “Thanks, this is sure swell!” because … obviously, I’m just a big geek.

It feels good to be a political geek again.

andHow
Yes, people in Florida can read. Yes, people in Florida vote. *sigh*

Hmm … a couple of those don’t show the table’s primary id column. [Below I have an updated update of the revised revision.]

But … already … I think I know how it should and will work.

In related news, two of my pull requests for Hacktoberfest are for OpenElections repos.

Which … makes me sad … because it makes me think about the fact that if openFEC provided election data through their API instead of jacked up PDFs, etc. then … the fine folks at OpenElections wouldn’t have to work so hard.

Update 13 hours later: Getting closer …

Screen Shot 2018-10-12 at 7.14.39 PM.png

There’s probably a better way to do it, but … it’s working (so far) and that’s all I care about at this point.

Screen Shot 2018-10-12 at 7.14.14 PM.png

Here is the corrected revised revision of the revision.

schema_101618.png