Data Dumpster Diving

I was so geeked, thrilled, stoked, tickled, giddy, and basically just happy when I first started Using the Last.fm and MusicBrainz APIs because, as I noted in that post, while both MusicBrainz and LastFM separate Alice Cooper the person from Alice Cooper the group, LastFM gives them both the same stats for Listeners and Playcount. That made combining the separate JSON files easy.

For some reason that eludes me, I noted that Joan Jett and Joan Jett and the Blackhearts, who are also the Same Artist with Different MBIDs on MusicBrainz and LastFM, do NOT share the same statistics AND THEN I COMPLETELY FORGOT THAT FACT. I realized it today. I’ve been combining her (and Alice’s) albums from separate JSON files into one without bringing along each artist’s listeners and playcount stats, then deleting the separate files I thought I didn’t need.

So … I need to go back and delete all the rows for Joan Jett because they’re inaccurate.

Here is hoping I have some of them in my laptop’s trash. It doesn’t have a trash can on the desktop so I rarely empty it. I am almost OCD when it comes to emptying the Recycle Bin on my Windows PC at work and the trash on my Mac at home (I am hoping maybe there’s one or two pairs there, however).

And I have the screenshot in that “Same Artist” post … so that’s one row I can create manually.

*Sigh*

I just dove into the trash on my laptop … and came out with twenty-freaking-six JSON files! That’s thirteen pairs I can add.

I also need to tweak my combine.py file that I am still deliriously in love with and need to write about.
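
The gist of that tweak, as a minimal sketch. It is written in JavaScript rather than Python and is not the actual combine.py; the file layout, the `statsByMbid` name, the placeholder MBIDs, and the numbers are all made up for illustration. The only point is that each MBID keeps its own listeners and playcount when the albums get merged.

```javascript
// Hypothetical file shapes: the real JSON files may be laid out differently.
const person = {
  mbid: 'person-mbid', // placeholder, not a real MBID
  name: 'Joan Jett',
  stats: { listeners: 100, playcount: 1000 }, // made-up numbers
  albums: ['Bad Reputation'],
};
const group = {
  mbid: 'group-mbid', // placeholder, not a real MBID
  name: 'Joan Jett and the Blackhearts',
  stats: { listeners: 250, playcount: 4000 }, // made-up numbers
  albums: ["I Love Rock 'n Roll"],
};

// Merge the album lists, but keep each MBID's own stats instead of
// assuming the two files agree.
function combine(a, b) {
  return {
    name: a.name,
    albums: [...a.albums, ...b.albums],
    statsByMbid: {
      [a.mbid]: a.stats,
      [b.mbid]: b.stats,
    },
  };
}

const combined = combine(person, group);
console.log(JSON.stringify(combined, null, 2));
```

With the stats stored per MBID, deleting the two source files afterwards no longer throws away the numbers that differ.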

But then I just checked my artistsLastFM table (where I store the LastFM data) filtering with the MBID for Joan Jett and the Blackhearts from the artists table so I could see if I am still missing any days for Miss Jett … and it returned zero results.

No need to be scared … because I have all the beautifully stinky JSON files from the trash … and I can check using her “person” MBID because maybe …

… and YES! There are 16 rows …

And guess what? One of those “extra” rows is for the date of the post with the screenshot so I can fix that, too!

I have never been so happy to do manual work. And I’m gonna back up these files like right NOW.

Data Science Isn’t Always Sexy and Glamorous

Sometimes data science is a bunch of debugging and fact-checking.

Bit o’ trivia about me: I got into this because I wanted to start using open data and APIs instead of constantly fact-checking frequently inaccurate data from my co-workers. Now, I find I have to fact-check my data. Not sexy. Not sexy at all.

I’ve wanted to write this post for a long time but, until now, I felt it would seem like mere whining … or conspiracy ranting (you’ll see).

Gathering complete, comprehensive election results seems an impossible task. It’s almost (get ready) as if They … don’t want us to have them (imagine my voice in any shrill tone you like).

It should be easier. Much easier.

I promise I won’t complain about every little thing because I’ve complained about some of these things before — such as

  • The FEC provides Excel files for some elections but only PDFs for others
  • The FEC API doesn’t provide access to election results at all
  • The aforementioned results aren’t available on their website for months
  • Politico has results in real time and even updates them for a couple days after the election but the page is then like an operating system update or rendering video … it stays at 99% complete forever. Even now, as of December 10, 2018, Alaska shows only 99.5% precincts reporting.

Last night, I opened VS Code to continue adding features to Election Insights (the web app formerly known as prezPlayPro) when I noticed a few things were somehow broken since my triumphant post of November 23 — not only did some results change but some maps were downright broken.

Long Story Short: I now test everything using a private or Incognito window so I can be at least a wee bit more sure I’m looking at the latest code. Results that I expected to change after a fix made 2-3 weeks ago finally showed up. So, nothing was broken or causing incorrect results but I only know the “new” results are accurate because I did some digging in several piles of data to confirm … digging which should have been easier.

I had two potential problems I needed to investigate. Two questions needed answering:

  • Did Evan McMullin really beat Darrell Castle in a buttload of states?
  • Why do I have two candidates (Darrell Castle and Emidio Soltysik) affiliated with the U.S. Taxpayers party in Michigan?

In my previous post, Final 2016 Presidential Election Maps, I realized I hadn’t included Evan McMullin in my arrays of “right-leaning” candidates. Much to my surprise, adding him to those arrays didn’t change any results (or so I thought). Last night, when I saw that including him may have drastically changed the results, I realized one of two things was true: either my code was broken or my database contained mistakes.

I chose to look into Texas because when I moved my hand, that’s where my cursor landed, showing me McMullin. My results are taken from the PDF from the FEC but, FORTUNATELY, I didn’t go directly to that PDF to confirm results. I also wanted to check party affiliations which I got from Ballotpedia (whom I’ve whined about previously for other issues even before the inaccuracy I just found). Otherwise, I wouldn’t have found some of the groovy things I did.

So, first, I went to the Ballotpedia page for Michigan’s 2016 presidential election results. Much to my relief, the mistake was theirs.

ballotpediaMichigan2016.png
I took all my party affiliations from their Results tables which, at least in this case, differ from the list above.

When I first started this project, I tried using Python’s Beautiful Soup to grab info like that in the above screenshot from Politico because they conveniently listed every state on a single page. Unfortunately, the code is filled with inconsistencies and invisible crap neither I nor Beautiful Soup could beat into submission. Also, if memory serves, candidates’ names were spelled differently on different state ballots. (That’s infuriating fact #4,987 on the list.)

So I just did some major cutting and pasting from fifty pages I saved from Ballotpedia, which sucked in its own way because you can’t right-click on their US map to open them in separate tabs … you have to click each one and, after saving the state page, click the Back button to get back to the map.

Before I noticed the Ballotpedia candidate list contained different parties than the results table, I followed the link to their data source (Michigan’s Secretary of State or, as Ballotpedia calls it, “Department of State”) but when I clicked it, I got a 404. Several of the source links at Ballotpedia have the same result but I don’t know whether I should be frustrated with Ballotpedia for having broken links or, as I’d thought previously, frustrated with those states for not keeping their results pages up. My FEC results PDF lists parties for each candidate (but not, much to my chagrin, by state). There I found Soltysik listed as Natural Law Party (which is still kinda conservative, if my recollection is correct) and Socialist Party USA (like the Beach Boys song).

Not yet noticing the mistake in the screenshot above, I set my party affiliation problem aside for the moment and went to Ballotpedia’s Texas page so I could confirm my results (from the FEC) for Castle/McMullin.

ballotpediaTexas2016.png

Ballotpedia doesn’t even list McMullin as a candidate in Texas but does list 51,261 write-in votes. Ever the optimist, I clicked the link for Texas Secretary of State.

TexasSoS2016.png
And, as it turns out, McMullin wallops Castle in Texas.

Black gold! Texas tea! Comprehensive election results, that is!

Note that most of those 51k+ write-in votes are for a single candidate. I think that’s rather significant. If I were the type to post election results, I might consider including that bit of information. Of course, Ballotpedia is probably in the pocket of the Commission On Presidential Debates (who fit nicely in the pocket of Big Insurance who are run by the Illuminati).

Now I was curious if Politico limited their results like Ballotpedia. I had to go there anyway to see what party affiliation they had for Soltysik, so … after finding Soltysik was accurately listed as NLP in Michigan, I saw Politico’s Texas results were wanting as much as Ballotpedia’s.

TexasPolitico2016.png
These are Politico’s “detailed” results.

Now I was grateful I couldn’t get Beautiful Soup working to my satisfaction with Politico. I’d have missed out on a bunch of candidates!

I still have much digging to do because far too many of my candidates have “null” for party affiliation — not to mention I now know I must fact-check whatever I find. Getting data directly from each state would be best, of course, but since Ballotpedia’s links don’t go anywhere, that won’t be as easy as I’d like.

Promises for Five-Year-Olds

This is a work in progress …

I’m happily going through the Grow with Google Challenge Scholarship: Mobile Web course at Udacity. We use JavaScript promises throughout, and I kinda sorta basically understood the basic concept. I didn’t completely get it but I knew we’d cover it later and everyone in the forums raved about Udacity’s standalone JavaScript Promises course so I planned on taking that after finishing Mobile Web. I wasn’t worried. I understood the structure just enough to put things together and pass the quizzes (eventually) by basing my answers on examples we were given.

I got the How (they worked structurally) but not the What (they were doing) or the Why (would I use them versus callbacks and/or Event Listeners).

I got to the Promises part of Mobile Web and it still didn’t really make sense to me. So I stepped out and started the JavaScript Promises course. That was great for a few minutes then I was lost again. I can’t tell you how many times in both courses I’d said, “Wait, wait … what?!”

This isn’t a criticism of those courses and sections. My learning style is just incompatible with certain teaching styles. I don’t do well when people say things like, “A promise is a promise that returns a promise—get it?”

No. No sir or ma’am, I don’t.

I posted a question in the forums, “ELI5 How Promises Work” and gave a couple examples of how I thought they worked based on the only thing I really knew about them – they were an ES6 replacement for callbacks … right?

I didn’t wait for answers because I was scared I’d get buried under other terminology and crap that meant nothing to me and would only confuse me more. I grabbed my Google Search box by the throat like Jacob wrestling with God and shouted, “I’m not letting go until you explain promises to me!”

The MDN documentation had me for just a moment or two as well but also lost me.

For the life of me, I couldn’t understand how Promises were related to Callbacks when (I thought) one was replacing the other and it seemed to me callbacks were doing all the work anyway. And … eventListeners, right?

The ridiculously awesome Jake Archibald wrote a legendary thingy on Promises that finally started me down the road to understanding. First of all, I realized callbacks aren’t automatically asynchronous … a plain one is merely blocking, meaning … well, I love Jake’s explanation (which I’ve paraphrased a wee bit):

Sneezing is a blocking function. All current activity must be suspended for the duration of the sneeze. You don’t want to write code that’s sneezy.

That made perfect sense to me. A plain callback isn’t asynchronous. It isn’t “something is happening and I’ll do other stuff while it’s happening and when it’s done it will let me know.” It’s “everything stops while I do this thing and when I’m done doing this thing everything starts up again.” Got it.
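
A tiny sketch of that “everything stops” behavior, with made-up names (`forEachItem`, `order`). It only covers a plain callback that gets called directly; a callback handed to something like `setTimeout` is scheduled for later instead, which is a different situation.

```javascript
const order = [];

// A plain callback: forEachItem calls cb right away, one item at a time,
// and nothing after the call runs until every cb has finished.
function forEachItem(items, cb) {
  for (const item of items) {
    cb(item);
  }
}

order.push('before');
forEachItem(['a', 'b'], (item) => order.push('callback ' + item));
order.push('after');

console.log(order.join(', '));
// before, callback a, callback b, after
```

“after” can’t sneak in ahead of the callbacks, because the whole thing runs top to bottom like any other function call.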

Jake also said,

“Promises are a bit like eventListeners except … we’re less interested in the exact time something becomes available than reacting to the outcome.”

Okay. That makes sense. Jake also gave a couple examples that, at first, smacked me into “Wait, what?” land again but I took a deep breath and re-read it … and felt I had a loose grasp on it all. I’ve been there before, though, and those cookies of understanding can crumble and fall through your grasp right before your eyes.

Sadly, Jake soon lost me.

As it turns out, Promises are pretty freakin’ easy … until people try explaining them. Eventually, I found two resources that changed all that. The two most valuable things I dug up:

Don’t just read the chosen answers at each. I read through all the comments and answers and replies and they are both a treasure trove.

Even just the titles made me feel much better about my confusion.

Aviv Cohn, the author of Aren’t Promises Just Callbacks? gave this ridiculously short and simple example of a callback in his question and all of a sudden I finally understood callbacks so I made some real progress even before reading the replies! Cohn then explains promises in his own words,

“A Promise is an object that represents a value which might not yet exist [emphasis mine]. You can set callbacks on it, which will be invoked when the value is ready to be read.”

That sentence immediately made sweet love to things Jake had said and gave birth to some more understanding on my part. Cohn then gave the promise version of his callback code and concluded by asking,

Is there actually a real difference? The difference seems to be purely syntactical.

The first response started to beat the crap out of me with,

“Yes: callbacks are just [insert crap I couldn’t understand]. Promises are [more crap I didn’t understand] … a composable mechanism to chain operations on values.” [emphasis mine]

Oh. Okay. I don’t know what “composable” means (more on that later) but the rest of the bit in italics made more stuff click into place. Another answer blessed me with,

It is fair to say promises are just syntactic sugar. Everything you can do with promises you can do with callbacks … The deep reason why promises are often better is that they’re more composeable, which roughly means that combining multiple promises “just works,” while combining multiple callbacks often doesn’t.

First of all, thank you for defining “composeable.” Second, this person then goes on to drop a whole bunch of other magical wisdom explaining the difference and how they work and why — in freaking English — including working with values and errors and this:

For the example you gave of a single callback versus a single promise, it’s true there’s no significant difference. It’s when you have a zillion callbacks versus a zillion promises that the promise-based code tends to look much nicer.
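
To make the “zillion callbacks versus a zillion promises” point concrete, here’s a minimal sketch with just three steps. It is not Cohn’s code; `doubleLaterCb`, `doubleLater`, `viaCallbacks`, and `viaPromises` are hypothetical helpers, and `setTimeout` stands in for a slow request.

```javascript
// Hypothetical async helpers; setTimeout stands in for a slow request.
function doubleLaterCb(n, cb) {
  setTimeout(() => cb(null, n * 2), 10);
}

function doubleLater(n) {
  return new Promise((resolve) => setTimeout(() => resolve(n * 2), 10));
}

// Callback style: every extra step nests one level deeper,
// and every level has to check for errors itself.
function viaCallbacks(start, done) {
  doubleLaterCb(start, (err, a) => {
    if (err) return done(err);
    doubleLaterCb(a, (err2, b) => {
      if (err2) return done(err2);
      doubleLaterCb(b, done);
    });
  });
}

// Promise style: each .then() returns a new promise, so the steps chain
// flat, and a single .catch() at the end would cover all of them.
function viaPromises(start) {
  return doubleLater(start).then(doubleLater).then(doubleLater);
}

viaCallbacks(1, (err, result) => console.log('callbacks:', result));
viaPromises(1).then((result) => console.log('promises:', result));
```

Both versions double 1 three times and end up at 8. With one step there’s barely a difference; by step three the callback version is already drifting to the right, which is the “composable” thing in action.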

Lost My iShizzle

The iMac has been freezing lots again. So I updated El Capitan — not to Sierra, just updated the Capitan. Googling Sierra tells me there’s lots of freezing for that, too. Perhaps I should have stayed with Mavericks. Man … I used to be such an early adopter. All through the first several “cat” years of OSX.

Updating from .11 to .12 didn’t seem to work. Perhaps because the machine is so glitchy.

So I referred back to my Adobe CC and El Capitan post hoping those steps would help as much as they did almost a year ago. If memory serves, this nonsense also happened last time I was about to start some contract work.

Not only did it not work, but now the iMac dies during every attempt at rebooting. The progress bar hangs at about 50% for a bit, then the machine completely dies.

Trying First Aid in Recovery Mode

There are two “things” on the left (the second thing is an “image”) and each has a sub-thing. Three of them pass First Aid just fine but the first “sub-thing” (Macintosh HD) fails.

I tried using the Terminal in Recovery Mode to Repair Permissions but Ye Olde Terminal tells me there’s no such command as “sudo.” This worked like a charm last time. Is it because I updated? Or is the machine broken? As an experiment, I tried another command (I can’t remember now) which it also didn’t recognize. It did recognize “ls.”

One post I read said one could Reinstall OSX in recovery mode and it wouldn’t delete your files and such (I should really start using the external HD I have that sits next to the iMac) but when I tried that, it told me the disk was locked.

I found a couple great, in-depth posts somewhere and tried many of the suggestions with no luck.

No Genius Bar Love

Setting an appointment online is impossible. Remember when apple.com was considered one of the best-designed sites on the web? The most (but definitely not the only) infuriating thing about trying to set an appointment using their website is, upon clicking the desired time, I receive an error alert stating, “You haven’t entered a valid email address.” There is NO form or field for entering an email address.

Setting an appointment via phone is impossible. I call the store. Of course the voice recognition system doesn’t recognize anything I say. Eventually, it transfers me to Apple Care and starts asking me more questions about unrelated crap and doesn’t understand my answers.

Setting an appointment in person is stupid. You can’t drop off your computer and have them diagnose and/or repair it. You have to make an appointment to drop it off. You have to drive to the mall, make the appointment and then, later, go back to your car or back home, and get your computer.

It’s the Economy, Stupid

All these people who are unemployed. But businesses like Apple can’t afford to hire any humans to answer the phone in their stores? Walmart can’t find anyone qualified to operate more than three cash registers at a time?

Buy Local

Which brings me to the new computer repair shop on the corner … which brings me to my next post …

Gratitude

Fortunately, my “stupid day job” that I constantly complain about provides me with a laptop loaded with Adobe CC, etc. so I can still do my side job … every day I find another reason I should really be grateful for my “stupid day job” but I just can’t. Something’s wrong with me.

File Uploading Madness

The web app on which I’m working will include (sorry for the passive voice) a page for the user to upload an “observation.” As of last week, this small form included:

  • Two menus (Case, Action) from which the user selects … you could think of them as categories or tags
  • A text field to contain a description of the observation
  • Radio buttons that basically state if there are pictures available to support the text that aren’t uploaded (I thought the initial version would be just for text)
  • A submit button

insertObserveSevereCropNoFileInput.png

I came across a tutorial or two showing how “easy” it was to incorporate a file upload button. So I wrote that code and proceeded to test:

  1. It didn’t work
  2. I modified the code
  3. I tested again
  4. Lather, rinse, repeat

After spending hours removing everything (security, etc.) except the bare-bones script to upload the file, and re-testing the form over and over again, I suddenly noticed the Case menu was missing.

So I thought I’d spent hours trying to fix something that wasn’t broken (which happens frequently) because of something I’d never expect. Of course the form wasn’t working: one of the fields wasn’t contributing anything to the query.

Bit o’ Sidebar: Something you need to know is I often struggle with Dreamweaver’s FTP — I often have to close and reopen Dreamweaver to get it working again or use my host’s browser-based upload thingy. But if I’ve closed Dreamweaver and have to reopen it to do some editing so the FTP is “fresh” I’ll use that.

Now my pattern became removing or moving or rewriting the file-upload code, uploading the file to see if it made a difference, and so on … based on a couple tutorials I’d found, I thought it was Bootstrap causing this either/or situation: I could either see the Case menu or have the file-upload button, but not both.

Eventually, I realized something even weirder … when I uploaded my PHP file using Dreamweaver, the menu wouldn’t display. When I uploaded it via the browser, it finally did. Unfortunately for me, just by coincidence (sort of), whenever I took the file-upload field out, I was using the browser to upload my PHP file, so the Case menu would appear. When I put the code back in, I was using Dreamweaver, so the menu vanished. There was a correlation, but the cause wasn’t the code. The cause was, somehow, Dreamweaver, although I have no explanation for that. Seems stupid but so did the other problem or two.

I still can’t get the file upload to work but at least I can see all the pieces that may or may not be broken.

I kept thinking I should just build a page that just uploads a file with nothing else to see if it would work but I kept thinking, “No, this is a simple problem, any second now, I’ll find the solution … building something from scratch would waste too much time.” Meanwhile, I waste hours … this happens far too often.

So now I’m taking the four best tutorials and I’m going to create each of those things from scratch with none of my code-baggage and see what happens.

Close To Wit’s End

I should know better than to try editing/testing code on this particular computer/network. Stuff that works anywhere else doesn’t work here. So, it might be that.

But this new script is so simple.

But, if it works, then I make a change and that doesn’t work so I undo the change … then the previous working thing doesn’t work anymore.

I never know what’s really working and what isn’t.

This happens ALL the time. It will work fine … then it won’t.

Then I questioned whether it ever functioned correctly so maybe I’ve been basing my recent work on incorrect information so I have NO F**KING idea what I’m doing!

How can I f**king learn if I never know what actually works and what doesn’t? What was actually working or not?

DAMMIT!

Update: I did a “Save As” to change the js filename from 02 to 03, updated that in the html, then when 03 didn’t work … I, um … didn’t change the reference in the HTML file back to 02 … and, that’s … um … Windows sucks!

Holy Crap. Am I in the Overlook Maze?

It’s been so long since I switched this project from Python to JavaScript, I’d forgotten why.

On Oct 24, 2016 — over three months ago — I gave up on doing it in Python because–wait for it–I couldn’t get the Track Popularity to work!

I wanted to … have a list of albums showing their popularity and the songs showing their popularity but, for some reason, I can’t get my Python code to get the track objects containing the track popularity. I can do it if that function is its own thing but as soon as I make it part of a loop it won’t work anymore.

The exact same problem, as it turns out, that I’m having with JavaScript!

That tells me something … and I wish I knew what that something was.

[insert vulgar exclamation here]

I thought what was an elusive bug in Python, because of my inexperience, would be obvious and easy in JavaScript.

Ugh! What am I missing?!

Maybe I should rethink how and why I’m doing this … is there an easier better way? Even if there is, I want to solve this problem! I have to. There has GOT to be a way to do this. I simply can’t believe that isn’t true.

Update: Okay, I’ve identified the actual last working script, which isn’t the one(s) I’ve been trying to “fix.” It wasn’t v4 or v5 of either … never mind … it’s pseudo_10d (github). Maybe those others were based on it at some point. I honestly can’t remember.

But it only logs everything to the console. It works, but it’s just logs. My goal is to get it all into a neat little object. The reason for that is I think that’s the easiest and best way to then put all this data into my MySQL database. Is that the easiest and best way? Here’s the deal … here’s what I’m doing …

  1. Get JSON from Spotify
    • Artist Info
    • Artist’s Albums
    • Several Albums
    • Several Tracks
  2. Get data from those JSON requests and organize it in a JavaScript object that looks like the JSON Spotify sends (but I have to make at least two “several album” requests and multiple “several tracks” requests to get each piece of data I want because some requests get small album or track objects which I then use to request large album or track objects … if Spotify included popularity in the default small album and track objects NONE of this would be an issue and I’d have been done months ago).
  3. Put that data into a database
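
Steps 1 and 2 can be sketched like this. To be loud about the assumptions: this is not pseudo_10d, and `getArtist`, `getArtistAlbumIds`, `getSeveralAlbums`, and `getSeveralTracks` are hypothetical stand-ins that return fake data instead of calling Spotify, so the sketch runs with no network or token. The batch sizes are arbitrary. The part that matters is the shape: requests made inside a loop each return a promise, and `Promise.all` waits for every one of them before the object gets assembled.

```javascript
// Stand-ins for the Spotify requests: hypothetical names and fake data.
const fakeTracks = {
  t1: { id: 't1', name: 'Track One', popularity: 40 },
  t2: { id: 't2', name: 'Track Two', popularity: 55 },
  t3: { id: 't3', name: 'Track Three', popularity: 12 },
};
const fakeAlbums = {
  a1: { id: 'a1', name: 'Album One', popularity: 60, trackIds: ['t1', 't2'] },
  a2: { id: 'a2', name: 'Album Two', popularity: 30, trackIds: ['t3'] },
};

const getArtist = (id) => Promise.resolve({ id, name: 'Some Artist' });
const getArtistAlbumIds = (id) => Promise.resolve(['a1', 'a2']);
const getSeveralAlbums = (ids) => Promise.resolve(ids.map((i) => fakeAlbums[i]));
const getSeveralTracks = (ids) => Promise.resolve(ids.map((i) => fakeTracks[i]));

// Split a long id list into batches (the batch sizes used below are arbitrary).
function chunk(list, size) {
  const out = [];
  for (let i = 0; i < list.length; i += size) out.push(list.slice(i, i + size));
  return out;
}

async function buildArtist(artistId) {
  const artist = await getArtist(artistId);
  const albumIds = await getArtistAlbumIds(artistId);

  // One request per batch; Promise.all waits for every batch to come back.
  const albumBatches = await Promise.all(chunk(albumIds, 1).map(getSeveralAlbums));
  const albums = albumBatches.flat();

  // The "loop": each album gets its own promise, and nothing is read
  // from the results until all of them have resolved.
  artist.albums = await Promise.all(
    albums.map(async (album) => {
      const trackBatches = await Promise.all(chunk(album.trackIds, 2).map(getSeveralTracks));
      return {
        id: album.id,
        name: album.name,
        popularity: album.popularity,
        tracks: trackBatches.flat(),
      };
    })
  );
  return artist;
}

buildArtist('artist-id').then((artist) => console.log(JSON.stringify(artist, null, 2)));
```

The object that comes out has albums with their popularity and, nested inside each, tracks with theirs, which is the “neat little object” that step 3 would then write to the database.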

What I’m wondering is if I need to have three steps … maybe I don’t have to store and organize any of it in an object … maybe I can just shove it straight into the database. I’m sure that’s possible … certainly not elegant … and, after I initially build all the tables for artists, albums, and tracks, I don’t need to worry about it, but … dammit, I want this tiny little problem solved before I move forward with those other ideas.

TL;DR even if it is possible to do this more easily, I want to solve this problem.

Now I’m wrestling with … do I start trying to make 10d build the object? Or did I already do that and that’s how I made the messes I’ve been drowning in?

Wait, yeah … okay … here’s my hesitation … all along … and maybe it’s unfounded … I can only base decisions on what I know … perhaps I should have tried what I’m about to say a long time ago …

I keep being, I think, too judgmental and hard on myself about my code … it’s not elegant enough … others will say, “What the f**k did you do it THIS way for?” and then laugh me out of the building back into the newbie ghetto.

Screw it. I need it to work. I’m sure that’s what they do at real jobs, right? I just have this idealized image of how people work and think at “real jobs.” I’m so used to working with lazy, apathetic … I don’t want to be seen like that … as someone who says/thinks “this is good enough” and doesn’t strive for excellence and awesomeness. When I finally get a coding interview, I want them to say, “Holy f**k! We’d better grab this guy before somebody else does!” instead of … making me feel stupid for even trying.