PDFtk on CentOS7

Months ago, I tested it with positive results on Mac OS X and/or Ubuntu 14.04, so now I’m gonna test it on the new CentOS7. If it doesn’t work, I’ll wipe the VPS and install either CentOS6 or Ubuntu 16.04. First, I have to remember how I used the templates I created to test it. Ugh.

Watching The Defenders on Netflix, btw. I wanna like Iron Fist so bad. Loved the comics. Iron Fist, that is, not the Defenders. Except for Defenders: Indefensible.

So, testing process is:

  1. Get form field info using dump_data_fields
  2. Export fdf from a filled-in form using generate_fdf
  3. Import that fdf data to fill in an empty version of that same form via fill_form

I have a PDF form that, filled out, looks like this:

AlbumForm

So, first, I need to get the names of the form fields.

pdftk AlbumFormEmpty.pdf dump_data_fields >AlbumForm.txt

That was successful. Yay. Result looks like this:

textOutputDump
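If you want those field names in a script rather than a text file you squint at, here is a minimal sketch of a parser for that dump. It assumes the usual layout of `dump_data_fields` output (a `---` line between fields, then `Key: value` lines). The sample field names are made up for illustration and are not the real AlbumForm fields.

```javascript
// Parse the text written by `pdftk ... dump_data_fields` into an array of
// plain objects, one per form field.
function parseDumpDataFields(text) {
  var fields = [];
  var current = null;
  text.split(/\r?\n/).forEach(function (line) {
    if (line === "---") {
      // A line of three dashes starts a new field record.
      current = {};
      fields.push(current);
      return;
    }
    var sep = line.indexOf(": ");
    if (current === null || sep === -1) {
      return; // skip blank or unexpected lines
    }
    var key = line.slice(0, sep);
    var value = line.slice(sep + 2);
    // Keys such as FieldStateOption can repeat, so collect repeats in an array.
    if (Object.prototype.hasOwnProperty.call(current, key)) {
      current[key] = [].concat(current[key], value);
    } else {
      current[key] = value;
    }
  });
  return fields;
}

// Hypothetical sample input; real output depends on the form.
var sample = [
  "---",
  "FieldType: Text",
  "FieldName: AlbumTitle",
  "FieldFlags: 0",
  "---",
  "FieldType: Button",
  "FieldName: Format",
  "FieldStateOption: Off",
  "FieldStateOption: Vinyl"
].join("\n");

console.log(parseDumpDataFields(sample).map(function (f) {
  return f.FieldName;
}));
```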

Now, I’ll export the data from the filled-in form’s content — not normally a part of this process, but it’ll quickly get me the data and other shizzle I need for testing.

pdftk AlbumForm.pdf generate_fdf output AlbumData.fdf

Cool cool cool, that worked fine. It’s hideous, but I have it.

Later, I will/would write a PHP script to create such an fdf file out of data from an html form but for now, I’ll put that fdf back into a PDF form:

pdftk AlbumFormEmpty.pdf fill_form AlbumData.fdf output filled.pdf

Perfect. Awesome.
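For the "later" script that builds an fdf from form data, here is a rough sketch of the idea. It is written in JavaScript rather than the PHP mentioned above, and it is only a sketch: it assumes flat field names and plain ASCII values, and the field names in the example call are placeholders, not the real AlbumForm fields.

```javascript
// Escape the characters that are special inside a PDF literal string.
function escapeFdfString(value) {
  return String(value)
    .replace(/\\/g, "\\\\")
    .replace(/\(/g, "\\(")
    .replace(/\)/g, "\\)");
}

// Build a minimal FDF document from a { fieldName: value } object.
function buildFdf(data) {
  var entries = Object.keys(data).map(function (name) {
    return "<< /T (" + escapeFdfString(name) + ") /V (" +
      escapeFdfString(data[name]) + ") >>";
  });
  return [
    "%FDF-1.2",
    "1 0 obj",
    "<< /FDF << /Fields [",
    entries.join("\n"),
    "] >> >>",
    "endobj",
    "trailer",
    "<< /Root 1 0 R >>",
    "%%EOF",
    ""
  ].join("\n");
}

// Placeholder field names and values for illustration only.
console.log(buildFdf({ AlbumTitle: "Kind of Blue (Mono)", Artist: "Miles Davis" }));
```

Saving that output as AlbumData.fdf should let the same fill_form command pick it up, though I would compare it against a file produced by generate_fdf before trusting it.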

One of the places where I learned stuff:

https://www.sitepoint.com/filling-pdf-forms-pdftk-php/

Jay Pride Parade

I am so freaking proud of this dynamic PDF using templates with dynamic menus and what I’ll call “smart redaction” I just can’t stand it. I feel more creative, powerful, geeky, as well as downright handsome and sexy because of this I can’t even stand it. I’ve made several interactive PDFs that I thought made the world a significantly better place but those just took time and effort. I’m creative and knew how to do what I wanted. This project, however, required I learn a bunch and solve a whole list of problems other people all over the web were struggling with and some people said these things couldn’t be done. I am beating my chest and yelling like Tarzan right now.

The client had a list of what they wanted/hoped the PDF would do and each item was a challenge by itself, let alone getting all these “features” to work together and in a way that didn’t make the user cry. Lots of Googling, forums, documentation, taking pieces from multiple solutions and combining them, followed by lots of experimentation, trial-and-error, debugging and falling in love–deeply and passionately–with Acrobat’s JavaScript Console.

The client is a private investigator who has been assembling his reports in (God have mercy on his soul) MS Publisher. He asked for what sounded pretty simple: templates for PDF forms. It got complicated surprisingly quickly and, as I said, I found that others had faced these problems one at a time while I had them all together. Below are some of the challenges I’ll document along with their solutions in the next few posts.

The Client’s Hopes and Dreams

Which came true cuz I’m totally the Fairy Godmother of Adobe Acrobat

Templates in the Investigation Report include, among others, a Subject Background form and an Investigative Action form.

  • Each of those (and other) forms/templates needs to be used multiple times in the same report so, as you may know, the form fields on each spawned page need to have different names so they can hold unique values. Not a problem, right? Wrong.
  • There’s a main form I’ve called the Report Dashboard in which the client enters the pertinent subject and locations so they are available in “Subject” and “Location” menus in the various templates/forms. Challenge #1: Populating fields based on a menu choice is easy but populating a menu based on text fields was new to me. Still not too bad, though, right? Wrong. Challenge #2: The dashboard might be updated after those pages (containing menus with unique names) are spawned.
  • Other information on the dashboard appears on all pages’ footers such as Case Name, Case Number, Date, and Page Number. Challenge #3: Unique field names meant I needed to be creative (because the dashboard content such as date completed/submitted and such might not be entered until the end of the investigation) but that was (truly, this time) not bad. Challenge #4: User needs to be able to rearrange and remove pages at will. This affected several things including but not limited to the page numbers. While the client owns Acrobat Pro …
  • Oh, and all of this needs to be possible after the client has saved, closed, and reopened the file as well.
  • I still wanted the client to have as little burden as possible. I wanted a design and process that was elegant, easy and simple. If he had the time and inclination to learn and master Acrobat’s innards, he wouldn’t be paying me, right? Right.
  • Redaction. Never a fun topic for anyone, apparently. This client’s situation is particularly unique. Nobody wants to pay a private investigator thousands of dollars for a report that says, “I didn’t find anything useful” so my client’s customers want to see they’re going to get something for their money. If the investigator shows the fruit of their labor, however, their customer can then just run with that information to their lawyer (another expensive expense of the client’s customer) so they have no compelling reason to pay my client–at least not in a timely or easy manner. So, I need to creatively redact information while still showing there’s something substantial … in a way that doesn’t make me want to cry like others suffering under a requirement for redaction.
  • Security. Related to redaction but with other, more typical concerns like watermarks, preventing printing, and of course preventing changes … all without requiring my client to learn or do any more than they have to.
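To make Challenges #1 and #2 from the list above a bit more concrete, here is a minimal sketch of one way to refill every spawned menu from the dashboard's text fields. This is not the solution from the actual project. The field names (`subject1`, `SubjectMenu`, and so on) are hypothetical, and the sketch assumes spawned fields were renamed by Acrobat so their names end in `.SubjectMenu`. The `doc` argument is the Acrobat Doc object (`this` in a document-level script).

```javascript
// Collect the non-empty, trimmed values of the named dashboard text fields.
function collectValues(doc, fieldNames) {
  var values = [];
  for (var i = 0; i < fieldNames.length; i++) {
    var field = doc.getField(fieldNames[i]);
    var text = field ? String(field.value).replace(/^\s+|\s+$/g, "") : "";
    if (text !== "") {
      values.push(text);
    }
  }
  return values;
}

// Replace the items of every menu whose name ends in menuBaseName,
// including renamed copies on spawned pages, and keep the current choice
// when it is still in the list. Assumes items is an array of plain strings.
function refreshMenus(doc, menuBaseName, items) {
  var updated = 0;
  for (var i = 0; i < doc.numFields; i++) {
    var name = doc.getNthFieldName(i);
    var parts = name.split(".");
    if (parts[parts.length - 1] !== menuBaseName) {
      continue;
    }
    var menu = doc.getField(name);
    var previous = menu.value;
    menu.setItems(items);
    if (items.indexOf(previous) !== -1) {
      menu.value = previous;
    }
    updated++;
  }
  return updated;
}

// Possible use inside Acrobat (hypothetical field names):
// refreshMenus(this, "SubjectMenu", collectValues(this, ["subject1", "subject2"]));
```

Running something like this whenever a dashboard field changes would, in principle, also cover pages spawned before the update.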

This project kicked so much ass because I kicked its ass so thoroughly. My client suggested I license or sell this template/report to thousands and thousands of investigators like him who share his struggles and I’ll do what I can to profit further from this effort but, while he was saying that, all I was thinking was I couldn’t wait to share everything I learned because, you know, OPEN SOURCE, BITCHES!

Generate Multiple Certificates Using a PDF Template and JavaScript

Just did something at work I thought was pretty cool so I whipped out this crude tutorial.

TL;DR

Using JavaScript in a PDF to quickly and easily generate multiple (20-30) course completion certificates each class.

Pre-Requisites

You know how to create form fields in Acrobat.

The Problem:

I designed a certificate template for a teacher to distribute to her students. She loved it but then, basically, asked, “So do I just have you make a bunch of these and type their names in them so I can send them to the printer each time I have a class?”

No. The answer was most definitely no. That wasn’t going to happen. I didn’t want to simply tell her no, however, and seem like an uncooperative team player (which is the only possible outcome of me stating that was her job, problem, circus, monkeys, etc.).

Initial Idea:

Fortunately, I know Acrobat has a good relationship with JavaScript. My first idea was to tell her she could import a text file containing a list of recipients and click a button to generate the required certs with their names and an automatically generated date. I would create the dynamic form fields, button, and JavaScript.

First Obstacle:

That method requires the text file to be tab-delimited requiring her to know enough Excel for that. Thus, importing the list is unacceptable because that means telling her she has to do something and learn something. Customers don’t like doing and learning things.

Second Idea:

Creating an Acrobat form in which she could type or paste a list of names. Theoretically, I thought, the button she’d then click would convert that list into an array and puke out a bunch of certificates. This required some thought and research on my part but I figured:

  • the idea of typing/pasting a list in a single field would be much faster and more reasonable to ask for than separate fields for each name
  • One field is prettier than a bunch of fields
  • Number of students varies so there would either be too few fields or I’d have to figure out an elegant way to dynamically create more as needed and, oh geez, it’s already a pain in the butt.
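The "convert that list into an array" part of this idea could look something like the sketch below. It assumes one name per line in a multi-line text field and splits only on line breaks, so a name like "Lovelace, Ada" stays in one piece. The function name is made up for illustration.

```javascript
// Turn pasted text (one name per line) into a clean array of names.
function parseNameList(rawText) {
  var lines = String(rawText).split(/\r\n|\r|\n/);
  var names = [];
  for (var i = 0; i < lines.length; i++) {
    // Trim the ends and collapse runs of whitespace inside the name.
    var name = lines[i].replace(/^\s+|\s+$/g, "").replace(/\s+/g, " ");
    if (name !== "") {
      names.push(name);
    }
  }
  return names;
}

console.log(parseNameList("Ada Lovelace\r  Grace   Hopper \r\rAlan Turing"));
```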

Final Idea:

While I mulled over how to do that in a way that didn’t suck, I put together a prototype for a single name so I could hammer out the process and code. As it turns out, the prototype worked far faster and better than the other solutions. Check it out…

Step #1: Create the Certificate

Basic “background” created in Photoshop. I had the Director write his signature large on a blank sheet of copier paper with a sharpie, scanned it, and changed that layer’s Blend Mode to Multiply. You’re welcome.

forTute_01.png

Saved the PSD as a PDF named GreatCertGenerator.pdf and, in Acrobat, added two text fields that would populate dynamically: recipientName and date.

Step #2: Create the Form

Just laid this out quickly in Photoshop.

forTute_02.png

I also created a button, which you’ll see in a moment, in a separate file. I’ll explain why the button is separate very soon. Saved it as CertForm.pdf and imported it into GreatCertGenerator.pdf using the following lame, convoluted method (Acrobat, like MS Office, becomes more of a pain in the ass every single version):

  1. Click the Tools tab (Grr … it makes me angry even just writing it).
  2. Click Organize Pages (another step that didn’t used to be there!).
  3. Choose Insert > From File and navigate to CertForm.pdf, select Before Page 1 and click OK.
  4. Added two standard input text fields: singleName and myDate.

Originally, the form had only the Name field—the date was generated automatically on the certificate using the following code on page 1 (so it would happen as soon as the document opened):

var x = this.getField("date");
x.value = util.printd("mmmm dd, yyyy", new Date());

However, I wanted to give the user the option to type in their own date if they’re reprinting an old cert or if they’re not creating these the day of the class so I created the myDate field in the form to “receive” the above code instead and, in the first line, changed date to myDate. The final version of the code in page 1’s Page Properties > Actions > Run a JavaScript looks like this:

var x = this.getField("myDate");
x.value = util.printd("mmmm dd, yyyy", new Date());
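One thing worth checking with this setup: if the page-open action runs again whenever the user comes back to page 1, it would replace a hand-typed date with today's. A minimal sketch of a guard, under that assumption, only fills the field when it is empty. The plain-JavaScript formatter is there so the sketch runs outside Acrobat; inside Acrobat, util.printd with the same "mmmm dd, yyyy" pattern would do the formatting. The function names are mine, not part of Acrobat.

```javascript
var MONTHS = ["January", "February", "March", "April", "May", "June", "July",
  "August", "September", "October", "November", "December"];

// Format a Date like util.printd("mmmm dd, yyyy", d), e.g. "August 05, 2017".
function formatLongDate(d) {
  var day = d.getDate();
  return MONTHS[d.getMonth()] + " " + (day < 10 ? "0" + day : String(day)) +
    ", " + d.getFullYear();
}

// Fill the named field with the date only if the user has not typed one.
// Returns true when the field was filled, false when it was left alone.
function setDefaultDate(doc, fieldName, now) {
  var field = doc.getField(fieldName);
  if (String(field.value) === "") {
    field.value = formatLongDate(now);
    return true;
  }
  return false;
}

// Possible use in page 1's page-open action inside Acrobat:
// setDefaultDate(this, "myDate", new Date());
```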

inContext.png
Both pages visible–before making page 2 a template and hiding it–and the green AddGrad button created in Step #4.

Step #3: Make the Certificate a Template

  1. On the certificate page (page 2 for me), click the Tools tab.
  2. Click Organize Pages.
  3. Choose More > Page Templates.
  4. Type a Name for the template (mine is “EmptyCert”).
  5. Click Add and click Yes when asked to confirm using the current page. EmptyCert is now listed among the Page Templates.
  6. Poke EmptyCert in the eye to toggle visibility. The page disappears from Organize Pages as well as the Page Thumbnails panel. This makes the process all magic and mystery for the users.

AddPageTemplate.png

Step #4 Create the Magic Button

Because I’m assuming you already know how to create forms in Acrobat and I don’t want to type the lame, convoluted way you have to do it now because I’ll start ranting, I’ll just show you how I made my button prettier than the average button and added the JavaScript that generates each certificate with the name and date. First, I created the button in Photoshop and saved it as a PDF (see Step #2).

For_Butt.png

In Button Properties, under the Options tab (not the Appearance tab):

  1. I made the text in Photoshop, so I chose Icon Only for Layout.
  2. I chose Push for Behavior. You could, instead, choose None and use separate images for each State in the Icon and Label section.
  3. Click Choose Icon and navigate to your button (which must be saved as a PDF).
  4. Under the Actions tab, the Trigger is Mouse Up and the Action is Run a JavaScript (see code below).
  5. Click Close.

Here’s the code and explanations a piece at a time:

Identify the template.

var a = this.getTemplate("EmptyCert");

On the template, we’ve got a couple form fields waiting as placeholders.

Fill recipientName on the template with the contents of singleName in the form.

getField("recipientName").value = getField("singleName").valueAsString;

Fill date on the template with the contents of myDate in the form.

getField("date").value = getField("myDate").valueAsString;

Create (“spawn”) a new page based on that template.

a.spawn();
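Pulled together, those four lines can be wrapped in a function that also handles more than one name per click. This is a sketch, not the script from the finished PDF. It assumes the template and field names used above, passes explicit arguments to spawn (append after the last page, rename the fields, do not overlay) instead of relying on defaults, and takes the Doc object as `doc` so it can be tried outside Acrobat with a stand-in object.

```javascript
// Spawn one certificate page per name. Returns how many pages were spawned.
function spawnCertificates(doc, names, dateText) {
  var template = doc.getTemplate("EmptyCert");
  if (!template) {
    return 0; // template missing or misnamed
  }
  for (var i = 0; i < names.length; i++) {
    // Fill the placeholder fields on the (hidden) template first.
    doc.getField("recipientName").value = names[i];
    doc.getField("date").value = dateText;
    // nPage = doc.numPages appends at the end; bRename = true gives each
    // spawned page its own field names so earlier certs keep their values;
    // bOverlay = false inserts a new page instead of drawing over one.
    template.spawn(doc.numPages, true, false);
  }
  return names.length;
}

// Possible use in the button's Mouse Up script inside Acrobat:
// spawnCertificates(this,
//   [this.getField("singleName").valueAsString],
//   this.getField("myDate").valueAsString);
```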

Step #5 Creating the Over-the-Top Awesomeness

New Problem: When the user clicks Add Grad, they’re jumped to the new page/certificate. I hated the fact that the user must then stop, grab the mouse, click the first page in the Thumbnails panel to return to the form, click the Name field, then move their hands back to the keyboard to type the next name. Life is way, way too freakin’ short for that nonsense.

On page 2 (the cert), I created an invisible button called Return. It’s invisible because the user never needs to click it so they don’t need to even know where it is. On MouseUp, the Action is Go to Page 1. Behold, this line of JavaScript in Page 1’s Page Properties:

getField("singleName").setFocus();

So the PDF works like this at ludicrous speed: when the user opens the doc, the Name field is autofocused and ready.

  1. User types the Name, presses Tab and Enter to generate the cert
  2. New cert opens—user presses Tab and Enter to return to the form (GoTo step 1)

Lather, rinse, repeat (really, really fast).
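A possible variation, which I have not verified in Acrobat: instead of the invisible Return button, the Add Grad script itself could jump back to the form right after spawning. The sketch below assumes the form is the first page (index 0, since Acrobat counts pages from zero) and also clears the name so the next one starts from a blank field. The function name is hypothetical.

```javascript
// Clear the given field, go back to the first page, and put the cursor in it.
function returnToForm(doc, fieldName) {
  var field = doc.getField(fieldName);
  field.value = "";   // ready for the next name
  doc.pageNum = 0;    // zero-based index of the form page
  field.setFocus();   // keyboard focus back in the Name field
}

// Possible use at the end of the button script inside Acrobat:
// returnToForm(this, "singleName");
```

If that behaves, the loop shrinks to: type the name, press Tab and Enter, type the next name.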

GloriousCerts.png

After creating all the certs they need, the user saves the PDF and emails it to their printer.

Cheat Sheet Pet Peeve

You know what I hate almost (but, really, nowhere near) as much as criminals who aren’t moral and trustworthy enough to be child-molesters so they pursue calling PDFs “eBooks” as their super-villain avocation?

People who claim to provide “cheat sheets” that aren’t really cheat sheets, that’s who. Ever seen those cool cheat sheet pages in the Dummies books that are even perforated for easy removal? I thought the so-called Python for Data Science for Dummies “cheat sheet” would be like that.

Let me count the ways it’s not a cheat sheet:

  1. It’s not print-friendly.

That’s the only reason I need. I’ve clicked on far too many links lately being suckered into thinking I’m going to have a groovy, printable cheat sheet for Linux commands or some other topic and end up with the above or worse.

Let us count the ways a cheat sheet is a cheat sheet:

  1. Something you can conveniently have on hand (hence perforated) like, you know, a little sheet of paper up your sleeve during an exam.
  2. Nice page layout for quick & easy reference

That Dummies monstrosity has neither property. You know what non-HTML file format has both of those properties in, like, spades? PDF.

This Conda Cheat Sheet (PDF) is a proper cheat sheet and even a dummy should be able to tell the difference.

Update Dec 15, 2015:

I find it pretty darn ironic that Dummies has the worst so-called “cheat sheets” given their books’ cheat sheets are so awesome. While searching for a perfect reference for Linux commands, I found their equally ugly Common Linux Commands page which, along with being hideous and useless, has auto-play videos.

PDF Pet Peeve

I love Acrobat and PDF. It was my favorite class to teach when I taught full-time. Fully-featured PDFs are a joy and a sign of a truly civilized culture. The only thing that makes me sadder than PDFs without bookmarks is when savage, barbaric, lying, incompetent, sleazy, stupid, booger-eating butt-faces call PDFs “ebooks.”

A PDF is not an eBook.

Please write your elected officials and request they sponsor or support legislation making it legal to punch people in the neck when they call a PDF an eBook.

Until we make this crime against humanity punishable by law, help me build a grass-roots movement through either:

  • Non-violent protesting outside the offices of such evil-doers, marches, letters to the editor, or any effort no matter how small
  • By any means necessary, punching people in the neck or otherwise creating consequences for such irresponsible and cruel behavior

Edgar Allan Poe Interactive Family Tree

Another interactive PDF created with Adobe Illustrator and enhanced in Acrobat with links, Show/Hide fields, buttons, and layers.

I need to see relationships. I like patterns. In addition to the bookmarks and layers panels, the names at the top as well as the symbols beneath them open many different views.

Interactive Edgar Allan Poe family tree

Like two of the interactive Poe city maps, much of the content also applies to Susan Archer Talley Weiss.

Most features are simple tool tips and pop-ups:

FamilyTree_demo_00

The Connect symbol offers a list for things like seeing how Edgar Allan Poe was related to his cousin/wife. The example below shows how Poe was, in fact, related (albeit distantly) to Ms. Talley Weiss.

Edgar Allan Poe interactive family tree showing relationship between Poe and Susan Archer Talley Weiss.

Maps from the Vault

These are PDFs, not web pages, but I am nonetheless proud of the concept and final product of these interactive maps. Beginning with maps of the Richmond, Boston, Philadelphia, and New York of Poe’s time from Edgar Allan Poe, the Man by Mary E. Phillips (didn’t get enough of Baltimore completed to bother showing), I created vector versions in Illustrator and exported them to PDF so users could click dozens of locations to see pictures and read additional information.

I like to think they demonstrate my commitment to accuracy, comprehensive information, detail, ease-of-use, and beautiful, meaningful data visualization.

Created with:

  • Adobe Photoshop (assembling maps from different sources, cleaning originals)
  • Adobe Illustrator (Creating cleanest, vector final versions with layers)
  • Adobe Acrobat (adding interactivity)

I can’t wait to make stuff like this exclusively with code.

Richmond, VA

Interactive map of Edgar Allan Poe's Richmond.

They’re all filled with hyperlinks to web sites as well as geographical features and information I added from my own research. I kept some of the original scanned maps–mostly the cool title banners as you see in the upper-right of the image above.

I’d only recently discovered PDF layers and was deliriously happy because they’re so much more efficient than Show/Hide Fields.

RichmondDeluxe1-0_02

Opening the layers panel and clicking a visibility toggle displays, for example, commutes like this:

Map of Edgar Allan Poe's Richmond showing walk from Duncan Lodge to Elmira Shelton's house.
Walking from Duncan Lodge to Elmira’s house.

I started making these maps for my own reference while writing about Poe so I could know what he saw and when he saw it and other day-to-day stuff so commutes were really important to me.

Interactive map of Edgar Allan Poe's Richmond showing Susan Archer Talley's home, Talavera.

If I couldn’t find pictures in the Public Domain, I made sure to alter them enough to make them my own. Regardless, I cited sources. The following includes a combination of images from two different sources that served quite well to complement each other.

Interactive map of Edgar Allan Poe's Richmond showing possible death place of his mother, Eliza Poe.

Drawings and close-up callouts were another piece of the original maps I kept.

RichmondDeluxe1-0_should_be_05

Philadelphia, PA

In addition to layers like the Richmond map, I’m very proud of the UX/UI in this one. There’s the basic, opening view:

Interactive map of Edgar Allan Poe's Philadelphia initial view

The initial view lists landmarks by number in the legend. The user finds them by displaying numbers on the map. Zooming in prevents numbers on the map piling on top of each other.

Interactive map of Edgar Allan Poe's Philadelphia displaying numbers in legend for Landmarks.

User can switch the legend from numbers to grid coordinates aided by horizontal and vertical grid markers:

Interactive map of Edgar Allan Poe's Philadelphia using a grid of coordinates.

There are also layers highlighting items such as his home and work locations:

Interactive map of Edgar Allan Poe's Philadelphia showing Poe's home and work locations.

New York, NY

The largest and most detailed, New York requires full dimensions of 10,032 x 3221 to read the street names and locations. Even this 2508 x 805 version doesn’t allow you to see street names clearly. There’s just so much Poe in New York. This version still doesn’t include “the High Bridge” north of Harlem, Blackwell’s Island, or Queens (which is more for Susan Archer Talley Weiss–a Poe supporting character but the main protagonist of a different project).

Interactive map of Edgar Allan Poe's New York in progress.

Because he spent not one but two periods of his life there over several “sub-periods,” this map will also include timelines which I’m very excited about.

Boston, MA

Building Boston was more difficult because the map I started with (not from The Man) only showed a small portion of Boston and lacked many key Poe landmarks. So, in Photoshop, I layered it with another couple of maps from closer to Poe’s time as well as a Google Map.

Creating an interactive map of Edgar Allan Poe's Boston.

After completing the boundaries of the … is that a peninsula? … and surrounding land (I couldn’t neglect to include the nearby military base given its connection to Poe), I had to remove irrelevant or inaccurate content.

Trimming modern elements from an interactive map of Edgar Allan Poe's Boston.

Here it is, closer to completion with all the landmarks I’d gathered listed at the top.

In progress interactive map of Edgar Allan Poe's Boston.