When we upgraded our student information system (SIS), I was charged with finding a way to retrieve all of the existing historical student transcripts from the old system and put them into a usable format. Using the proprietary menu, I ran several batch transcript exports (by graduation year) and discovered that the system outputs the files in raw PCL format.
I found a tool called GhostPCL to convert these files to PDF. This is a simple enough operation in either Linux or Windows.
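As a rough sketch of that conversion step, a typical GhostPCL call can be wrapped in a small shell function. The function name `pcl_to_pdf`, the `GPCL_BIN` variable, and the file name in the usage comment are illustrative assumptions; the executable is often called `gpcl6` on Linux and `gpcl6win64.exe` on Windows, so adjust it to match your install.

```shell
#!/bin/sh
# Sketch only: convert one PCL file to PDF with GhostPCL.
# GPCL_BIN is a hypothetical variable naming the GhostPCL executable
# (assumed to be gpcl6 here; the name varies by platform and build).
GPCL_BIN=${GPCL_BIN:-gpcl6}

pcl_to_pdf() {
    in_file=$1
    # Default output name: the input name with its extension swapped for .pdf
    out_file=${2:-${in_file%.*}.pdf}
    "$GPCL_BIN" -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile="$out_file" "$in_file"
}

# Usage (hypothetical file name):
#   pcl_to_pdf grads2005.pcl
```

Wrapping the call in a function makes it easy to loop over one export per graduation year.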
Once this is complete, you have a PDF, but no bookmarks. Since a grad year could contain hundreds of students, I needed a way to get to a specific record without doing a PDF search. Incidentally, PDF searching worked fine; I just felt that it took too long. I wanted to add some bookmarks.
I did some searching and came up with JPdfBookmarks, which lets you insert bookmarks from a text file in a specific format. The only thing left to do was generate that text file. I took a closer look at the original PCL file and discovered that the student name appeared in the same place on every transcript. This, of course, is by design: the system uses a PCL code to place text at specific coordinates. Searching for a student name, I determined that the relevant coordinates for my document were 150×330. Using a simple grep for the PCL code that positions text at those coordinates:
grep p150x330 filename.pcl
I’m left with a list of names, in order.
At this point, all I needed to do was pipe the output of this command to a file and scrub the data to look like what JPdfBookmarks expected. The easiest way I could think of to do this was to use Notepad++ and Microsoft Excel. I piped the output to a file using Linux:
grep p150x330 filename.pcl > namelist.txt
The next step was to open that file in Notepad++ and remove the garbage to the left of the names with a search and replace. Once I had a clean file with just names, I headed over to Microsoft Excel to add the finishing touches. The simplest input file format for JPdfBookmarks requires the bookmark name (which in this case is the student name I already had), a forward slash, and the page number. Since I knew that the names were in the correct order, and that there was one transcript per page, I could simply append the slash and the page number using the CONCATENATE function:
=CONCATENATE(a1,"/",b1)
This assumes column A has the student names and column B has the page numbers. Column B is just an Autofill.
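For anyone who would rather stay in the shell, the Notepad++ and Excel steps can be approximated with sed and awk. This is a minimal sketch, not what I actually ran: the function name `make_bookmarks` is made up, and it assumes the "garbage" is everything up to and including the positioning code, which I am guessing ends in `Y` (as in `p150x330Y`). Check a few lines of your own grep output and change the marker if yours differs. The `-a` flag asks grep to treat the binary PCL file as text.

```shell
#!/bin/sh
# Sketch only: build a "Name/page" bookmark list from a PCL export.
# Assumptions: one transcript per page, names in page order, and each name
# directly follows the marker (default p150x330Y, a guess at the full code).
make_bookmarks() {
    pcl_file=$1
    marker=${2:-p150x330Y}
    grep -a "$marker" "$pcl_file" |     # keep only the lines carrying a name
        sed "s/^.*$marker//" |          # drop everything through the marker
        tr -d '\r' |                    # strip carriage returns, if any
        awk '{ print $0 "/" NR }'       # append "/" and the running page number
}

# Usage (hypothetical file names):
#   make_bookmarks filename.pcl > bookmarks.txt
```

The awk line number stands in for the Autofill column, so it is only correct as long as every page yields exactly one matching line.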
Now that I had the PDF files and the bookmark files, it was just a matter of merging them using the toolbar button in JPdfBookmarks.
Now I have fully bookmarked PDF files, and though I started out with PCL files, I didn’t print a single sheet of paper in the process.