Monday, May 19, 2014

Redistilling PDFs that are not portable by design

I hate it when I am forced to deal with documents that are portable in name only (yes, I am looking at you, Adobe). Every so often, I get PDF documents from a major organisation that can be viewed only in Adobe Acrobat. On OS X, this bloated application consumes 369 megabytes of precious SSD space (Preview consumes 29 megabytes and is nicer).

Anyway, back to the story: these documents cannot be saved in any other format on my machine. In fact, the only way to read these documents outside Acrobat without hackery is to print them out and scan them back in.

!Stupid!

So here is a recipe for saving these files in a portable way.
############ Adobe badness #############
# In your operating system, create a PostScript printer whose address is 127.0.0.1
# (raw socket / JetDirect protocol, which uses port 9100)
# Fake a PostScript printer using netcat
$ nc -l 127.0.0.1 9100 > printout.ps
# Print your PDF using Adobe Reader to the PostScript printer on 127.0.0.1
# Netcat will diligently dump the printout to printout.ps as a PostScript file
# The PostScript file carries an encrypted protection block, so Ghostscript utils refuse to convert it
$ ps2pdf printout.ps printout.pdf
This PostScript file was created from an encrypted PDF file.
Redistilling encrypted PDF is not permitted.
Error: /undefined in --eexec--
Operand stack:
--nostringval-- --dict:94/200(L)-- quit
Execution stack:
%interp_exit .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --nostringval-- --nostringval-- --nostringval-- false 1 %stopped_push 1894 1 3 %oparray_pop 1893 1 3 %oparray_pop 1877 1 3 %oparray_pop 1771 1 3 %oparray_pop --nostringval-- %errorexec_pop .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --nostringval-- 1762 2 3 %oparray_pop --nostringval-- --nostringval-- --nostringval--
Dictionary stack:
--dict:1163/1684(ro)(G)-- --dict:1/20(G)-- --dict:94/200(L)-- --dict:1163/1684(ro)(G)--
Current allocation mode is local
Last OS error: No such file or directory
GPL Ghostscript 9.07: Unrecoverable error, exit code 1
############ Portable goodness #############
# Fake a PostScript printer using netcat
$ nc -l 127.0.0.1 9100 > printout2.ps
# Print your PDF using Adobe Reader to the PostScript printer on 127.0.0.1
# Yank out the Adobe file protection gunk from the PostScript file generated by netcat
$ sed -e "/mark currentfile eexec/,/cleartomark/ d" printout2.ps > printout_clean2.ps
# Convert away!
$ ps2pdf printout_clean2.ps printout_clean2.pdf
# Enjoy your newfound portability!
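
To see what the sed range delete does without needing a protected file at hand, here is a minimal sketch. The sample file is made up for illustration: a real printout carries a much longer encrypted block, and I am assuming the two marker lines appear the way the sed pattern above expects them. The file names and the workdir variable are hypothetical as well.

```shell
# Work in a throwaway directory so nothing real gets overwritten.
workdir=$(mktemp -d)

# A tiny stand-in for a protected printout (contents are invented).
cat > "$workdir/sample.ps" <<'EOF'
%!PS-Adobe-3.0
/before true def
mark currentfile eexec
0123456789abcdef
cleartomark
/after true def
showpage
EOF

# Same range delete as in the recipe: drop every line from the one
# matching "mark currentfile eexec" through the next "cleartomark".
sed -e "/mark currentfile eexec/,/cleartomark/ d" \
    "$workdir/sample.ps" > "$workdir/sample_clean.ps"

# Show what survived.
cat "$workdir/sample_clean.ps"
```

Everything outside the two markers is left alone, which is why the cleaned file should still be an ordinary PostScript program that ps2pdf can digest.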
