Workshop: PDF-KungFoo with Ghostscript & Co.

  • Posted on: 31 August 2013
  • By: Kurt Pfeifle
Track: 
Software Development
Day: 
Sunday
Author: 
Kurt Pfeifle (in cooperation with Sven Guckes)
Room: 
Track 3 (right)

General Topic

This workshop deals with some of the top ten problems (ranked according to the subjective experience of the workshop authors) that can occur when processing or creating PDF files.

Details about the Topic

The issues considered in the workshop include:

  • (Mis-)Rendering of fonts on screen or on printed paper
  • (Mis-)Rendering of transparent graphic elements on printed paper
  • Extraction of text areas or of all texts from a PDF
  • Conversion of RGB- or CMYK-based black or gray shades to “real” black/gray
  • Extraction of images from PDF
  • Reducing the file size of a PDF
  • Recognizing scanned pages
  • Scaling of PDF pages
  • Unintentional modifications of embedded images (color space, resolution)
  • Linearization (“web optimization”) of PDFs

The workshop will introduce some more or less well-known command-line tools that are essential for analyzing and repairing problems like the ones listed above:

  • qpdf
  • pdftk
  • pdfinfo
  • pdffonts
  • pdfimages
  • pdfunite
  • pdfwalker
  • pdf-parser.py
  • pdfid.py
  • various PoDoFo tools
  • mutool
  • origami
  • Ghostscript (including some little-known but very useful command-line parameters)
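
To give a rough idea of how some of the listed issues map onto these tools, here is a minimal Python sketch that only assembles a few widely used invocations and runs one if the tool is installed. The file names `in.pdf` and `out.pdf`, the helper names `build_commands` and `run_if_available`, and the choice of commands are assumptions made for illustration; they are not necessarily what will be shown in the live demos.

```python
import shlex
import shutil
import subprocess


def build_commands(src="in.pdf", dst="out.pdf"):
    """Map a few of the listed issues to commonly used command lines.

    Hypothetical helper: the file names are placeholders, and the
    selection of commands is an assumption, not the workshop's own list.
    """
    return {
        # Which fonts does the PDF use, and are they embedded?
        "fonts": ["pdffonts", src],
        # List embedded images with color space and resolution.
        "images": ["pdfimages", "-list", src],
        # Extract all text with Ghostscript's txtwrite device.
        "text": ["gs", "-q", "-o", "-", "-sDEVICE=txtwrite", src],
        # Linearization ("web optimization").
        "linearize": ["qpdf", "--linearize", src, dst],
        # Reduce file size by re-distilling with a preset.
        "shrink": ["gs", "-o", dst, "-sDEVICE=pdfwrite",
                   "-dPDFSETTINGS=/ebook", src],
    }


def run_if_available(argv):
    """Run argv if its executable is on PATH; return stdout, else None.

    Note: on Windows the Ghostscript binary is usually named
    differently (for example gswin64c), so "gs" may not be found.
    """
    if shutil.which(argv[0]) is None:
        return None
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # Print the command lines so they can be copied into a shell.
    for task, argv in build_commands().items():
        print(f"{task}: {shlex.join(argv)}")
```

Printing the commands rather than running them keeps the sketch safe to try on a machine where none of the tools are installed; `run_if_available` can then be called on a single entry once a real input file exists.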

The workshop leader will not use many slides. Instead, a large part of the workshop will consist of live demos, which participants can follow on their own notebooks to replicate a particular problem, its analysis and its solution.

New! New! New! A World Premiere!!!

The following is an experimental feature of the workshop. It goes back to an idea developed by Sven Guckes.

  • The workshop author will still not provide slides to the participants.
  • However, the workshop participants will themselves actively create the documentation they want.

This will happen by working collaboratively on live notes of the workshop proceedings. (Experience shows that as few as 3 active people can achieve nearly complete coverage of what was said and shown.)

All the participants need is a web browser which allows online access to a text “pad”. Everybody will be able to contribute comments, links or questions. The workshop leader will be able to see the collected questions and answer them too. The pad also allows real-time chat between participants. For an example see: http://guckes.titanpad.com/4

Additionally, we suggest the following:

  • The collaboratively created workshop notes will be published as an eBook within 30 minutes after the conclusion of the workshop.
  • At least one updated version that includes corrections, clarifications and improvements by the workshop authors will be made available a few days afterwards.
  • It will be licensed under a Creative Commons license.
  • The first publication of the eBook will happen via Leanpub.com.
  • Initially there will be PDF, EPUB and MOBI formats.
  • The minimum “purchase” price at Leanpub for PDF, EPUB and MOBI will be 0.- US$.
  • People may voluntarily pay any higher amount they want.
  • Every “buyer” (explicitly including those who paid only 0.- $!) is entitled to get updates of the eBook for as long as they are released.
  • All proceeds paid by Leanpub to the authors of the eBook will be forwarded in full (100%) to the Electronic Frontier Foundation (EFF). (Leanpub pays authors royalties of 90% minus 50 cents per book, so if someone pays 5.- $, then 4.- $ would go to the EFF; if someone pays 10.- $, then it’s 8.50 $ for the EFF.)
  • Workshop participants agree to these rules by contributing to the pad’s content.
  • All workshop participants who submit their real names or a self-chosen nickname will be listed as co-authors of the book.
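
The royalty rule mentioned in the list (90% minus 50 cents per book) can be worked through with a small Python sketch. Applied literally, the rule gives 4.00 $ for a 5.- $ purchase and 8.50 $ for a 10.- $ purchase. The function name `eff_share` is hypothetical, and clamping the result at zero for free or very cheap purchases is an assumption of this sketch, not a statement about Leanpub's actual terms.

```python
from decimal import Decimal

ROYALTY_RATE = Decimal("0.90")   # 90% royalty, as stated in the text
FLAT_FEE = Decimal("0.50")       # 50 cents per book, as stated in the text


def eff_share(price):
    """Return the amount forwarded to the EFF for one purchase.

    Hypothetical helper. Assumption: the share never goes below zero,
    so a 0.- $ "purchase" forwards nothing.
    """
    price = Decimal(str(price))
    share = price * ROYALTY_RATE - FLAT_FEE
    return max(share, Decimal("0.00")).quantize(Decimal("0.01"))


if __name__ == "__main__":
    for paid in (0, 5, 10):
        print(f"paid {paid}.- $ -> EFF receives {eff_share(paid)} $")
```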

As an alternative to supporting the EFF, the organizers of T-DOSE may name (in agreement with Kurt and Sven) a different organisation (or their own) as the beneficiary of the eBook proceeds, provided this organisation can fulfill the Leanpub preconditions for this (scroll down, “Requirements ...”).

Short bios of the workshop leaders

Kurt Pfeifle:
Workshop leader.

Kurt lives in Stuttgart. Two years ago, after 27 years with the same employer, he started a new phase in his working life as an IT freelancer.

He earns part of his living by dealing with network printing, data conversion and PDF debugging issues in IT projects. One of his customers nicknames him “the PDF debugger on two legs”.

Kurt is the current “all-time top scorer” among the Stack Overflow users who are active in the topics [PDF], [Ghostscript] and [ImageMagick].

Kurt has never been to T-DOSE before.

Sven Guckes:
Workshop helper, eBook editor

Sven lives in Berlin and is looking forward to his next participation at T-DOSE. You can learn more about Sven on his personal homepage.

How to get in touch

Kurt:
mail: kurt.pfeifle@gmail.com

mobile: +49-172-715-7017

Sven:
mail: tdose.nl@guckes.net

mobile: +49-179-396-6141

Time: 
14:00 - 16:00 hrs