ApacheCon NA 2010 Session

Apache PDFBox - Working with PDFs for Dummies

A short introduction covers the history of Apache PDFBox: who started developing his own Java library for working with PDFs, and why. The command line tools shipped with Apache PDFBox are a good starting point for getting to know the library. One of the most popular features is extracting text from a PDF. The split, merge and overlay tools let you rearrange your documents. There are many other interesting tools as well, e.g. for extracting images or metadata, and of course a simple viewer for rendering PDFs. But the real power of Apache PDFBox lies in its advanced features. Create your own PDFs from scratch with a few lines of Java code. Organize your PDFs with an index that includes a thumbnail of the first page of each PDF. Learn more about the small but helpful tools for advanced PDFBox users, e.g. PDFDebugger, and WriteDecodedDoc for getting a readable version of a compressed PDF. Finally, a glimpse into the future shows the features planned for upcoming releases and other possible fields of application.