Hi,

Michael Weghorn wrote:
> On 03/03/2020 12.26, Pietro Paolini wrote:
> > I wanted to have a look at the source code to see if there is some sort of PDF "model" being built from the original PDF document, for instance a set of objects each describing the graphic meanings of a particular region within the page.
>
> At a quick glance, 'sdext/source/pdfimport' looks like a good place to start with; I personally don't know more related to your more specific question.
Yep, that's the place. We currently use poppler to parse the PDF, then generate a tree of quite basic drawing operations from it. Check sdext/source/pdfimport/tree/genericelements.cxx for the types of objects in that tree, and sdext/source/pdfimport/tree/{draw|writer}treevisiting.cxx for visitor-pattern-style tree walking. For your needs, you could e.g. check the boundaries of each visited object to see whether they intersect with your region of interest.

Cheers,

-- Thorsten
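
As a rough illustration of the boundary check suggested above, here is a minimal, self-contained sketch of a visitor that walks an element tree and collects the text of every node whose bounding box intersects a region of interest. All names here (Rect, Node, TextNode, FrameNode, NodeVisitor, RegionCollector) are hypothetical, simplified stand-ins invented for this example; they are not the actual classes from genericelements.cxx, so treat it as an outline of the idea and not as working pdfimport code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an element's bounding box (not the real pdfimport type).
struct Rect {
    double x, y, w, h;
};

// True if two axis-aligned rectangles overlap; merely touching edges do not count.
inline bool intersects(const Rect& a, const Rect& b) {
    return a.x < b.x + b.w && b.x < a.x + a.w &&
           a.y < b.y + b.h && b.y < a.y + a.h;
}

struct TextNode;
struct FrameNode;

// Visitor interface: one overload per concrete node type.
struct NodeVisitor {
    virtual ~NodeVisitor() = default;
    virtual void visit(TextNode&) = 0;
    virtual void visit(FrameNode&) = 0;
};

// Hypothetical tree node: a bounding box plus owned children.
struct Node {
    Rect bounds;
    std::vector<std::unique_ptr<Node>> children;

    explicit Node(Rect r) : bounds(r) { assert(r.w >= 0 && r.h >= 0); }
    virtual ~Node() = default;
    virtual void accept(NodeVisitor& v) = 0;
};

struct TextNode : Node {
    std::string text;
    TextNode(Rect r, std::string t) : Node(r), text(std::move(t)) {}
    void accept(NodeVisitor& v) override { v.visit(*this); }
};

struct FrameNode : Node {
    explicit FrameNode(Rect r) : Node(r) {}
    void accept(NodeVisitor& v) override { v.visit(*this); }
};

// Collects the text of every TextNode whose bounds intersect the region.
class RegionCollector : public NodeVisitor {
public:
    explicit RegionCollector(Rect region) : region_(region) {}

    void visit(TextNode& n) override {
        if (intersects(n.bounds, region_))
            hits_.push_back(n.text);
        descend(n);
    }

    void visit(FrameNode& n) override { descend(n); }

    const std::vector<std::string>& hits() const { return hits_; }

private:
    // Always recurse: this sketch does not assume children lie inside their parent.
    void descend(Node& n) {
        for (auto& child : n.children)
            child->accept(*this);
    }

    Rect region_;
    std::vector<std::string> hits_;
};
```

To use it, build (or obtain) a tree, construct a RegionCollector with the rectangle you care about, call accept() on the root, and read back hits(). In the real code base the equivalent check would presumably live in a visitor alongside the existing {draw|writer}treevisiting.cxx ones, using whatever position and size members the actual element classes expose.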