Re: [Groff] PDFPIC macro


From: Keith Marshall
Subject: Re: [Groff] PDFPIC macro
Date: Mon, 9 Oct 2017 09:10:18 +0100

Hi Deri,

Thanks for trying it out.

On 09/10/17 01:21, Deri James wrote:
> Some pdfs I have tried fail with "syntax error".

That's yacc's default behaviour, when the sequence of tokens returned 
by the lexer doesn't conform to its notion of a valid grammar -- either 
the order isn't as expected, or the sequence is incomplete.
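To illustrate the two failure modes, here is a minimal hand-written sketch in Python; it is not yacc, and not the psbb grammar.  The function name `parse_dict` and the toy "<< /Name value ... >>" grammar are assumptions made purely for the example.  Both a misplaced token and a truncated sequence end in the same generic message:

```python
def parse_dict(tokens):
    """Parse '<<' followed by NAME VALUE pairs and a closing '>>'.

    Toy grammar for illustration only; tokens are plain strings.
    """
    it = iter(tokens)
    if next(it, None) != "<<":
        raise ValueError("syntax error")      # sequence does not start as expected
    entries = {}
    for tok in it:
        if tok == ">>":
            return entries                    # complete, well-formed dictionary
        if not tok.startswith("/"):
            raise ValueError("syntax error")  # a key must be a /Name token
        value = next(it, None)
        if value is None or value == ">>":
            raise ValueError("syntax error")  # key without a value
        entries[tok] = value
    raise ValueError("syntax error")          # ran out of tokens before '>>'


print(parse_dict(["<<", "/Type", "/Page", ">>"]))
```

The message alone cannot say which token was at fault, which is why the token sequence itself has to be inspected.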

> It seems to occur if MediaBox is defined in an ancestor object rather
> than in a /Page object. There are a number of page attributes which
> are inheritable in this way, MediaBox is one of them.

I do know that, thanks; it is a configuration which I did test (albeit 
with contrived, hand-crafted test files):

  $ ./psbb *.pdf
  inherited.pdf: bounding box = (0,0)..(612,792)
  minimal.pdf: bounding box = (0,0)..(612,792)
  override.pdf: bounding box = (0,0)..(606,809)

> So in case a MediaBox is superseded by an entry further down the tree
> you still have to continue looking till you get to the object for
> page 1, to make sure.

And this is exactly what my code does!  To be precise, it parses the 
trailer dictionary to locate the /Catalog object, whence it follows the 
indirect object reference to the top-level /Pages object; thence it 
follows the chain of first /Kids references, through as many /Pages 
objects as it may find, until it reaches the first /Page object.  In each 
/Pages object it traverses, it evaluates any /MediaBox specification 
it may find; at each lower level, any such specification overrides any 
which was evaluated at a higher level.  Thus, when the /Page object is 
parsed, the last /MediaBox encountered -- which may be within the /Page 
object itself, or in its nearest /Pages ancestor which specified one -- 
will prevail.
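The traversal described above can be sketched roughly as follows.  This is Python, not the actual psbb code; the in-memory `objects` table (object number mapped to dictionary), the function name `first_page_media_box`, and the contents of the two sample documents are all assumptions made for illustration, and a real tool must of course parse the objects out of the PDF file itself:

```python
def first_page_media_box(objects, trailer):
    """Return the /MediaBox in effect for the first /Page, or None.

    `objects` is a hypothetical table mapping object numbers to dicts;
    indirect references are modelled as plain object numbers.
    """
    catalog = objects[trailer["Root"]]   # the trailer locates the /Catalog
    node = objects[catalog["Pages"]]     # top-level /Pages object
    media_box = None
    while True:
        # A /MediaBox at this level overrides one inherited from above.
        media_box = node.get("MediaBox", media_box)
        if node["Type"] == "Page":
            return media_box
        node = objects[node["Kids"][0]]  # follow the first /Kids reference


trailer = {"Root": 1}
# Assumed layouts, loosely modelled on the "inherited" and "override" cases.
inherited = {
    1: {"Type": "Catalog", "Pages": 2},
    2: {"Type": "Pages", "Kids": [3], "MediaBox": [0, 0, 612, 792]},
    3: {"Type": "Page"},
}
override = {
    1: {"Type": "Catalog", "Pages": 2},
    2: {"Type": "Pages", "Kids": [3], "MediaBox": [0, 0, 612, 792]},
    3: {"Type": "Page", "MediaBox": [0, 0, 606, 809]},
}

print(first_page_media_box(inherited, trailer))  # [0, 0, 612, 792]
print(first_page_media_box(override, trailer))   # [0, 0, 606, 809]
```

The sketch does no error handling: an empty /Kids array, or a missing key, would simply raise an exception.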

Perhaps you could:

  $ make clean
  $ make CFLAGS=-DDEBUGGING

and check your failing PDFs again, so we can see whatever unexpected 
token sequence is leading to the "syntax error"; only when we know that 
will we have any chance of handling it, before the parser simply gives 
up on the offending PDF.

-- 
Regards,
Keith.

Attachment: samples.tar.xz
Description: application/xz

