emacs-orgmode

Re: [O] Org Mode and PDF Notes!


From: Matt Lundin
Subject: Re: [O] Org Mode and PDF Notes!
Date: Thu, 12 Nov 2015 08:28:44 -0600
User-agent: Gnus/5.130014 (Ma Gnus v0.14) Emacs/25.0.50 (gnu/linux)

Ramon Diaz-Uriarte <address@hidden> writes:

>
> so we get the location of the highlight (and its properties), but not the
> textual contents. And this is the case whether I make the annotation with
> EzPDF or Okular or, for that matter, with pdf-tools itself.
>
> So it seems RepliGO is actually giving you a lot more by default :-)
>
>>
>> Politza and I are discussing this here:
>> https://github.com/politza/pdf-tools/issues/137
>>
>> that might be a good place to continue the conversation.
>>
>
> I'll do. In the meantime, I think this is a limitation coming from
> poppler. Other people have mentioned similar things (e.g.,
> http://coda.caseykuhlman.com/entries/2014/pdf-extract.html) and using other
> tools that depend on poppler (such as Leela:
> https://github.com/TrilbyWhite/Leela) also will not give us the text
> itself. 

I don't think this is a limitation of poppler so much as a consequence
of how PDF annotations work. Typically, the subject/text field is not
populated with the text of the highlighted region. Rather, a highlight
annotation specifies bounds, color, style, etc. What RepliGO does (I
wouldn't recommend using it, as it is closed source and severely out of
date) is grab the text *at the time of highlighting* and add it to the
notes field. I don't know of any other annotation tool that does the
same thing. Applications built on poppler could do it, though they
currently do not.

For extracting the text of highlighted regions *after the fact*, I've
had good luck with this script, which relies on the pdf-reader gem for
Ruby:

https://gist.github.com/danlucraft/5277732
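
To make the "after the fact" idea concrete, here is a minimal sketch in
Python. It is not taken from the script above and may not match how that
script works. It assumes two inputs: the flat QuadPoints array of a
highlight annotation (eight numbers per highlighted quadrilateral), and a
list of words with bounding boxes obtained from some text extractor. The
names quads_to_rects and highlighted_text, and all coordinates, are
hypothetical and only for illustration.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1), PDF user space
Word = Tuple[str, Rect]                   # (text, bounding box); assumed input


def quads_to_rects(quad_points: List[float]) -> List[Rect]:
    """Reduce a flat QuadPoints array (8 numbers per quad) to bounding rects."""
    if len(quad_points) % 8 != 0:
        raise ValueError("QuadPoints length must be a multiple of 8")
    rects = []
    for i in range(0, len(quad_points), 8):
        xs = quad_points[i:i + 8:2]      # the four x coordinates of this quad
        ys = quad_points[i + 1:i + 8:2]  # the four y coordinates of this quad
        rects.append((min(xs), min(ys), max(xs), max(ys)))
    return rects


def highlighted_text(quad_points: List[float], words: List[Word]) -> str:
    """Join the words whose box center lies inside any highlighted rect."""
    rects = quads_to_rects(quad_points)
    picked = []
    for text, (x0, y0, x1, y1) in words:
        cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
        if any(rx0 <= cx <= rx1 and ry0 <= cy <= ry1
               for rx0, ry0, rx1, ry1 in rects):
            picked.append(text)
    return " ".join(picked)


# Made-up page content: three words on one line, one word on the line below.
words = [
    ("Org", (72, 700, 90, 712)),
    ("mode", (94, 700, 120, 712)),
    ("rocks", (124, 700, 150, 712)),
    ("elsewhere", (72, 680, 120, 692)),
]
# One quad covering the second and third words on the upper line.
quad = [92, 712, 152, 712, 92, 700, 152, 700]

print(highlighted_text(quad, words))
```

Testing the center of each word, rather than requiring full containment,
is a design choice that tolerates highlights drawn slightly short of a
word's edges. Rotated text or quads that are not axis-aligned would need
a proper point-in-polygon test instead of the bounding rectangle used
here.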

Matt


