Re: [Freefont-bugs] New glyphs are unixxxx while other glyphs are uniXXXX.
From: James Cloos
Subject: Re: [Freefont-bugs] New glyphs are unixxxx while other glyphs are uniXXXX.
Date: Sat, 06 Jul 2013 13:30:48 -0400
User-agent: Gnus/5.130008 (Ma Gnus v0.8) Emacs/24.3.50 (gnu/linux)
>>>>> "SW" == Steve White <address@hidden> writes:
SW> So far, the results are erratic, but
SW> * nothing created on Linux copies Hindi text correctly from PDF files.
The proper solution for text extraction from PDF files, especially for
complex and/or right-to-left scripts, is for the PDF creator to include
ActualText entries in the PDF.
Nothing else can work for all scripts.
Cf. §10.8.3, Replacement Text, in PDFReference17.pdf; the same material
is §14.9.4 in PDF32000_2008.pdf.
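
For illustration only, here is a rough Python sketch of what a PDF
creator could emit: the glyph-showing operators are wrapped in a Span
marked-content sequence (BDC ... EMC) whose property list carries an
ActualText entry, written as a UTF-16BE hex string with a leading byte
order mark. The function name actual_text_span and the glyph operators
in the usage note are made up for the example; they are not taken from
any particular PDF library.

```python
def actual_text_span(text: str, glyph_ops: bytes) -> bytes:
    """Wrap content-stream operators in a Span carrying ActualText.

    Hypothetical helper: `glyph_ops` is whatever the producer already
    writes to show the glyphs; `text` is the Unicode string a reader
    should get back when copying or extracting that run.
    """
    # PDF text strings outside PDFDocEncoding are UTF-16BE with a BOM.
    encoded = b"\xfe\xff" + text.encode("utf-16-be")
    hex_string = encoded.hex().upper().encode("ascii")
    return (
        b"/Span << /ActualText <" + hex_string + b"> >> BDC\n"
        + glyph_ops
        + b"\nEMC\n"
    )
```

Called as actual_text_span("\u0939\u093f", b"BT /F1 12 Tf <0102> Tj ET"),
with <0102> standing in for whatever glyph IDs the shaper produced, the
Hindi syllable is stored in logical order regardless of how the glyphs
were reordered for display.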
-JimC
--
James Cloos <address@hidden> OpenPGP: 1024D/ED7DAEA6