Re: [Groff] typesetting Czech with custom fonts
From: Deri James
Subject: Re: [Groff] typesetting Czech with custom fonts
Date: Wed, 28 Mar 2012 21:30:32 +0100
User-agent: KMail/1.13.7 (Linux/2.6.38.8-desktop-10.mga; KDE/4.6.5; x86_64; ; )
On Wednesday 28 Mar 2012 16:02:01 Werner LEMBERG wrote:
> > This is a (painful) limitation of Adobe's pdfmark specification:
> > only a rather limited set of characters is permitted within the text
> > which is specified to describe a bookmark.
>
> This is not correct, AFAIK. There are two encodings for pdfbookmarks,
> namely PDFDocEncoding and Unicode. So it should certainly be possible
> to use Czech characters, but apparently groff's pdfmark package
> doesn't support Unicode bookmarks.
>
> Deri, what about gropdf?
>
>
> Werner
Hi Werner,
You are correct that full UTF-16 is supported for annotations. The problem is
that by the time the string is passed to pdfbookmark, the characters have been
changed to named glyph nodes, which I believe can't be converted back to their
UTF-16 character codes (e.g. \[u0159]) within a macro, so I'm in the same boat
as Keith. To do this I think we'd need help from troff: something like
.asciify16hex, which would return the string as a BOM followed by the two-byte
Unicode value of each character, e.g. 00 41 01 59 (A, rcaron). This could then
be passed on to the PDF enclosed in '<>', with the BOM on the front, instead of
enclosing the text in '()'. Even being able to reconstitute \[u0159] would be
helpful for gropdf, since it could then build the hex string itself.
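To make the target format concrete, here is a minimal sketch in Python of the hex string described above. It is not gropdf code and not the proposed .asciify16hex request; the function names are hypothetical, and it assumes the text arrives with simple \[uXXXX] escapes (composite forms such as \[u0041_030A] are not handled). The BOM used is FE FF, i.e. big-endian UTF-16.

```python
import re

# Matches simple groff-style Unicode escapes such as \[u0159] (4 to 6 hex digits).
_UNICODE_ESCAPE = re.compile(r"\\\[u([0-9A-Fa-f]{4,6})\]")


def expand_escapes(text: str) -> str:
    """Replace each \\[uXXXX] escape with the character it names."""
    return _UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def pdf_hex_string(text: str) -> str:
    """Return text as a PDF hex string: '<', BOM FE FF, UTF-16BE bytes, '>'."""
    payload = b"\xfe\xff" + expand_escapes(text).encode("utf-16-be")
    return "<" + payload.hex().upper() + ">"


if __name__ == "__main__":
    # "A" followed by rcaron, as in the example above.
    print(pdf_hex_string(r"A\[u0159]"))  # prints <FEFF00410159>
```

The result would go where the bookmark text normally sits in '()'.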
I've been looking into .asciify in a bit more detail (in preparation for the
documentation patch you asked for). Please can you confirm I've got this
correct:
Node                        Action
==========================  ========================
line_start_node             deleted
space_node                  If was_escape_colon return ESCAPE_COLON else
                            return node
word_space_node             return space(s)
unbreakable_space_node      return ESCAPE_TILDE
diverted_space_node         Ignored
diverted_copy_file_node     Ignored
extra_size_node             Ignored
vertical_size_node          deleted
hmotion_node                If was_tab return tab else return node
space_char_hmotion_node     return ESCAPE_SPACE
vmotion_node                Ignored
hline_node                  Ignored
vline_node                  Ignored
zero_width_node             Ignored
left_italic_corrected_node  deleted
overstrike_node             Ignored
bracket_node                Ignored
draw_node                   Ignored
glyph_node                  If asciify_code or ascii_code not 0 return
                            chr() else return node.
ligature_node               deleted
kern_pair_node              deleted
dbreak_node                 deleted
italic_corrected_node       deleted
My C++ fu is not strong, but I suspect the nodes marked as Ignored (which have
no specific asciify method) inherit the generic node method, which is to
return the node.
It can be seen from the above that in several cases the asciified
string/diversion will still hold nodes as well as ASCII characters.
Does this look correct, Werner?
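To illustrate the suspected inheritance, here is a toy model in Python. It is not groff's C++ source: the class names only echo a few rows of the table, the single ascii_code field is a simplification of the glyph_node rule, and the whole thing assumes the table above is right, which is exactly what is still to be confirmed.

```python
class Node:
    """Toy stand-in for a troff node; not groff's real class."""

    def asciify(self):
        # Generic method: hand the node back unchanged.
        return [self]


class LineStartNode(Node):
    def asciify(self):
        return []  # "deleted" in the table


class WordSpaceNode(Node):
    def asciify(self):
        return [" "]  # "return space(s)"


class VMotionNode(Node):
    pass  # "Ignored": no override, so the generic method applies


class GlyphNode(Node):
    def __init__(self, name, ascii_code=0):
        self.name = name
        self.ascii_code = ascii_code

    def asciify(self):
        # A character is returned only when a non-zero code is known.
        if self.ascii_code != 0:
            return [chr(self.ascii_code)]
        return [self]


def asciify(nodes):
    """Apply each node's asciify method and collect the results in order."""
    out = []
    for node in nodes:
        out.extend(node.asciify())
    return out
```

Feeding this a list such as a line start, a glyph for "A" (code 65), a word space, a glyph for rcaron (no code) and a vertical motion gives back two characters followed by two surviving nodes, which is the mixed result described above.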
As regards gropdf handling the Czech example given: that seems to work
perfectly with fonts which contain the needed characters, although I did
fix a problem in this area quite recently, so I owe you a patch for this.
Cheers
Deri