Re: [groff] hyphen, minus sign and hyphen-minus


From: Pali Rohár
Subject: Re: [groff] hyphen, minus sign and hyphen-minus
Date: Mon, 28 May 2018 15:24:05 +0200
User-agent: NeoMutt/20170113 (1.7.2)

On Monday 28 May 2018 15:16:53 Pali Rohár wrote:
> On Monday 28 May 2018 02:48:09 Ingo Schwarze wrote:
> > Hi Pali,
> > 
> > Pali Rohar wrote on Sun, May 27, 2018 at 11:52:44PM +0200:
> > 
> > > I have now looked closely at the man -Tps output: the \- sequence is
> > > written into the PostScript file as character 0xAD (\255 in octal).
> > > Therefore it is SOFT HYPHEN (U+00AD),
> > 
> > No, that is not a "soft hyphen".  Glyph numbers in fonts used for
> > PostScript output have nothing to do with Unicode code points.
> > Look at the file font/devps/TR for examples:
> > 
> > PS name      TR#   Unicode
> > -------      ---   -------
> > asciicircum  0x00  U+005E
> > asciitilde   0x01  U+007E
> > Scaron       0x02  U+0053 U+030C
> > Zcaron       0x03  U+005A U+030C
> > scaron       0x04  U+0073 U+030C
> > zcaron       0x05  U+007A U+030C
> > Ydieresis    0x06  U+0059 U+0308
> > trademark    0x07  U+2122
> > quotesingle  0x08  U+0027
> > Euro         0x09  U+20AC
> > hyphen       0x2d  U+2010
> > circumflex   0x5e  U+02C6
> > quoteleft    0x60  U+2018
> > tilde        0x7e  U+02DC
> > bullet       0x83  U+2022
> > florin       0x84  U+0192
> > minus        0xad  U+2212
> > 
> > and so on and so forth, it's completely different all over the place.
> 
> I'm saying that I generated a PostScript file via man -Tps and then looked
> into the generated file.
> 
> In that file, at the place where the command-line switch --something
> should be, there was F2(\255... or F2<ad... IIRC \255 is the glyph code
> in octal and <ad> is the same code in hex. Octal 255 and hex AD are both
> decimal 173, so both refer to the same glyph.
> 
> Now I see that the PostScript file also contains an encoding vector,
> def /ENC0 [ ... ], and that position 173 holds the name /minus. According
> to Adobe, the name /minus represents Unicode code point U+2212.
> 
> So you are right, it is not a soft hyphen; I forgot to look at the
> encoding vector in the resulting PostScript file.
> 
> That also answers my question of why the ps2pdf converter generates a
> PDF file in which U+2212 code points are used for switches. ps2pdf did it
> correctly by looking at the attached encoding vector /ENC0.
> 
> So problem is for sure in grodvi which generates that PS file with
                            ~~~~~~
                   I mean   grops

> attached encoding vector. Unicode's hyphen-minus has code point U+002D
> and according to Adobe's glyphlist.txt, U+002D is assigned to glyph name
> /hyphen.
> 
> So man -Tps (or grops) can be fixed. It just needs to generate a
> correct encoding vector and use the proper glyph name /hyphen for \- when
> generating output from a manpage.

Here is a simple fix that yields hyphen-minus (U+002D) for command-line
switches in PostScript output via man -Tps:

man -Tps groff | sed 's:/minus:/hyphen:g' > groff.ps

But in some places it damages the formatting because the font metrics
differ. Therefore this replacement should be done properly in the groff
PostScript generator.
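
To make the effect of that substitution visible without running man, here
is a small sketch on a made-up line. The sample string is NOT real grops
output, and the name /minusplus and the helper rename_minus are included
only for illustration. It shows that the pattern is unanchored, so it also
rewrites any longer name that merely starts with /minus:

```shell
# Made-up fragment for illustration only; not actual grops output.
sample='/ENC0[/hyphen/minus/minusplus]def'

# Same substitution as in the one-liner above, wrapped in a helper
# (rename_minus is a hypothetical name).
rename_minus() {
    sed 's:/minus:/hyphen:g'
}

printf '%s\n' "$sample" | rename_minus
# prints: /ENC0[/hyphen/hyphen/hyphenplus]def
```

So besides the font metrics, a blanket rename could also touch names it
was never meant to touch.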

> > > so it is incorrect for command line switch.
> > 
> > It is not incorrect.  The TR font does not contain a glyph for
> > hyphen-minus, so plain minus is used as a fallback.
> 
> The font/devps/TR file has this line in its "charset" section:
> 
> \-    564,286 0       173     minus
> 
> Shouldn't this number be 45 instead of 173?
> 
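
For reference, the code values discussed above can be cross-checked with
the shell's printf, which accepts C-style octal and hex constants for %d.
This is only a quick sanity check of the arithmetic; to_decimal is a
hypothetical helper name:

```shell
# Print each argument as a decimal integer; a leading 0 means octal and
# a leading 0x means hex.
to_decimal() {
    printf '%d\n' "$@"
}

to_decimal 0255   # \255 as seen in the PostScript file -> 173
to_decimal 0xad   # <ad> as seen in the PostScript file -> 173
to_decimal 0x2d   # the slot listed for "hyphen" in the TR table -> 45
```

So 173 is the slot that the table quoted earlier gives for "minus" (0xad),
and 45 is the slot it gives for "hyphen" (0x2d).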

-- 
Pali Rohár
address@hidden


