Re: [Heirloom] Using the Symbola font in Heirloom troff


From: Richard Morse
Subject: Re: [Heirloom] Using the Symbola font in Heirloom troff
Date: Wed, 5 Aug 2020 08:55:48 -0400

Hi! The issue arises before it even gets to the PostScript.

If you run the following commands:

        .do xflag 3
        .lc_ctype UTF-8
        .fp 5 Symbola Symbola ttf
        .ft Symbola
        ❊ works
        .sp
        🂡 char
        .sp
        \U'1F0A1' uesc
        .sp
        \[u1F0A1] name
        .sp


through Heirloom troff as `troff test.roff | less`, you can see that the 
output is (in part, after the initial header setup):

        H72000
        V12000
        CPSspoked8teardroppropellerstar
        wh11510cw
        h7670co
        h5140cr
        h4010ck
        h5560cs
        n12000 0
        H72000
        V36000
        h6660cc
        h4490ch
        h5760ca
        h5220cr
        n12000 0
        H72000
        V60000
        h6660cu
        h5760ce
        h4550cs
        h3920cc
        n12000 0
        H72000
        V84000
        CPSu1F0A1
        wh11270cn
        h5760ca
        h5220cm
        h8660ce
        n12000 0

You’ll notice that the star character, which works in the PDF, and the named 
character (remember that, inside the font file, u1F0A1 is the character name) 
both show up in ‘CPS’ commands. But in the two other places where you would 
expect to see something (the literal character and the \U escape), the glyph 
is entirely missing. You get the ‘H72000’ command, the ‘V’ command (with the 
vertical offset), and then the output goes straight into the Latin text 
(seemingly without even including the space that should precede it).

So for whatever reason, troff isn’t treating the character as something that 
should be output.
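
If it helps anyone repeat the check, here is a rough Python sketch that 
summarizes output like the above, one row per ‘V’ command. It only knows the 
command shapes visible in my sample (‘V’, ‘CPS’, and the ‘w’/‘h’/‘c’ 
combinations), which is an assumption on my part rather than a full parser 
for the intermediate format, and the names `summarize`, `CHAR_CMD`, and 
`SAMPLE` are made up for this example.

```python
import re

# Printable-character commands as they appear in the sample output:
# optional 'w' marker, optional 'hN' horizontal move, then 'cX' to print X.
CHAR_CMD = re.compile(r"w?(?:h\d+)?c(.)")

# The excerpt from this message, with indentation removed.
SAMPLE = """\
H72000
V12000
CPSspoked8teardroppropellerstar
wh11510cw
h7670co
h5140cr
h4010ck
h5560cs
n12000 0
H72000
V36000
h6660cc
h4490ch
h5760ca
h5220cr
n12000 0
H72000
V60000
h6660cu
h5760ce
h4550cs
h3920cc
n12000 0
H72000
V84000
CPSu1F0A1
wh11270cn
h5760ca
h5220cm
h8660ce
n12000 0
"""


def summarize(troff_output):
    """Return one (vertical_position, named_glyphs, plain_text) tuple per 'V' command."""
    rows = []
    for raw in troff_output.splitlines():
        cmd = raw.strip()
        if re.fullmatch(r"V\d+", cmd):
            rows.append([int(cmd[1:]), [], ""])   # start a new output line
        elif not rows:
            continue                               # header material before the first 'V'
        elif cmd.startswith("CPS"):
            rows[-1][1].append(cmd[3:])            # glyph requested by name
        else:
            m = CHAR_CMD.fullmatch(cmd)
            if m:
                rows[-1][2] += m.group(1)          # ordinary character
    return [tuple(r) for r in rows]


if __name__ == "__main__":
    for vpos, named, text in summarize(SAMPLE):
        print(vpos, named or "no CPS glyph", text)
```

On the sample, the rows at V36000 and V60000 come back with an empty glyph 
list, which is the same thing I described above: the literal character and 
the \U escape never reach the output.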

Ricky

> On Aug 5, 2020, at 1:30 AM, T. Kurt Bond <tkurtbond@gmail.com> wrote:
> 
> Looking at the postscript output there is a "/uni1F0A1 9429 def" and a 
> "/uni1F10A" in a "/Encoding-@15@36 [...] def"; is that part of the font 
> machinery?  (I'm sadly ignorant of PostScript, alas.)
> 
> Looking at troff/troff.d/otf.c I see that there is a struct WGL that contains 
> female and male entries.  At the beginning of the struct is a comment that 
> consists of "/* WGL4 */".  Googling that led to Windows Glyph List 4.  Taking 
> a leap, I added the unicode characters FEMALE SIGN and MALE SIGN to my test 
> document.  Those show up fine in the final PDF output.  Maybe this is 
> connected?  At this point I suspect without much evidence that characters 
> that are not in the StandardStrings array, the MacintoshStrings array, or the 
> WGL array don't get output.  Maybe.  I'll have to investigate some more.  
> 
> On Tue, Aug 4, 2020 at 11:10 PM Richard Morse <pukku@mac.com> wrote:
> Hm. Just for my edification, I tried a few things.
> 
> I’m on a Mac, and I don’t know when I compiled Heirloom troff, but it was a 
> year or two ago, so some things may be different.
> 
> I downloaded the Symbola font from fontlibrary.org. The version I got was 
> .ttf, not .otf.
> 
> The various things that you tried did not work for me either. \[u1F0A1] did 
> work, but that’s because (according to fret, at least), that’s the font’s 
> internal name for the symbol, which is not guaranteed to be true across all 
> fonts, so you can’t really use that for a “fallback” system.
> 
> Looking at the output of troff without going through dpost, it looks like it 
> is completely ignoring the character. I tried explicitly setting LC_CTYPE to 
> ‘en_US.UTF-8’ and ‘UTF-8’ (both in the terminal, and using the .lc_ctype 
> command), but that had no effect.
> 
> I wonder if troff has a compiled-in list of Unicode characters that it 
> understands, and if you try to use one it deems invalid it just ignores it? 
> (This may be borne out by 
> https://github.com/n-t-roff/heirloom-doctools/blob/master/troff/troff.d/unimap.c
> , but I don’t really know enough about the code to be certain.)
> 
> Ricky
> 
> > On Aug 4, 2020, at 10:14 PM, T. Kurt Bond <tkurtbond@gmail.com> wrote:
> > 
> > In Emacs M-x describe-coding-system tells me the coding system for saving 
> > the buffer is utf-8-unix.  I don't have any LC_* environment variables set, 
> > but LANG=en_US.UTF-8.
> > 
> > I'm not very knowledgeable about the insides of Unicode fonts, 
> > unfortunately.  
> > 
> > On Tue, Aug 4, 2020 at 4:27 PM Richard Morse <pukku@mac.com> wrote:
> > Huh. I’m afraid I’m out of my depth then; you might check whether your 
> > LC_* environment variables are set to something incompatible with UTF-8 
> > (or, maybe, make sure the file is in UTF-8, not UTF-16 or something, 
> > if you’re on Windows), but hopefully someone with more experience and 
> > knowledge will speak up…
> > 
> > Ricky
> > 
> > > On Aug 4, 2020, at 3:59 PM, T. Kurt Bond <tkurtbond@gmail.com> wrote:
> > > 
> > > And if I add "and explicit unicode character reference \U'1F0A1'" to the
> > > file, that character doesn't show up either.
> > > 
> > > On Tue, Aug 4, 2020 at 2:47 PM Richard Morse <pukku@mac.com> wrote:
> > > 
> > > >> According to the Heirloom Troff manual, I think that you cannot just
> > > >> insert Unicode characters (although maybe if your LC_* environment
> > > >> variables are set correctly, you can?). It says:
> > > >> 
> > > >>> Both nroff and troff allow references to specific Unicode characters
> > > >>> with the \U'X' escape sequence; it causes the character at position
> > > >>> U+X to be printed (X is a hexadecimal number). For troff, it is
> > > >>> required that this character is available in one of the fonts
> > > >>> mounted at this point. As an example, \U'20AC' prints the Euro
> > > >>> character €. When register .g is set to 1 Unicode characters can
> > > >>> also be accessed with \[uXXXX] where XXXX is a four digit
> > > >>> hexadecimal number.
> > > >> 
> > > >> So I think you would need to use `\U'1F0A1'` for the character to show
> > > >> up?
> > >> 
> > >> Ricky
> > >> 
> > >> 
> > >>> On Aug 4, 2020, at 12:28 PM, T. Kurt Bond <tkurtbond@gmail.com> wrote:
> > >>> 
> > > >>> (The heirloom-doctools README.md
> > > >>> <https://github.com/n-t-roff/heirloom-doctools/blob/master/README.md>
> > > >>> says to ask Heirloom doctools questions on this list.)
> > > >>> 
> > > >>> I'd like to use the Symbola font in Heirloom troff.  I tried the
> > > >>> following:
> > > >>> 
> > >>> .do xflag 3
> > >>> .\" fp 5 Optima Optima-Regular ttf
> > >>> .fp 5 Symbola Symbola otf
> > >>> .LP
> > >>> Here is some normal text.
> > >>> .\" PLAYING CARD ACE OF SPACES is Unicode 0x1F0A1
> > >>> .ft Symbola
> > >>> 🂡 And some normal text. ❊
> > >>> .ft P
> > >>> More normal text.
> > >>> 
> > > >>> That's a literal PLAYING CARD ACE OF SPADES Unicode character at the
> > > >>> start of the line between the two .ft requests.  That character does
> > > >>> not show up in the troff output, even though the EIGHT TEARDROP-SPOKED
> > > >>> PROPELLER ASTERISK Unicode character at the end of the line *does*
> > > >>> show up, as CPSuni274A where the CPS<name> outputs the character of
> > > >>> that name.  The Symbola font is embedded in the PDF output (created
> > > >>> from the PostScript output), and the text "And some normal text" and
> > > >>> the EIGHT TEARDROP-SPOKED PROPELLER ASTERISK Unicode character are in
> > > >>> the Symbola font in the troff output.
> > >>> 
> > > >>> However, if I manually add a CPSuni1F0A1 to the troff output, *that*
> > > >>> character *does* show up.
> > > >>> 
> > > >>> Any ideas as to why the literal PLAYING CARD ACE OF SPADES Unicode
> > > >>> character in the document source is being ignored and not written to the
> > > >>> troff output?
> > > >>> 
> > > >>> I actually have a document that needs to use the PLAYING CARD ACE OF
> > > >>> SPADES Unicode character.  The ultimate goal is to have the Symbola
> > > >>> font used as a fallback font, which should happen automatically in
> > > >>> Heirloom troff, since it searches all the fonts when a font is missing
> > > >>> a character, but I made the example use the Symbola font directly
> > > >>> because that shows the problem directly.
> > >>> 
> > >>> --
> > >>> T. Kurt Bond, tkurtbond@gmail.com, https://tkurtbond.github.io
> > >> 
> > >> 
> > > 
> > > -- 
> > > T. Kurt Bond, tkurtbond@gmail.com, https://tkurtbond.github.io
> > 
> > 
> > 
> > -- 
> > T. Kurt Bond, tkurtbond@gmail.com, https://tkurtbond.github.io
> 
> 
> 
> -- 
> T. Kurt Bond, tkurtbond@gmail.com, https://tkurtbond.github.io



