Re: [groff] Accented Cyrillic characters


From: Ingo Schwarze
Subject: Re: [groff] Accented Cyrillic characters
Date: Thu, 2 Aug 2018 16:14:57 +0200
User-agent: Mutt/1.8.0 (2017-02-23)

Hi Robin,

Robin Haberkorn wrote on Thu, Aug 02, 2018 at 07:47:35PM +0600:

> But for the rest of glyphs, it should IMHO a) make sure that
> accentuation glyphs have a zero-width

There appears to be specific code in groff to explicitly *BREAK*
the return value of wcwidth(3).  Actually, egregious mishandling
of wcwidth(3) is a quite common error in application programs, so
groff is certainly not alone here.

> (Sorry, I'm not that motivated to seriously debug this in the Groff
> sources.  Just hoped that somebody would already know what's going
> on here.)

I'm not familiar with groff internals either (except for the manual
page macroset implementations), but I had a quick look and instantly
identified at least three places where wcwidth(3) handling is obviously
broken; see the patch below.  That patch is *NOT* intended for commit,
but merely meant to give others some hints about which areas to look at.

On the one hand, it doesn't appear to help yet, so there seems to be
yet more breakage elsewhere.  On the other hand, I have no idea
whether these changes would have unintended side effects.  It is
quite likely that the details must be slightly different from my
first draft patch.  But this much is certain: it is wrong to treat
the return values 0 and -1 from wcwidth(3) identically.  A return
value of 0 means a character that occupies no column, such as a
combining accent, whereas -1 means the character is not printable
at all.  Treating the two alike can almost never be right.
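
As a minimal sketch of that distinction (this is not groff code; the
names cells_to_width() and glyph_width() are made up for illustration,
and the result of wcwidth(3) depends on the current LC_CTYPE locale),
a caller could keep the two cases apart roughly like this:

```cpp
#include <cassert>
#include <wchar.h>   // wcwidth(3), POSIX

// Hypothetical helper, not groff code: convert a wcwidth(3) result
// into an output width, given the width of one terminal cell.
// Returns -1 for non-printable characters so the caller must decide.
int cells_to_width(int w, int cell_width)
{
  assert(cell_width > 0);
  if (w < 0)
    return -1;            // not printable: an error, not "zero width"
  return w * cell_width;  // 0 for combining marks, else 1 or 2 cells
}

// Hypothetical wrapper; the result depends on the current LC_CTYPE.
int glyph_width(wchar_t wc, int cell_width)
{
  return cells_to_width(wcwidth(wc), cell_width);
}
```

With a cell width of 24, a combining accent (wcwidth 0) yields 0 and a
double-width character (wcwidth 2) yields 48, while -1 is passed up
unchanged instead of being silently folded into one of the other cases.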

The way wcwidth(3) is mishandled makes it obvious that fixing it
will not be completely trivial.

In the meantime, until groff gets fixed, you can use mandoc(1)
(mandoc.bsd.lv) as a workaround to view your manual pages on the
terminal; it does handle the width of accented Cyrillic characters
correctly inside table columns.

Yours,
  Ingo


 ----- 8< ----- schnipp ----- >8 ----- 8< ----- schnapp ----- >8 -----


 $ cat tmp3.man
.TH TEST 1
.SH DESCRIPTION
.TS
box;
l.
саморазруше\[u0301]ние
foo bar
.TE

 $ LC_CTYPE=C.UTF-8 mandoc tmp3.man
TEST(1)                     General Commands Manual                    TEST(1)



DESCRIPTION

       +---------------+
       |саморазруше́ние |
       |foo bar        |
       +---------------+


                                                                       TEST(1)


 ----- 8< ----- schnipp ----- >8 ----- 8< ----- schnapp ----- >8 -----


diff --git a/src/libs/libgroff/font.cpp b/src/libs/libgroff/font.cpp
index 17e6f425..08f29bca 100644
--- a/src/libs/libgroff/font.cpp
+++ b/src/libs/libgroff/font.cpp
@@ -384,6 +384,8 @@ int font::get_width(glyph *g, int point_size)
     // Unicode font
     int width = 24; // XXX: Add a request to override this.
     int w = wcwidth(get_code(g));
+    if (w == 0)
+      return 0;
     if (w > 1)
       width *= w;
     if (real_size == unitwidth || font::unscaled_charwidths)
@@ -962,7 +964,7 @@ int font::load(int *not_found, int head_only)
            }
            if (is_unicode) {
              int w = wcwidth(metric.code);
-             if (w > 1)
+             if (w >= 0)
                metric.width *= w;
            }
            p = strtok(0, WS);
diff --git a/src/roff/troff/node.cpp b/src/roff/troff/node.cpp
index 27311b1c..a1ffd394 100644
--- a/src/roff/troff/node.cpp
+++ b/src/roff/troff/node.cpp
@@ -509,6 +509,8 @@ tfont_spec tfont_spec::plain()
 
 hunits tfont::get_width(charinfo *c)
 {
+  if (fm->get_width(c->as_glyph(), size.to_scaled_points()) == 0)
+    return 0;
   if (is_constant_spaced)
     return constant_space_width;
   else if (is_bold)


