Re: [Groff] backspace and Unicode in terminals


From: Keith MARSHALL
Subject: Re: [Groff] backspace and Unicode in terminals
Date: Thu, 17 Mar 2005 10:55:54 +0000

Werner Lemberg wrote:
> I'm going to improve grotty, the TTY backend of groff, so that it can
> handle zero-width and double-width characters, as needed for proper
> Unicode support.
>
> Doing so I wonder how is the backspace character (U+0008, \b) handled
> in TTYs?  Is there any documentation for it?

This depends on the physical device, on which the output stream will
be displayed/printed.  Particularly in the case of hardcopy TTY devices,
different manufacturers may have handled it differently, and each would
have provided their own documentation (hopefully).  Most such devices
would simply move the printhead to the left, by a distance equal to the
(fixed) width of a single character, resulting in two characters
being printed at the same physical position.  However, some older
devices may simply have ignored backspace entirely, resulting in two
characters, which were intended to be overstruck, actually occupying
two physically adjacent character positions.  Alternatively, some
devices -- particularly those in the `line-printer' category -- may
have provided line buffering, and would either treat backspace as a
`rubout', deleting the preceding character from the buffer, or simply
ignore it.
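
To make the three behaviours concrete, here is a minimal sketch in
Python.  It is only a model for illustration, not a description of any
real device: the function name `render_line` and the mode names are
invented here, and it assumes single-width characters on a single line.

```python
def render_line(stream, mode):
    """Model one line of output on a fixed-width hardcopy device.

    mode (hypothetical names, chosen for this sketch):
      "overstrike" - backspace moves the head one column to the left,
                     so later characters print on top of earlier ones
      "ignore"     - backspace is dropped from the stream
      "rubout"     - backspace deletes the preceding buffered character

    Returns a list of columns; each entry is the string of glyphs
    printed at that column, in printing order.
    """
    if mode not in ("overstrike", "ignore", "rubout"):
        raise ValueError("unknown mode: %r" % (mode,))
    columns = []
    pos = 0  # current head position, counted from column 0
    for ch in stream:
        if ch == "\b":
            if mode == "overstrike":
                pos = max(pos - 1, 0)  # cannot move left of column 0
            elif mode == "rubout" and columns:
                columns.pop()          # discard the buffered character
                pos = len(columns)
            continue                   # "ignore" falls through to here
        if pos == len(columns):
            columns.append(ch)         # first glyph at a fresh column
        else:
            columns[pos] += ch         # overstrike an existing column
        pos += 1
    return columns
```

Feeding it the usual underlining sequence `"a\b_b"` gives two columns
with `a` and `_` sharing the first one in "overstrike" mode, three
separate columns in "ignore" mode, and only `_` and `b` in "rubout"
mode.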

When VDU devices became popular as TTY replacements, different device
manufacturers again adopted their own ideas on the behaviour of various
control characters.  I believe that most handled backspace fairly
consistently, in implementing it as a `rubout' -- a conventional VDU
cannot display more than one character at any single position, and
most basic terminal display devices do not provide composed character
capabilities.  Differences in behaviour are particularly noticeable
at the beginning of a line.  Some devices will back up to the final
character *position* on the preceding line (note that TTY devices
have fixed line length; this means column 80 on a standard 80 column
screen, regardless of how many character positions were actually
filled), while some will ignore the backspace in this position;
some offer a configurable choice between the two behaviours.
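
The two beginning-of-line behaviours can be sketched as follows.  This
is an assumed model of the cursor movement only (it does not erase
anything), with invented names; `at_margin` stands in for whatever
configuration switch a given device might offer.

```python
def backspace_cursor(row, col, width=80, at_margin="ignore"):
    """Return the (row, col) cursor position after one backspace.

    Positions are 0-based, so "column 80" on an 80 column screen is
    col == 79 here.  at_margin is "wrap" or "ignore" (hypothetical
    names for the two behaviours described above).
    """
    if col > 0:
        return row, col - 1        # ordinary case: one position left
    if at_margin == "wrap" and row > 0:
        # Back up to the final character position of the preceding
        # line, no matter how many positions on it were filled.
        return row - 1, width - 1
    return row, col                # backspace ignored at the margin
```

For example, `backspace_cursor(3, 0, at_margin="wrap")` yields
`(2, 79)`, while the default "ignore" setting leaves the cursor at
`(3, 0)`.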

> Most importantly: If I have a wide character at position p which is
> followed by `\b' (at position p+2), is the final position p again?
> With other words, is the width of `\b' dependent on the width of the
> previous character?  What happens if I have a sequence of `\b'
> characters?  I'm thinking especially of the interaction at the
> beginning of a new line.  Is there a distinction between a user who
> presses the `backspace' key, and a `\b' character in the data stream?

True TTY devices are pretty dumb really, and have fixed line length,
with fixed character width, and generally can be guaranteed to support
only the ASCII character set; not many have wide character support,
and there is little standardisation of behaviour.  The termcap/terminfo
subsystem on *nix provides some rationalisation, so grotty could look
there to determine the characteristics of the particular device which
is in use -- the problem is that this is specific to *nix, and may not
work reliably on other platforms.
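
As a rough illustration of what such a lookup involves, the sketch
below queries terminfo through Python's `curses` module.  It is not
what grotty does; it only shows that the relevant characteristics are
recorded there.  The capability names are standard terminfo ones:
`cub1` (the string which moves the cursor one column left), `bw`
(cursor-left wraps from column 0 to the last column) and `os` (the
device overstrikes).  The function name and the returned dictionary
layout are made up for this example, and it assumes a platform where
`curses` is available.

```python
import curses
import os


def backspace_capabilities(term=None):
    """Look up backspace-related terminfo capabilities for `term`.

    term=None means "use the TERM environment variable".  Returns a
    dict, or None if no terminfo entry could be loaded.  Note that
    Python's curses.setupterm() only takes effect on its first call
    within a process.
    """
    # setupterm() wants a file descriptor; /dev/null is enough because
    # we only read the capability database and never write output.
    fd = os.open(os.devnull, os.O_WRONLY)
    try:
        curses.setupterm(term=term, fd=fd)
    except curses.error:
        return None
    finally:
        os.close(fd)
    return {
        "cub1": curses.tigetstr("cub1"),    # bytes, or None if absent
        "bw": curses.tigetflag("bw") == 1,  # True if cub1 wraps at col 0
        "os": curses.tigetflag("os") == 1,  # True if device overstrikes
    }
```

A caller would have to treat a `None` result, or a missing `cub1`
string, as "unknown" and fall back to some default assumption.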

Another characteristic of TTY devices, which may be important, is the
full/half duplex capability.  For a full duplex device, the input and
output streams are handled independently; in other words, the keyboard
input is separated from the displayed data, and relayed to the host
computer, *without* affecting the display, and the display is updated
exclusively by the data stream returned from the host.  Thus, there
may be a distinction between a user pressing the backspace key, and
a `\b' in the output stream -- it depends on what the host does when
it sees `\b' in its keyboard input stream: if it echoes it, the TTY
will receive it in its output stream, indistinguishable from any other
`\b' originating from program output; if it swallows it, then the
display will never see it.  For a half duplex device, which is probably
less common, it is normal to configure local echo on the TTY, so
pressing the backspace key will cause `\b' to be echoed directly to
the display; its effect is likely to be just the same as a `\b'
character received from the host.  (It is really the local echo
feature which is important here, rather than the duplex mode of the
connection -- this is what controls whether the display is updated
exclusively by the host's output data stream, or by a mixture of data
from the host and directly interposed keyboard input.)
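
A toy model of the echo paths, again with invented names and ignoring
any other output the host may produce, shows why the key press and the
`\b' in the data stream can end up indistinguishable:

```python
def display_input(keys, local_echo, host_echo):
    """Return the characters which reach the display for typed keys.

    local_echo: the terminal copies each key straight to its display.
    host_echo:  the host sends each received key back in its output.
    Both flags are simplifications made for this sketch.
    """
    shown = []
    for key in keys:
        if local_echo:
            shown.append(key)  # interposed directly by the terminal
        if host_echo:
            shown.append(key)  # arrives like any other host output
    return "".join(shown)
```

With host echo only, or local echo only, the display receives exactly
one `\b' per key press; with neither it receives nothing; and with
both enabled every key, including backspace, arrives twice.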

HTH.

Keith.



