[screen-devel] [bug #51890] screen randomly injects \b into UTF8 streams
From: Mike Frysinger
Subject: [screen-devel] [bug #51890] screen randomly injects \b into UTF8 streams when processing combining characters
Date: Tue, 29 Aug 2017 17:50:39 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; CrOS x86_64 9869.0.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3193.0 Safari/537.36
URL:
<http://savannah.gnu.org/bugs/?51890>
Summary: screen randomly injects \b into UTF8 streams when
processing combining characters
Project: GNU Screen
Submitted by: vapier
Submitted on: Tue 29 Aug 2017 09:50:37 PM UTC
Category: None
Severity: 3 - Normal
Priority: 5 - Normal
Status: None
Privacy: Public
Assigned to: None
Open/Closed: Open
Discussion Lock: Any
Release: 4.5.0
Fixed Release: None
Planned Release: None
Work Required: None
_______________________________________________________
Details:
simple example:
printf 'xA\U0000030Ax\n'
that will write out the UTF-8 byte stream (hexdump view):
78 41 cc 8a 78 0a |xA..x.|
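(a minimal sketch for reproducing that byte dump without relying on `\U` support in printf, which varies between shells. it writes the two UTF-8 bytes of U+030A as octal escapes. the helper name `expected_bytes` is made up for illustration, and the exact spacing of the `od` output differs between implementations.)

```shell
# Emit the test string with U+030A (COMBINING RING ABOVE) written as its
# UTF-8 bytes 0xCC 0x8A, using octal escapes that POSIX printf understands.
expected_bytes() {
    printf 'xA\314\212x\n'
}

# Dump the stream one hex byte at a time; -An suppresses the offset column.
expected_bytes | od -An -tx1
```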
when i'm not using screen, the terminal emulator sees that exactly. however,
screen will read that and then mangle it, passing along:
xA\bA\U0000030Ax\n
78 41 08 41 cc 8a 78 0a |xA.A..x.|
a proper terminal emulator is able to deal with this. but the question still
stands: why is it doing this ? i couldn't locate the logic in the screen
source though.
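(to make the difference between the two dumps explicit, here is a rough sketch that compares them byte by byte. it only replays the byte sequences quoted above and does not capture anything from a live screen session. the names `hex`, `sent`, `seen`, and `stripped` are illustrative.)

```shell
# Render stdin as space-separated hex bytes on a single line.
hex() {
    od -An -v -tx1 | tr -s ' \n' ' '
}

sent=$(printf 'xA\314\212x\n' | hex)        # bytes written by the program
seen=$(printf 'xA\010A\314\212x\n' | hex)   # bytes reported as leaving screen

# Remove the first "08 41" pair: a backspace followed by the base character.
stripped=$(printf '%s' "$seen" | sed 's/ 08 41 / /')

if [ "$stripped" = "$sent" ]; then
    echo "streams differ only by an inserted 08 41"
fi
```

in other words, the only change in this example is a backspace (0x08) plus a second copy of the base character 'A' (0x41) placed in front of the combining mark.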
poking it through strace shows the screen process doing the read() on the pty
master (/dev/ptmx) and getting the correct UTF-8 stream, then doing a write on
its slave pty with the mangled stream. so it doesn't seem like it's an
external-to-screen mangling.
my locale is set to en_US.UTF8, screen was launched with -U, and .screenrc
has:
defutf8 on
defencoding utf8
using screen 4.05.00
since the whole pipeline is UTF-8 aware, i can't explain why screen would need
to inject these bytes. i might understand if it was dealing with some
semi-broken systems where it tried to get slightly better output, but that
doesn't apply here.
the odd thing is that when screen dumps lines from its history (e.g. when you
attach or otherwise scroll back), it doesn't inject the \b. it only does so
for new content.
noticed originally with a bit more pathological line:
1.001.01a अ॒ग्निमी॑ळे पु॒रोहि॑तं
य॒ज्ञस्य॑ दे॒वमृ॒त्विज॑म् ।
that inserts a number of \b (all around combining chars? i didn't look too
closely).
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?51890>