[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#35350: Some compile output still leaks through with --verbosity=1
From: |
Ludovic Courtès |
Subject: |
bug#35350: Some compile output still leaks through with --verbosity=1 |
Date: |
Tue, 23 Apr 2019 12:12:34 +0200 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/26.2 (gnu/linux) |
Hi Mark,
Mark H Weaver <address@hidden> skribis:
> Ludovic Courtès <address@hidden> writes:
>
>> Mark H Weaver <address@hidden> skribis:
>>
>>> Sometimes when compiling a package with --verbosity=1, some parts of the
>>> compile output leak through. For example, see the transcript below.
>>
>> Weird.
>
> FWIW, a few observations, possibly relevant:
>
> (1) Each chunk of leaked output begins with 1 or 2 Unicode Replacement
> characters (U+FFFD). In the transcript I provided, the first leak
> began with 1 replacement char, and all later leaks began with 2.
>
> (2) The replacement characters are immediately followed by
> "@ build-log 30033 4096\n", and that string is also sprinkled
> throughout the leaked output, with approximately ~4060-4070
> characters of leaked output between each occurrence of
> "@ build-log 30033 4096\n".
Indeed. I managed to reproduce it while building modem-manager. I
strace’d ‘guix build’ with an additional ‘pk’¹ to see what happens, and
here’s what leads to the wrong “write(2, "�@ build-log…")” call:
--8<---------------cut here---------------start------------->8---
read(13, "gmlo\0\0\0\0", 8) = 8
read(13, "\27\20\0\0\0\0\0\0", 8) = 8
read(13, "@ build-log 22090 4096\n […] warning: \342\200", 4119) = 4119
read(13, "\0", 1) = 1
write(1, "\n", 1) = 1
write(1, ";;; (write 1008 <> #f 0)\n", 25) = 25
write(1, "\n", 1) = 1
write(1, ";;; (write 985 <> 22090 4096)\n", 30) = 30
write(1, "\n", 1) = 1
write(1, ";;; (write 1008 <> 22090 3111)\n", 31) = 31
write(1, "\n", 1) = 1
write(1, ";;; (write 1008 <> 22090 2103)\n", 31) = 31
write(1, "\n", 1) = 1
write(1, ";;; (write 1008 <> 22090 1095)\n", 31) = 31
write(1, "\n", 1) = 1
write(1, ";;; (write 88 <> 22090 87)\n", 27) = 27
write(2, "\r\33[K\\ 'build' phase", 19) = 19
[…]
write(2, "\r\33[K\\ 'build' phase", 19) = 19
write(1, "\n", 1) = 1
write(1, ";;; (write 1 <> #f 0)\n", 22) = 22
read(13, "gmlo\0\0\0\0", 8) = 8
read(13, "\27\20\0\0\0\0\0\0", 8) = 8
read(13, "@ build-log 22090
4096\n\230g_simple_async_result_take_error\342\200\231 is deprecated[…]", 4119)
= 4119
read(13, "\0", 1) = 1
write(1, "\n", 1) = 1
write(1, ";;; (write 1008 <> #f 0)\n", 25) = 25
write(2, "\357\277\275@ build-log 22090 4096\n", 26) = 26
--8<---------------cut here---------------end--------------->8---
The third read(2) call here ends on a partial UTF-8 sequence for LEFT
SINGLE QUOTATION MARK (we get the first two bytes of a three byte
sequence.)
What happens is that ‘process-stderr’ in (guix store) gets that byte
string from the daemon, passes it through ‘read-maybe-utf8-string’,
which replaces the last two bytes with REPLACEMENT CHARACTER, which is
itself a 3-byte sequence.
Thus, we have this extra byte that’s being inserted. That confuses the
whole machinery since the build log was announced as being 4096-byte
long, and it’s now 4097-byte long.
Internally, ‘build-event-output-port’ keeps the last byte of the
REPLACEMENT CHARACTER sequence in the ‘%fragments’ buffer.
Consequently, the “@ build-log” string that comes next doesn’t start on
a newline, and thus it is considered build output. Since the first byte
does not constitute a valid UTF-8 sequence, another REPLACEMENT
CHARACTER is inserted there when it gets printed.
So ‘build-event-output-port’ is working as expected. The problem is the
first layer of UTF-8 decoding that happens in ‘process-stderr’, in the
‘%stderr-next’ case. We would need to disable it, but only if the build
output port is ‘build-event-output-port’ (i.e., it’s capable of
interpreting “multiplexed build output” correctly.)
Thanks,
Ludo’.
¹ pk:
diff --git a/guix/status.scm b/guix/status.scm
index cbea4151f2..4dcbcb0c1f 100644
--- a/guix/status.scm
+++ b/guix/status.scm
@@ -717,6 +717,7 @@ The second return value is a thunk to retrieve the current
state."
(pointer->bytevector ptr count)))
(define (write! bv offset count)
+ (pk 'write count '<> %build-output-pid %build-output-left)
(if %build-output-pid
(let ((keep (min count %build-output-left)))
(set! %build-output