qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] chardev/char-pty: Avoid losing bytes when the other side jus


From: Thomas Huth
Subject: Re: [PATCH] chardev/char-pty: Avoid losing bytes when the other side just (re-)connected
Date: Mon, 28 Aug 2023 14:23:45 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

On 17/08/2023 19.09, Thomas Huth wrote:
On 17/08/2023 15.47, Marc-André Lureau wrote:
Hi

On Thu, Aug 17, 2023 at 5:06 PM Daniel P. Berrangé <berrange@redhat.com> wrote:

On Thu, Aug 17, 2023 at 02:00:26PM +0200, Thomas Huth wrote:
On 17/08/2023 12.32, Daniel P. Berrangé wrote:
On Wed, Aug 16, 2023 at 11:07:43PM +0200, Thomas Huth wrote:
When starting a guest via libvirt with "virsh start --console ...",
the first second of the console output is missing. This is especially
annoying on s390x that only has a text console by default and no graphical
output - if the bios fails to boot here, the information about what went
wrong is completely lost.

One part of the problem (there is also some things to be done on the
libvirt side) is that QEMU only checks with a 1 second timer whether
the other side of the pty is already connected, so the first second of
the console output is always lost.

This likely used to work better in the past, since the code once checked
for a re-connection during write, but this has been removed in commit
f8278c7d74 ("char-pty: remove the check for connection on write") to avoid
some locking.

To ease the situation here at least a little bit, let's check with g_poll()
whether we could send out the data anyway, even if the connection has not
been marked as "connected" yet. The file descriptor is marked as non-blocking anyway since commit fac6688a18 ("Do not hang on full PTY"), so this should
not cause any trouble if the other side is not ready for receiving yet.

With this patch applied, I can now successfully see the bios output of
a s390x guest when running it with "virsh start --console" (with a patched
version of virsh that fixes the remaining issues there, too).

Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
   chardev/char-pty.c | 22 +++++++++++++++++++---
   1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/chardev/char-pty.c b/chardev/char-pty.c
index 4e5deac18a..fad12dfef3 100644
--- a/chardev/char-pty.c
+++ b/chardev/char-pty.c
@@ -106,11 +106,27 @@ static void pty_chr_update_read_handler(Chardev *chr)    static int char_pty_chr_write(Chardev *chr, const uint8_t *buf, int len)
   {
       PtyChardev *s = PTY_CHARDEV(chr);
+    GPollFD pfd;
+    int rc;
-    if (!s->connected) {
-        return len;
+    if (s->connected) {
+        return io_channel_send(s->ioc, buf, len);
       }
-    return io_channel_send(s->ioc, buf, len);
+
+    /*
+     * The other side might already be re-connected, but the timer might
+     * not have fired yet. So let's check here whether we can write again:
+     */
+    pfd.fd = QIO_CHANNEL_FILE(s->ioc)->fd;
+    pfd.events = G_IO_OUT;
+    pfd.revents = 0;
+    rc = RETRY_ON_EINTR(g_poll(&pfd, 1, 0));
+    g_assert(rc >= 0);
+    if (!(pfd.revents & G_IO_HUP) && (pfd.revents & G_IO_OUT)) {

Should (can?) we call

     pty_chr_state(chr, 1);

here ?

As far as I understood commit f8278c7d74c6 and f7ea2038bea04628, this is not
possible anymore since the lock has been removed.

+        io_channel_send(s->ioc, buf, len);

As it feels a little dirty to be sending data before setting the
'connected == 1' and thus issuing the 'CHR_EVENT_OPENED' event

I didn't find a really better solution so far. We could maybe introduce a
buffer in the char-pty code and store the last second of guest output, but
IMHO that's way more complex and thus somewhat ugly, too?

The orignal commit f8278c7d74c6 said

[quote]
     char-pty: remove the check for connection on write

     This doesn't help much compared to the 1 second poll PTY
     timer. I can't think of a use case where this would help.
[/quote]

We've now identified a use case where it is actually important.

IOW, there's a justification to revert both f7ea2038bea04628 and
f8278c7d74c6, re-adding the locking and write update logic.

Indeed. But isn't it possible to watch for IO_OUT and get rid of the timer?

It might be possible - Marc Hartmayer just sent me a draft patch today that uses qio_channel_add_watch() and gets rid of the timer ... I'll do some experiments with that and send it out if it works reliably.

I did quite a bunch of experiments with the code, but as far as I can see, it's not so straight forward: IO_OUT keeps being enabled once IO_HUP occurs since the kernel can still buffer some output before it de-asserts IO_OUT. So there is no easy way to check for a re-connection without some kind of regular polling, as far as I can see...

Well, maybe something like this might be possible: Add a watch for IO_HUP. Once HUP occurs, don't signal CHR_EVENT_CLOSED yet, but set a flag variable and in case the guest still writes something to the chardev and the flag is true, we write the bytes to the pty one by one (to avoid locking in case the buffer gets full) and check for IO_OUT via g_poll() after each write. Once we see IO_OUT being gone, we can signal CHR_EVENT_CLOSED and add a watch for IO_OUT being enabled again. Once that watch triggers, we know that the other side re-connected and we can signal CHR_EVENT_OPENED and continue normally. ... sounds all quite complicated, is it worth the effort? What do you think?

 Thomas




reply via email to

[Prev in Thread] Current Thread [Next in Thread]