Re: [Wget-dev] wget2 | Progress reports >100% file size during recursive

From: Darshit Shah
Subject: Re: [Wget-dev] wget2 | Progress reports >100% file size during recursive downloads with HTTP/2 (#339)
Date: Thu, 16 Aug 2018 11:06:58 +0000

Hi Josef. I'm not completely convinced by the proposal.

Some time back I had a little time on my hands and I looked into this issue 
once again. Unfortunately, I haven't had the time to actually finish the PoC 
solution that I came up with. However, I'll share my full analysis here.

The problem is not that the progress bar computes the progress status based on 
the filesize of the first file. The problem is that a single progress bar slot 
is being used to show the status of multiple files. The core issue here is the 
original assumption I made that each downloader thread is downloading only one 
file at a time. However, this is not true in the case of HTTP/2 downloads where
`queue_size() > nthreads`. In such a case, we start using the HTTP/2 stream
multiplexing capabilities, which means that a single downloader thread is now
downloading multiple files at the same time.

You can see this by simply changing the `--http2-window-size` option to a 
smaller number. Or, if you place debug or assert statements in 
`wget_bar_slot_begin()`, you will see that this function is called multiple 
times for the same slot while the previous file is still being downloaded.
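
To make the failure mode concrete, here is a minimal sketch of one slot being
shared by two multiplexed streams. It is not the libwget code: `demo_slot`,
`demo_slot_begin()`, `demo_slot_add()` and `demo_slot_percent()` are
hypothetical stand-ins, and the assumption that "begin" overwrites the
expected size and resets the byte counter is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified slot state; NOT the real libwget bar structure. */
typedef struct {
    size_t file_size;   /* expected size of the file the slot claims to show */
    size_t bytes_down;  /* bytes reported to this slot so far */
} demo_slot;

/* Stand-in for the "begin" call: assumed to overwrite the expected size
 * and reset the counter, as if the slot only ever tracked one file. */
static void demo_slot_begin(demo_slot *slot, size_t file_size)
{
    assert(slot != NULL);
    slot->file_size = file_size;
    slot->bytes_down = 0;
}

/* Stand-in for the "data received" call. The slot cannot tell which
 * stream the bytes belong to, so everything is added to one counter. */
static void demo_slot_add(demo_slot *slot, size_t nbytes)
{
    assert(slot != NULL);
    slot->bytes_down += nbytes;
}

/* Percentage as a naive bar would compute it. */
static int demo_slot_percent(const demo_slot *slot)
{
    assert(slot != NULL);
    if (slot->file_size == 0)
        return 0;
    return (int) (slot->bytes_down * 100 / slot->file_size);
}
```

If stream A (1000 bytes) begins on the slot, and stream B (100 bytes) then
begins on the same slot before A has finished, the next 500 bytes that arrive
for A are measured against B's size, so the slot reports 500%.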

As for the solution, we have two options:

1. We continue using a single progress slot per thread and try to multiplex it 
to show the status of multiple files. This is similar to the solution 
@jmoellers suggested. However, in this case, the progress bar will flicker a 
lot as multiple files will have to be displayed on it. This especially does not 
scale with the default setting of 30 multiplexed streams per connection.

2. We start drawing one progress bar slot for each file being downloaded. This
means that if a single thread is downloading multiple files, each of them is
represented on a separate progress slot. This solution doesn't scale very well
with high numbers of multiplexed streams either, but I still think it provides
a better user experience.

There is another problem with solution 2: we need to somehow maintain a "pool"
of free slots and assign one to a downloading context. We can no longer simply 
state the slot number from the Wget2 client. We must also somehow keep the last 
slot out of the "free pool" for the status line.
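
One way such a pool could look is sketched below. This is only an
illustration under stated assumptions, not a proposed patch: `demo_slot_pool`,
`demo_pool_init()`, `demo_pool_acquire()` and `demo_pool_release()` are
hypothetical names, the fixed upper bound `DEMO_MAX_SLOTS` is arbitrary, and
locking is left out even though real downloader threads would need it.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_SLOTS 64   /* arbitrary upper bound for this sketch */

/* Hypothetical free-slot pool; not part of libwget. */
typedef struct {
    bool in_use[DEMO_MAX_SLOTS];
    int nslots;   /* total slots, including the reserved status line */
} demo_slot_pool;

/* Mark every slot free except the last one, which stays reserved
 * for the status line and is never handed out. */
static void demo_pool_init(demo_slot_pool *pool, int nslots)
{
    assert(nslots >= 2 && nslots <= DEMO_MAX_SLOTS);
    pool->nslots = nslots;
    for (int i = 0; i < nslots; i++)
        pool->in_use[i] = false;
    pool->in_use[nslots - 1] = true;
}

/* Hand out the lowest free slot, or -1 if every download slot is taken.
 * The loop stops before the last slot, so the status line is skipped. */
static int demo_pool_acquire(demo_slot_pool *pool)
{
    for (int i = 0; i < pool->nslots - 1; i++) {
        if (!pool->in_use[i]) {
            pool->in_use[i] = true;
            return i;
        }
    }
    return -1;
}

/* Return a slot to the pool once its file has finished downloading. */
static void demo_pool_release(demo_slot_pool *pool, int slot)
{
    assert(slot >= 0 && slot < pool->nslots - 1);
    assert(pool->in_use[slot]);
    pool->in_use[slot] = false;
}
```

With this shape, a download context would acquire a slot when its stream
starts and release it when the stream ends, instead of the client passing a
fixed slot number. What to do when `demo_pool_acquire()` returns -1 (more
streams than visible slots) is left open here.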

I haven't written much of the code for this, but I did spend some time thinking 
about the potential solutions.
