Re: [Bug-wget] Miscellaneous thoughts & concerns
From: Tim Rühsen
Subject: Re: [Bug-wget] Miscellaneous thoughts & concerns
Date: Sat, 7 Apr 2018 19:09:11 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0
WSL fix for TLS:
Search libwget/ssl_gnutls.c for EINPROGRESS and extend the code to also
check errno for 22 (EINVAL) and 32 (EPIPE).
There are just two places in _ssl_writev().
After these changes TLS works for me including --tls-resume.
But you still have to use --no-tcp-fastopen.
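
A rough sketch of that kind of check, in case it helps. The helper name
errno_means_in_progress() is made up for illustration and does not exist in
libwget, and I am assuming the existing code treats EINPROGRESS as "not ready
yet" rather than as a hard failure. The real change is simply to widen the
existing errno comparison at both places.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not part of libwget.
 *
 * Returns true when an errno value should be handled the same way as
 * EINPROGRESS. On Linux, errno 22 is EINVAL and errno 32 is EPIPE;
 * these are the two extra values that need to be accepted on WSL.
 */
static bool errno_means_in_progress(int err)
{
	return err == EINPROGRESS || err == EINVAL || err == EPIPE;
}
```

With something like this, a comparison written as errno == EINPROGRESS would
become errno_means_in_progress(errno). Using the symbolic names instead of the
raw numbers keeps the check readable.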
Regards, Tim
On 07.04.2018 04:31, Jeffrey Fetterman wrote:
> > The number of parallel downloads ? --max-threads=n
>
> Okay, well, when I was running it earlier, I was noticing an entire
> directory of pdfs slowly getting larger every time I refreshed the
> directory, and there were something like 30 in there. It wasn't just
> five. I was very confused and I'm not sure what's going on there, and
> I really would like it to not do that.
>
>
> > Likely the WSL issue is also affecting the TLS layer. TLS resume is
> > considered 'insecure', thus we have it disabled by default. There
> > still is TLS False Start enabled by default.
>
> Are you implying TLS False Start will perform the same function as TLS
> Resume?
>
>
> > You likely want to use --progress=bar. --force-progress is to enable
> > the progress bar even when redirecting (e.g. to a log file). @Darshit,
> > we should adjust the behavior to be the same as in Wget1.x.
>
> That does work but it's very buggy. Only one shows at a time and it
> doesn't even always show the file that is downloading. Like it'll seem
> to be downloading a txt file when it's really downloading several
> larger files in the background.
>
>
> > Did you build with http/2 and compression support ?
>
> Yes, why?
>
>
> P.S. I'm willing to help out with your documentation if you push some
> stuff that makes my life on WSL a little less painful, haha. I'd run
> this in a VM in an instant but I feel like that would be a bottleneck
> on what's supposed to be a high performance program. Speaking of high
> performance, just how much am I missing out on by not being able to
> take advantage of tcp fast open?
>
>
> On Fri, Apr 6, 2018 at 5:01 PM, Tim Rühsen <address@hidden> wrote:
>
> Hi Jeffrey,
>
>
> thanks for your feedback !
>
>
> On 06.04.2018 23:30, Jeffrey Fetterman wrote:
> > Thanks to the fix that Tim posted on gitlab, I've got wget2 running just
> > fine in WSL. Unfortunately it means I don't have TCP Fast Open, but given
> > how fast it's downloading a ton of files at once, it seems like it must've
> > been only a small gain.
> >
> >
> > I've come across a few annoyances however.
> >
> > 1. There doesn't seem to be any way to control the size of the download
> > queue, which I dislike because I want to download a lot of large files at
> > once and I wish it'd just focus on a few at a time, rather than over a
> > dozen.
> The number of parallel downloads ? --max-threads=n
>
> > 3. Doing a TLS resume will cause a 'Failed to write 305 bytes (32: Broken
> > pipe)' error to be thrown. It seems to be related to how certificate
> > verification is handled upon resume, but I was worried at first that the
> > WSL problems were rearing their ugly head again.
> Likely the WSL issue is also affecting the TLS layer. TLS resume is
> considered 'insecure', thus we have it disabled by default. There still
> is TLS False Start enabled by default.
>
>
> > 3. --no-check-certificate causes significantly more errors about how the
> > certificate issuer isn't trusted to be thrown (even though it's not
> > supposed to be doing anything related to certificates).
> Maybe a bit too verbose - these should be warnings, not errors.
>
> > 4. --force-progress doesn't seem to do anything despite being recognized as
> > a valid parameter, using it in conjunction with -nv is no longer beneficial.
> You likely want to use --progress=bar. --force-progress is to enable the
> progress bar even when redirecting (e.g. to a log file).
> @Darshit, we should adjust the behavior to be the same as in Wget1.x.
>
> > 5. The documentation is unclear as to how to disable things that are
> > enabled by default. Am I to assume that --robots=off is equivalent to -e
> > robots=off?
>
> -e robots=off should still work. We also allow --robots=off or
> --no-robots.
>
> > 6. The documentation doesn't document being able to use 'M' for chunk-size,
> > e.g. --chunk-size=2M
>
> The wget2 documentation has to be brushed up - one of the blockers for
> the first release.
>
> >
> > 7. The documentation's instructions regarding --progress are all wrong.
> I'll take a look in the next few days.
>
> >
> > 8. The http/https proxy options return as unknown options despite being in
> > the documentation.
> Yeah, the docs... see above. Also, proxy support is currently limited.
>
>
> > Lastly I'd like someone to look at the command I've come up with and offer
> > me critiques (and perhaps help me address some of the remarks above if
> > possible).
>
> No need for --continue.
> Think about using TLS Session Resumption.
> --domains is not needed in your example.
>
> Did you build with http/2 and compression support ?
>
> Regards, Tim
> > #!/bin/bash
> >
> > wget2 \
> > `#WSL compatibility` \
> > --restrict-file-names=windows --no-tcp-fastopen \
> > \
> > `#No certificate checking` \
> > --no-check-certificate \
> > \
> > `#Scrape the whole site` \
> > --continue --mirror --adjust-extension \
> > \
> > `#Local viewing` \
> > --convert-links --backup-converted \
> > \
> > `#Efficient resuming` \
> > --tls-resume --tls-session-file=.\tls.session \
> > \
> > `#Chunk-based downloading` \
> > --chunk-size=2M \
> > \
> > `#Swiper no swiping` \
> > --robots=off --random-wait \
> > \
> > `#Target` \
> > --domains=example.com example.com
> >
>
>
>