Re: [Bug-wget] action on "not able to connect to proxy"
From: Mohan gupta
Subject: Re: [Bug-wget] action on "not able to connect to proxy"
Date: Tue, 13 Oct 2009 12:39:34 +0530
Sorry for sending a confusing message.

What I meant to ask is what timeout retries actually mean. Say I give
wget the URL of a remote site and that URL is no longer online,
yielding something like a "page not found" error. In that case I
believe wget will retry up to "maxtries" times, as set in the config
file, and after that the URL will be blacklisted and wget will move on
to the next one in the queue. That's absolutely fine.

But consider a case where I have a URL queue (with lots of URLs). If
the proxy server is unreachable, will wget exit saying "unable to
connect to proxy server", or will it, as above, make "maxtries"
attempts and blacklist the URLs one after the other?

There is a difference between the two cases: in one the URL is
inactive, and in the other we are unable to connect to the internet in
the first place. So what is the action in the second case?
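
To make the distinction concrete, here is a rough sketch of what I
would like to happen, written as a wrapper outside of wget rather than
as anything wget does itself. It assumes the proxy speaks plain TCP on
a known host and port, and the names proxy_reachable, drain_queue and
fetch are made up for illustration (fetch stands in for whatever
actually runs wget on one URL). It only checks that a TCP connection
to the proxy can be opened before each URL is taken from the queue, so
it cannot catch a proxy that goes down in the middle of a download.

```python
import socket
from urllib.parse import urlsplit


def proxy_reachable(proxy_url, timeout=5.0):
    """Return True if a TCP connection to the proxy can be opened.

    Assumption: an open TCP port is a good enough sign that the proxy
    is up; this does not send any HTTP request through it.
    """
    parts = urlsplit(proxy_url)
    host = parts.hostname
    port = parts.port or 80  # fall back to port 80 if none is given
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False


def drain_queue(urls, proxy_url, fetch, check=proxy_reachable):
    """Fetch URLs in order, stopping early if the proxy is unreachable.

    fetch(url) is a hypothetical callable returning True on success.
    Returns (done, failed, pending); URLs left in pending were never
    attempted, so they are not counted as dead.
    """
    done, failed = [], []
    pending = list(urls)
    while pending:
        if not check(proxy_url):
            break  # proxy problem: leave the remaining URLs untouched
        url = pending.pop(0)
        (done if fetch(url) else failed).append(url)
    return done, failed, pending
```

With something like this, only URLs in "failed" would be candidates
for blacklisting, while everything in "pending" stays in the database
until the proxy is back.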
Mohan
On 10/13/09, Micah Cowan <address@hidden> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Mohan gupta wrote:
>> Hello everyone,
>> Greetings on my first mail to the list!
>>
>> I am using wget as a full web crawler. I am behind my university
>> proxy, and I wanted to know what action wget takes when the proxy
>> server is off. I have configured it to use a proxy server, and for
>> retrieving URLs it has been configured to make at most 5 attempts.
>> In my system wget takes URLs from a database and retrieves them.
>> When, say, the proxy server is off, I expect wget to exit() and not
>> just keep picking URLs from the database and timing out on them.
>>
>> Is that what it really does?
>>
>> I do not want wget to erroneously destroy my database of URLs!
>
> How in the world can wget destroy anything? Wget doesn't even know about
> your database of URLs. If you're just throwing them out after a download
> attempt, without trying to actually verify the downloads, then how is
> that Wget's fault?
>
> And, I'm unsure how you expect wget to behave one way when proxy is "on"
> (timeout and try again), and another when it's "off". How is wget
> supposed to know the difference? If the proxy is down, then ideally the
> network should be configured to send back "No route to host" packets,
> etc. But if wget doesn't receive any packets at all in response to
> connection attempts, then how is it supposed to tell the difference
> between a temporary network failure and a switched-off machine? You can
> of course adjust wget's setting for how long it is willing to wait for a
> timeout (which is currently quite liberal), but there is obviously no
> way for it to understand that the machine is switched off, and that it
> shouldn't bother continuing to try.
>
> - --
> Micah J. Cowan
> Programmer, musician, typesetting enthusiast, gamer.
> Maintainer of GNU Wget and GNU Teseq
> http://micah.cowan.name/
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iEYEARECAAYFAkrTpwEACgkQ7M8hyUobTrFo6gCeNQCh5NTYnnPRMV/3KhN0DPIY
> IT4AniwnQPNeZOef86c+MM7u6SmI6kTY
> =/B25
> -----END PGP SIGNATURE-----
>