Re: [Bug-wget] Recursive download: Page requirements when spanning hosts
From: Micah Cowan
Subject: Re: [Bug-wget] Recursive download: Page requirements when spanning hosts?
Date: Fri, 23 Oct 2009 17:14:16 -0700
User-agent: Thunderbird 2.0.0.23 (X11/20090817)
address@hidden wrote:
> I want to download a few sites, and have some
> questions about the best way to do it...
>
> I'll be doing a recursive download to infinite depth,
> but limited to the current directory downwards
> (-np, no parent). I'll also download the page
> requirements (-p).
>
> wget -r -l inf -np -p http://domain.name/index.html
> (I'll also be adding a rate limit and a short
> pause between downloads.)
>
> My problem is that I want to have all page requirements,
> even if they span hosts or are located above or parallel
> to the site (directory) I'll be working in (it's on
> GeoCities, so there are many "parallel" sites). As long
> as something is part of a page in the directory I'm working in,
> it should be downloaded, but not otherwise. An additional
> problem is that there may be lists of links that actually point
> to those directories/hosts, but nothing should be downloaded
> unless it's part of a page.
>
> Would this be possible (at least partially, I understand
> if it's a problem getting around the no-parent)?
This is not currently possible, I'm afraid.
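
For anyone who ends up scripting this outside of Wget, the rule described above could be sketched roughly as follows. This is only an illustration of the decision logic, not anything Wget provides: the function names, the "link"/"requisite" labels, and the example.com URLs are all made up for the example, and a real crawler would still need to fetch and parse the pages itself.

```python
from urllib.parse import urlsplit
import posixpath


def in_scope(url, start_url):
    """True if url is on the start host and at or below the start directory.

    Hypothetical helper: mimics the intent of "-np" for ordinary links.
    """
    u, s = urlsplit(url), urlsplit(start_url)
    if u.hostname != s.hostname:
        return False
    base_dir = posixpath.dirname(s.path)
    if not base_dir.endswith("/"):
        base_dir += "/"
    return u.path.startswith(base_dir)


def should_fetch(url, kind, start_url):
    """Decide whether to download url.

    kind is "link" (an ordinary hyperlink) or "requisite" (an image,
    stylesheet, or similar that is needed to display a page).
    Assumption: only in-scope pages are ever parsed, so every requisite
    seen here already belongs to a page inside the working directory.
    """
    if kind == "requisite":
        return True  # page requisites may live on any host or directory
    return in_scope(url, start_url)  # links are followed only downwards
```

The key design point is that fetched requisites are never themselves scanned for links, so a list of links pointing at a parallel site does not pull anything in unless a page in the working directory actually embeds it.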
--
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
Maintainer of GNU Wget and GNU Teseq
http://micah.cowan.name/