From: BERBAR Florian
Subject: Re: Recursive downloading of pages through the "action" attributes of the following "form" tags
Date: Mon, 15 May 2023 01:19:47 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.10.0
I can reproduce this issue with the latest version (1.21.4) using the
following pages:
form.html:<html>
form.html: <body>
form.html: <form action="./post.html" >
form.html: <input name="ff" type='text' />
form.html: <input name='tt' type='submit' />
form.html: </form>
form.html: </body>
form.html:</html>
post.html:<html>
post.html: <body>
post.html: <a href="./link.html">link<a/>
post.html: </body>
post.html:</html>
link.html:<html>
link.html: <body>
link.html: <a href="./form.html">form<a/>
link.html: </body>
link.html:</html>
A basic recursive command downloads only the form.html page, whereas I
expected it to download all 3 pages.
wget-1.21.4$ ./src/wget -r http://127.0.0.1/form.html
--2023-05-15 01:08:55-- http://127.0.0.1/form.html
Connecting to 127.0.0.1:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 145 [text/html]
Saving to: '127.0.0.1/form.html'
127.0.0.1/form.html 100%[===================>] 145 --.-KB/s in 0s
2023-05-15 01:08:55 (18.6 MB/s) - '127.0.0.1/form.html' saved [145/145]
FINISHED --2023-05-15 01:08:55--
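As a possible workaround while this is open, the form targets could be extracted separately and handed to wget as extra start URLs. The following is only a minimal sketch, not something wget does itself: the names FormActionParser and form_actions are hypothetical, it uses Python's standard html.parser, and it assumes the form target can be fetched with a plain GET request.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormActionParser(HTMLParser):
    """Collect the "action" attribute of every <form> tag (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.actions = []

    def handle_starttag(self, tag, attrs):
        if tag != "form":
            return
        for name, value in attrs:
            if name == "action" and value:
                # Resolve relative targets such as "./post.html"
                # against the URL of the page that contains the form.
                self.actions.append(urljoin(self.base_url, value))


def form_actions(html, base_url):
    """Return the absolute URLs referenced by form "action" attributes."""
    parser = FormActionParser(base_url)
    parser.feed(html)
    return parser.actions


# The form.html test page from this report.
FORM_HTML = """<html>
 <body>
  <form action="./post.html" >
   <input name="ff" type='text' />
   <input name='tt' type='submit' />
  </form>
 </body>
</html>"""

if __name__ == "__main__":
    for url in form_actions(FORM_HTML, "http://127.0.0.1/form.html"):
        print(url)
```

The printed URLs could then be saved to a file and passed to wget with -i alongside -r. Note that wget would request them with GET, so this does not help for forms that only answer POST.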
Regards,
Florian
On 4/17/23 21:22, BERBAR Florian wrote:
Hi folks,
I have a question about recursive downloading of web pages. When I try to
download all pages from a website using the recursive option (--recursive)
with wget 1.21, the page processing does not seem to follow the
"action" attributes of "form" tags.
- Is this the expected behavior?
- Is there a combination of options to download all pages of a website
that are referenced through the "action" attribute?
Example with 3 HTML pages:
- Page 1 - form.html : HTML form with an "action" attribute pointing to
"Page 2".
- Page 2 - post.html : HTML page with a link to "Page 3".
- Page 3 - link.html : HTML page without a link.
I tried this command to download all three pages, but only "Page 1" was
downloaded:
$ wget -r https://host/form.html
I also tried the "--follow-tags=form" option, but observed the same behavior.
Regards,
Florian