[Wget-dev] wget2 | Robot fix (!454)


From: Archit Pandey
Subject: [Wget-dev] wget2 | Robot fix (!454)
Date: Tue, 22 Oct 2019 05:46:22 +0000


Archit Pandey created a merge request: 
https://gitlab.com/gnuwget/wget2/merge_requests/454

Project:Branches: archit-p/wget2:robot-fix to gnuwget/wget2:master
Author:    Archit Pandey



Hello maintainers!

This merge request addresses https://gitlab.com/gnuwget/wget2/issues/456

Description of files changed:
1. `src/wget.c (add_url_to_queue, add_url)`: robots.txt is now downloaded 
whenever the `config.recursive` option is set, without checking the 
`config.robots` option.
2. `src/wget.c (add_url)`: the `config.robots` option is checked when updating 
the list of URLs not to follow.
3. `tests/test-robots-off.c`: most of this file is borrowed from 
`tests/test-robots.c`. It checks that robots.txt is still downloaded with 
robots=off and that the disallowed URLs are fetched anyway, i.e. the Disallow 
rules are ignored.
4. `tests/test-iri-percent.c`: changing the robots=off behavior broke the 
`test-iri-percent` test case, since it did not expect robots.txt to be 
downloaded. Adding robots.txt to the expected files makes it pass again.
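
To illustrate the behavior described in items 1 and 2, here is a minimal 
standalone sketch. It is not the code from `src/wget.c`: the struct 
`sketch_config` and both function names are hypothetical, and the plain prefix 
match is a simplification of real robots.txt rule matching.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two options named in the description. */
struct sketch_config {
    bool recursive; /* models config.recursive */
    bool robots;    /* models config.robots */
};

/* Item 1: robots.txt is requested for every recursive download,
 * whether or not the robots option is enabled. */
static bool should_fetch_robots_txt(const struct sketch_config *cfg)
{
    assert(cfg != NULL);
    return cfg->recursive;
}

/* Item 2: Disallow rules only block a path when the robots option is on.
 * Assumption: each rule is matched as a plain path prefix. */
static bool is_path_blocked(const struct sketch_config *cfg, const char *path,
                            const char *const *disallowed, size_t n_rules)
{
    assert(cfg != NULL && path != NULL);

    if (!cfg->robots)
        return false; /* robots=off: rules are ignored */

    for (size_t i = 0; i < n_rules; i++) {
        size_t len = strlen(disallowed[i]);
        if (len > 0 && strncmp(path, disallowed[i], len) == 0)
            return true;
    }
    return false;
}
```

With `recursive` set and `robots` cleared, `should_fetch_robots_txt()` still 
returns true while `is_path_blocked()` returns false for every path, which is 
the combination `tests/test-robots-off.c` is meant to cover.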

I did a clean install after making these changes and ran `make check`: 66 of 
69 test cases pass and 3 are skipped.

This was a very quick fix, so there may be better ways to achieve the same 
result. Please point out any gaps in this approach or suggest how to improve 
it.

Thanks!
### Approver's checklist:

* [ ] The author has submitted the FSF Copyright Assignment and is listed in 
AUTHORS
* [ ] There is a test suite reasonably covering new functionality or 
modifications
* [ ] Function naming, parameters, return values, types, etc., are consistent 
with existing code
* [ ] This feature/change has adequate documentation added (if appropriate)
* [ ] No obvious mistakes / misspelling in the code
