[Bug-wget] wget not stop when using -e robots=off option
From: Sethi Badhan
Subject: [Bug-wget] wget not stop when using -e robots=off option
Date: Sun, 27 Nov 2016 17:40:09 -0800
Hello,

When I run wget in a for loop without -e robots=off, it works fine. When I
add -e robots=off, it does not stop: it keeps downloading pages recursively.
I have set a limit in the for loop, but the downloading does not stop after
that limit. Here is my code:
#!/bin/bash
# Collect the https://en. links from the article, skipping image, PDF and PHP URLs.
lynx --dump https://en.wikipedia.org/wiki/Cloud_computing | awk '/http/{print $2}' | grep https://en. | grep -v '.svg\|.png\|.jpg\|.pdf\|.JPG\|.php' > Pages.txt
# Drop one unwanted URL.
grep -vwE "(http://www.enterprisecioforum.com/en/blogs/gabriellowy/value-data-platform-service-dpaas)" Pages.txt > newpage.txt
rm Pages.txt
# Filter out lines containing '#'.
egrep -v "#|$^" newpage.txt > try.txt
# Remove duplicate lines, keeping the first occurrence.
awk '!a[$0]++' try.txt > new.txt
rm newpage.txt
rm try.txt
mkdir -p htmlpagesnew
cd htmlpagesnew
# Call wget for at most 10 of the listed URLs.
j=0
for i in $( cat ../new.txt );
do
    if [ $j -lt 10 ];
    then
        let j=j+1;
        echo $j
        wget -N -nd -r $i -e robots=off --wait=.25 ;
    fi
done
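
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the -r option makes every wget call recursive (wget's default recursion depth is 5), so the counter in the loop limits only the number of wget invocations, not the number of pages each invocation fetches. With robots.txt honored, the site may be refusing the recursion, which would explain why the problem shows up only with -e robots=off. The variant below assumes the goal is to fetch just the first 10 listed pages, so it drops -r. The function name fetch_first_n and the FETCH_CMD variable are hypothetical, introduced only so the loop can be dry-run with echo.

```shell
#!/bin/bash
# Hypothetical helper: fetch only the first N URLs from a list file,
# using one non-recursive wget call per URL.
# FETCH_CMD is an assumed override hook (defaults to wget) for dry runs.
fetch_first_n() {
    local list_file=$1
    local limit=$2
    local url
    # head emits at most $limit lines, so no separate counter is needed.
    head -n "$limit" "$list_file" | while IFS= read -r url || [ -n "$url" ]; do
        # Skip blank lines.
        [ -n "$url" ] || continue
        # No -r here: each call downloads only the named page.
        "${FETCH_CMD:-wget}" -N -nd -e robots=off --wait=0.25 "$url"
    done
}

# Example: fetch_first_n ../new.txt 10
```

If one level of linked pages is actually wanted, keeping -r and adding -l 1 would cap the recursion depth instead of removing it.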
I hope you will reply soon.
Thanks
Regards
Sethi