From: Jason Todd Slack-Moehrle
Subject: [Bug-wget] Wget Starting Questions
Date: Sun, 19 Apr 2009 09:26:29 -0700
Hi All,

I have some starting Wget questions that I am hoping to gain some insight into.
I want to start at Dmoz.org, follow links related to entertainment (concerts, art gallery events, etc.), and examine each link to decide whether I should collect data about the link itself and from the page it points to.
My questions:

1. Can Wget start at a given URL and examine every link against my own criteria? (Obviously I can write Case, If/Else, or While logic to apply the criteria.)
2. If I find a link containing keywords that interest me, can I follow that link and get information from the page?
3. How do I get the information about the link of interest, and the content of interest on its page, into a MySQL database? (I know ColdFusion, MySQL, and PHP.) I think what I am asking is: how do I get back to my database from a crawler?
4. I bought "Webbots, Spiders, and Screen Scrapers" (in PHP) and so far it is interesting, but I am wondering what the best practices are.
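
The workflow in questions 1 to 3 could be sketched roughly as follows. This is only an illustration in Python, not something Wget itself provides: it assumes the page HTML has already been fetched (for example, saved to disk by Wget), and the names `KEYWORDS`, `LinkCollector`, `interesting_links`, and `store_links` are all hypothetical. SQLite from the standard library stands in for MySQL so the sketch is self-contained; a MySQL client library would follow the same connect, insert, commit pattern.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical keyword list; the real criteria would be your own.
KEYWORDS = ("concert", "gallery", "exhibit")


class LinkCollector(HTMLParser):
    """Collects (absolute_url, anchor_text) pairs from <a href=...> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # URL of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def interesting_links(html, base_url, keywords=KEYWORDS):
    """Return the links whose URL or anchor text mentions any keyword."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return [
        (url, text)
        for url, text in collector.links
        if any(k in (url + " " + text).lower() for k in keywords)
    ]


def store_links(conn, links):
    """Insert links into a 'links' table, skipping URLs already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS links (url TEXT PRIMARY KEY, anchor_text TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO links (url, anchor_text) VALUES (?, ?)", links
    )
    conn.commit()
```

In use, each saved page would be passed to `interesting_links` together with the URL it came from, and the result handed to `store_links` with an open database connection. The matching URLs then become the next pages to fetch.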
Am I making any sense?

-Jason