Freelance Project

All freelance projects at One Location



Url Harvesting Script 02.12.08


I need a script or desktop application (for Windows Vista) to harvest website addresses (URLs) for me. I will enter a keyword phrase like “cheap hosting” and up to three directory URLs (e.g. http://dir.yahoo.com/Business_and_Economy/Business_to_Business/Communications_and_Networking/Internet_and_World_Wide_Web/Network_Service_Providers/Hosting/Website_Hosting/) for each keyword phrase.

For each keyword phrase, you will then need to get website addresses from the following sources for me (using “cheap hosting” as an example) -

1) http://www.google.com/sponsoredlinks?q=cheap+hosting&btnG=Search+Sponsored+Links – Up to the first 100 results
2) Sponsored results that appear for http://search.yahoo.com/search?p=cheap+hosting – Up to the first 100 results
3) First 100 Google search results for “cheap hosting”
4) First 100 Yahoo results for “cheap hosting”
5) The first 100 results available from each directory (up to three directories per keyword phrase) specified.
6) The first 100 results from Google Maps
7) The first 100 results from http://www.google.com/search?q=allinurl%3Acheap+hosting – filtered to show only the URLs that actually have one or more of the keywords in the domain name itself. (So cheapcar.com and fasthosting.com are OK, but not computers.com/cheaphosting.htm)
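
The domain-name filter in item 7 could be sketched roughly as follows. This is only an illustration in Python, not a required implementation; the function name domain_has_keyword is hypothetical, and it assumes a simple substring match of each keyword against the host name, which means a keyword would also match inside a subdomain or the TLD.

```python
from urllib.parse import urlparse


def domain_has_keyword(url, phrase):
    """Return True if any word of `phrase` occurs in the host name of `url`.

    Hypothetical helper: only the host is inspected, never the path,
    so computers.com/cheaphosting.htm is rejected.
    """
    # urlparse only fills in the host when a scheme is present.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return any(word in host for word in phrase.lower().split())


if __name__ == "__main__":
    for candidate in ("cheapcar.com",
                      "http://www.fasthosting.com/plans",
                      "computers.com/cheaphosting.htm"):
        print(candidate, domain_has_keyword(candidate, "cheap hosting"))
```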

I will need ALL the results in the form of website.com (NOT website.com/djdjd/ururu.htm) in a CSV text file, separated into the categories listed above.
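
To make the output format concrete, here is a minimal sketch of reducing each harvested URL to website.com form and writing the results to CSV. The column layout (keyword, category, domain), the function names, and the removal of duplicate domains within a category are all assumptions made for illustration; none of them is specified above.

```python
import csv
import io
from urllib.parse import urlparse


def bare_domain(url):
    """Reduce a URL to its host name without a leading 'www.'."""
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = (parsed.hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def write_results(rows, out):
    """Write (keyword, category, url) rows to `out` as CSV.

    Assumed layout: one line per domain, with the source category in its
    own column. Repeated domains within a category are skipped.
    """
    writer = csv.writer(out)
    writer.writerow(["keyword", "category", "domain"])
    seen = set()
    for keyword, category, url in rows:
        key = (keyword, category, bare_domain(url))
        if key[2] and key not in seen:
            seen.add(key)
            writer.writerow(key)


if __name__ == "__main__":
    sample = [
        ("cheap hosting", "google_search", "http://www.website.com/djdjd/ururu.htm"),
        ("cheap hosting", "google_search", "http://website.com/other.htm"),
        ("cheap hosting", "yahoo_search", "https://website.com/"),
    ]
    buffer = io.StringIO()
    write_results(sample, buffer)
    print(buffer.getvalue())
```

In a real run the second argument would be a file opened with open(path, "w", newline=""), which is what the csv module expects.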

I will do about 10 keyword phrases at a time.

The script must have a setting to specify a random delay in seconds between searches to avoid being blocked by Google.
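
The delay setting might look something like the sketch below. The function name polite_pause and the minimum/maximum parameters are hypothetical; the only requirement stated above is a random delay in seconds between searches.

```python
import random
import time


def polite_pause(min_seconds, max_seconds, sleep=time.sleep):
    """Sleep for a random number of seconds within the configured range.

    `sleep` is injectable so the pause can be checked without waiting.
    Returns the delay that was chosen.
    """
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay


if __name__ == "__main__":
    # Example: wait between 1 and 3 seconds before the next search.
    waited = polite_pause(1, 3)
    print(f"waited {waited:.2f} seconds")
```

The pause would be called once between consecutive searches, with the range read from the script's settings.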
