I need to figure out the best way to do MASSIVE scraping of Google without getting blocked. I've looked into anonymous proxy services such as anonymizer.com, but the cost is prohibitive.
My next thought was to acquire 100+ IPs and set up my own proxy server that rotates through all of them when sending requests. I'm not sure how easy it will be to acquire 100+ IPs, since you need to provide a valid reason for needing so many these days...
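For what it's worth, this is roughly the rotation logic I have in mind, as a minimal Python sketch. It assumes I already have a pool of proxies to point at; the addresses below are placeholders and the requests library is just one option:

```python
import itertools
import requests

# Placeholder proxy endpoints - these would be the IPs/ports I control.
PROXIES = [
    "http://10.0.0.1:8080",
    "http://10.0.0.2:8080",
    "http://10.0.0.3:8080",
]

# Cycle through the pool so each request goes out via a different proxy.
proxy_pool = itertools.cycle(PROXIES)

def fetch(url):
    proxy = next(proxy_pool)
    return requests.get(
        url,
        proxies={"http": proxy, "https": proxy},
        timeout=10,
    )

if __name__ == "__main__":
    resp = fetch("https://www.google.com/search?q=example")
    print(resp.status_code)
```

The sticking point isn't the code, it's getting enough IPs to put in that list.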
Another thought was to literally set up 100 different hosting accounts at different providers, each with its own IP, and program my system to route all the requests through them. The cost wouldn't be too bad, but setting up that many accounts and wiring all the logins etc. into my system would be a real PITA.
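If those hosting accounts allowed SSH access, I'm guessing the routing part could be done with SSH dynamic port forwarding, so each account acts as a SOCKS proxy. A rough sketch under that assumption (usernames, hostnames, and local ports are placeholders, and it needs `pip install requests[socks]`):

```python
import random
import subprocess
import requests

# Placeholder accounts: (ssh user, host, local SOCKS port).
# Assumes each hosting account allows SSH and the local ports are free.
ACCOUNTS = [
    ("user1", "host1.example.com", 1080),
    ("user2", "host2.example.com", 1081),
    ("user3", "host3.example.com", 1082),
]

def open_tunnels():
    """Start one background SSH tunnel per account (ssh -N -D <port>)."""
    return [
        subprocess.Popen(["ssh", "-N", "-D", str(port), f"{user}@{host}"])
        for user, host, port in ACCOUNTS
    ]

def fetch(url):
    # Pick a random tunnel so requests exit from different providers' IPs.
    _, _, port = random.choice(ACCOUNTS)
    proxy = f"socks5://127.0.0.1:{port}"
    return requests.get(url, proxies={"http": proxy, "https": proxy}, timeout=10)
```

Even so, that's 100 accounts to sign up for and keep paid and configured, which is the part I'm dreading.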
Anyone have any ideas for me? There are a lot of tools out there that do large-scale scraping of Google - I'm just wondering how they do it. Thanks!