scraping site explorer

LogicFlux

Jul 7, 2008
What are the downsides/benefits to using the API, besides the obvious one of having to maintain a parser? Writing a parser isn't difficult; I've been using Perl with HTML::Parser, so even if it breaks, replacing it should be pretty simple. The API doesn't even provide some key functionality I want, so I'm considering not using it at all. This is in regards to scripts for personal use, not a product that would have to be updated whenever the parser breaks.
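For anyone wondering how small that kind of throwaway parser really is: here's a sketch in Python using the standard-library html.parser, which follows the same event-driven pattern as Perl's HTML::Parser (the original code above is Perl; this is just an illustration, and the sample page is made up):

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag seen while parsing."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

page = '<html><body><a href="http://example.com/1">one</a> <a href="/2">two</a></body></html>'
parser = LinkExtractor()
parser.feed(page)
print(parser.links)  # ['http://example.com/1', '/2']
```

If the markup changes, only the handler logic needs to be touched, which is why rewriting a scraper like this when it breaks is cheap.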
 


I'm not really sure what the question is, as it sounds like you already answered it for yourself.
 
Do they throttle access to the Site Explorer site? They do for their SERPs, and will spit back a status code of 900 or 999 or something if you make too many requests. I'm guessing the API won't do that.
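If scraping does hit the throttle, a simple backoff policy usually handles it. A minimal sketch, assuming the 900/999 codes guessed at above (the exact retryable set is an assumption, not documented here):

```python
def should_retry(status):
    """Treat throttle/server responses as retryable; the 900 and 999
    codes are the Yahoo ones mentioned above (assumed, not documented)."""
    return status in (429, 500, 503, 900, 999)

def retry_delay(attempt, base=2.0, cap=300.0):
    """Exponential backoff: 2s, 4s, 8s, ... capped at 5 minutes.
    Jitter is omitted to keep the example deterministic."""
    return min(cap, base * (2 ** attempt))

print(should_retry(999))               # True
print(should_retry(404))               # False
print(retry_delay(0), retry_delay(3))  # 2.0 16.0
```

The real loop would sleep for `retry_delay(attempt)` seconds between failed requests; the point is that a scraper can survive throttling with a few lines of defensive code.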
 
I think you can make up to 5,000 requests per day with the Yahoo Site Explorer API. However, you can only request up to 100 results at a time. You can specify the starting result and make several different requests, but the start parameter is capped at 1,000, so with the API you can only get the first 1,100 results. That's the only downside I really see. I've been using the API for some time, and the output XML has never changed on me.
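The arithmetic behind that 1,100-result ceiling is easy to check. A sketch assuming a 1-based start parameter capped at 1,000 and 100 results per page, as described above:

```python
PAGE_SIZE = 100    # max results per API request
MAX_START = 1000   # highest allowed value of the 1-based "start" parameter

def start_offsets(page_size=PAGE_SIZE, max_start=MAX_START):
    """Every legal starting offset: 1, 101, 201, ..., 1001."""
    return list(range(1, max_start + page_size, page_size))

offsets = start_offsets()
print(len(offsets))              # 11 requests in total
print(offsets[0], offsets[-1])   # 1 1001
print(len(offsets) * PAGE_SIZE)  # 1100 -- the hard ceiling on results
```

Eleven requests of 100 results each, with the last page starting at offset 1,001, is where the 1,100 figure comes from; everything past that is unreachable through the API.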