I am coding a scraping API for a service that I am going to launch. Development has been under way for 3 days so far.
Although the API front-end is not started yet, the back-end (processing) is in a late Alpha stage now, and can scrape the source of any URL that doesn't require a POST. In simple terms, it cannot (yet) scrape data from a page that requires form input.
I would like to test what I have implemented so far with real scraping requests, as I'm not sure if I am going too easy on myself with the tests I have come up with so far.
If anybody has any sites that they would like some data scraped from, I can do this for you (FREE!) as part of my testing.
The details I require are as follows -
Starting URL - The URL of "Page 1" of the scrape; subsequent URLs will be calculated by the script. If the URL needs to have variable parameters passed to it (e.g. example.com?example=parameter), then include the list of parameters that need to be used.
Required Data - Some sort of identifier for the data that should be scraped. For example, if the data required is an email address and it is labelled "E-Mail Addy" on the target website, then "E-Mail Addy" would be the required data. Of course, it is fine to have multiple pieces of required data.
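To give a rough idea of how those two details get used, here is a minimal Python sketch. It is only an illustration and not the actual back-end code: the function names `expand_urls` and `extract_labelled`, the `page` parameter, and the regex-based tag stripping are all assumptions made for the example.

```python
import re
from html import unescape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def expand_urls(start_url, param_name, values):
    """Build one URL per value by setting `param_name` in the query string.

    Hypothetical helper: shows how a starting URL plus a list of
    parameter values could be turned into the full list of URLs to scrape.
    """
    parts = urlsplit(start_url)
    # Keep every existing query parameter except the one being varied.
    base_query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param_name
    ]
    urls = []
    for value in values:
        query = urlencode(base_query + [(param_name, value)])
        urls.append(
            urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))
        )
    return urls


def extract_labelled(html, label):
    """Return the text that follows each occurrence of `label` in the page.

    Hypothetical helper: a crude approach that replaces every tag with a
    newline, then captures the rest of the line after the label.
    """
    text = unescape(re.sub(r"<[^>]+>", "\n", html))
    pattern = re.compile(re.escape(label) + r"\s*[:\-]?\s*([^\n]+)")
    return [match.group(1).strip() for match in pattern.finditer(text)]


if __name__ == "__main__":
    # Starting URL with one fixed parameter and one variable "page" parameter.
    for url in expand_urls("http://example.com/list?example=parameter", "page", [1, 2, 3]):
        print(url)

    # Required data identified by its on-page label.
    sample = "<tr><td>E-Mail Addy</td><td>bob@example.com</td></tr>"
    print(extract_labelled(sample, "E-Mail Addy"))
```

A real page will usually need a proper HTML parser rather than a regex, but this shows why the label as it appears on the site (and the list of parameter values) is all that is needed to describe a scrape.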
One thing I will mention is that this is not an unlimited offer: I am not necessarily offering it to an unlimited number of people, nor am I offering to scrape an unlimited amount of data. I think a reasonable number of URLs to scrape is 500. Remember, this is a fully hosted service.
If you have any questions or requests for scrapes, just reply in this thread. If anything you want to scrape is non-public, I'm sure you know where the PM button is!