I already have my crawler engine (it handles forms, most JavaScript, lots of filtering options, and so on) and a user interface, so I have begun wondering...
Is there, in general, any interest in a product that can scrape, for example, product catalogs (or whatever, really) into CSV or SQL?
The crawler would simply go through the entire website (possibly using user-defined filters to crawl only the wanted pages). This part would work well.
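To make the filter idea concrete, here is a rough sketch of how user-defined filters could decide which pages get crawled. This is not my actual engine; the names `INCLUDE`, `EXCLUDE` and `should_crawl`, and the example patterns, are made up purely for illustration.

```python
import re

# Hypothetical user-defined filters: a URL is crawled only if it matches
# at least one include pattern and none of the exclude patterns.
INCLUDE = [re.compile(r"/products/")]
EXCLUDE = [re.compile(r"\?sort="), re.compile(r"/products/archive/")]


def should_crawl(url: str) -> bool:
    """Return True if the URL passes the include/exclude filters."""
    if not any(p.search(url) for p in INCLUDE):
        return False
    return not any(p.search(url) for p in EXCLUDE)


if __name__ == "__main__":
    for url in (
        "http://example.com/products/widget-1",
        "http://example.com/about",
        "http://example.com/products/?sort=price",
    ):
        print(url, should_crawl(url))
```

Checking the include patterns first keeps the rule easy to explain to a user: "only these pages, except those".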
However, the user would still need to define exactly what he/she wanted to scrape off those pages, and I am wondering whether regular expressions are too complex for that. Then again, many people interested in such software may already know regular expressions.
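For anyone wondering what "define it with a regular expression" might look like in practice, here is a minimal sketch, assuming the user supplies one pattern with named groups and each group becomes a CSV column. The HTML markup, the `PATTERN` and the `scrape_to_csv` helper are all hypothetical, not part of the product.

```python
import csv
import io
import re

# Hypothetical user-supplied pattern; every named group becomes a CSV column.
PATTERN = re.compile(
    r'<h2 class="name">(?P<name>.*?)</h2>\s*'
    r'<span class="price">(?P<price>[\d.]+)</span>',
    re.DOTALL,
)


def scrape_to_csv(pages, pattern, out):
    """Write one CSV row per regex match found in the given HTML pages.

    Returns the number of rows written (excluding the header).
    """
    # Order the columns by the position of each named group in the pattern.
    fields = sorted(pattern.groupindex, key=pattern.groupindex.get)
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    rows = 0
    for html in pages:
        for match in pattern.finditer(html):
            writer.writerow(match.groupdict())
            rows += 1
    return rows


if __name__ == "__main__":
    sample_pages = [
        '<h2 class="name">Widget</h2> <span class="price">9.99</span>',
        '<h2 class="name">Gadget</h2>\n<span class="price">24.50</span>',
    ]
    buffer = io.StringIO()
    scrape_to_csv(sample_pages, PATTERN, buffer)
    print(buffer.getvalue())
```

The user only has to write the pattern; the column names come from the group names. Whether that is still too much to ask of a typical user is exactly what I am unsure about.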