New Google Keyword tool

nos10

New member
Mar 7, 2011
Hey guys,

I have set up a working scraper following the guide in this thread:
http://www.wickedfire.com/design-de...le-keyword-tool-willing-share.html#post900237

Everything works fine, but when I download the keyword data it seems that something in the URL data has changed since that post was made.

When I sniff the packets and add the captured URL to the app, it works for a few hours and then seems to expire.

This is the step-by-step method I'm using with VB.NET.

Basically, it breaks down like this:

NOTE: Every request should save the returned cookie data and send any previously saved cookie data.
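
The cookie rule in the note above can be sketched in Python using only the standard library. This is a minimal sketch, not the VB.NET code from the guide: `make_session` is a hypothetical helper name, and the User-Agent header is an assumption rather than something the guide asks for.

```python
import http.cookiejar
import urllib.request


def make_session():
    """Return (opener, jar); the opener stores and resends cookies on every request."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # A browser-like User-Agent is an assumption, not a requirement from the guide.
    opener.addheaders = [("User-Agent", "Mozilla/5.0")]
    return opener, jar


opener, jar = make_session()
# Each opener.open(url) call saves any Set-Cookie data into `jar` and sends
# matching cookies back, including on the redirects it follows automatically.
```

Reusing the same `opener` for every step keeps all cookies in one place, which is what the note requires.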

1) Log in to Google. I use this URL 'https://www.google.com/accounts/ServiceLoginAuth', but there are a few different ones.

2) Request this URL "https://adwords.google.com/um/StartNewLogin?sourceid=awo&subid=ww-en-et-gaia" and follow all the redirects; there are 6 (this sets the cookies needed for AdWords).

3) When the last redirect completes and the HTML data is returned, extract the values for __u= and __c=. I use a regex, but any way will do.
Code:
u=(.*?)&__c=(.*?)&
4) Request this URL "https://adwords.google.com/o/Targeting/Explorer?stylePrefOverride=2&__u=&__c=", adding the values extracted for 'u' and 'c'. The URL should look something like "https://adwords.google.com/o/Targeting/Explorer?stylePrefOverride=2&__u=1654564568&__c=8794964564"

5) Extract the value for token; again, I use a regex.
Code:
token:'(.*?)'
6) Send a POST request to the URL 'https://adwords.google.com/o/Targeting/file/DownloadAll' with the POST data outlined below.

7) (OPTIONAL) Decompress the data and format it with a CSV function.
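
Steps 3 to 5 can be sketched in Python roughly as follows. The two regular expressions are the ones quoted in the steps; the HTML fragments, the function names and the sample values are made up for illustration, since the real pages are not shown in this thread.

```python
import re

# Patterns taken from steps 3 and 5 of the guide.
UC_PATTERN = re.compile(r"u=(.*?)&__c=(.*?)&")
TOKEN_PATTERN = re.compile(r"token:'(.*?)'")

EXPLORER_URL = ("https://adwords.google.com/o/Targeting/Explorer"
                "?stylePrefOverride=2&__u={u}&__c={c}")


def extract_u_c(html):
    """Return the (__u, __c) pair found in the HTML, or raise ValueError."""
    match = UC_PATTERN.search(html)
    if match is None:
        raise ValueError("__u/__c not found; the page may have changed")
    return match.group(1), match.group(2)


def extract_token(html):
    """Return the token value found in the HTML, or raise ValueError."""
    match = TOKEN_PATTERN.search(html)
    if match is None:
        raise ValueError("token not found; the page may have changed")
    return match.group(1)


# Hypothetical page fragments, invented for this example only.
login_html = '<a href="/o/Targeting/Explorer?__u=1654564568&__c=8794964564&x=1">'
explorer_html = "var cfg = {token:'abc123', other:'zzz'};"

u, c = extract_u_c(login_html)
token = extract_token(explorer_html)
explorer_url = EXPLORER_URL.format(u=u, c=c)
```

Raising an error when a pattern no longer matches makes it obvious when the page has changed, instead of silently sending empty values.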

POST data;

$u is the __u value you extracted in step 3
$c is the __c value you extracted in step 3
$format can be CSV, GZIPPED_CSV, CSVFOREXCEL, XML or TSV
$county is an uppercase two-letter country code: US, GB, DE, etc.
$language is a lowercase two-letter language code: en, ja, etc.
$keyword is the keyword (do not URL-encode the keywords; leave a space as a space, don't change it to %20 or +)
$token is the token you extracted in step 5
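
For the optional step 7, when $format is GZIPPED_CSV, the response can be unpacked along these lines. This is only a sketch: it assumes the payload is plain gzip and the text is UTF-8, neither of which is confirmed in this thread, and the sample data is invented.

```python
import csv
import gzip
import io


def parse_gzipped_csv(raw, encoding="utf-8"):
    """Decompress a gzip payload and return its CSV rows as lists of strings."""
    text = gzip.decompress(raw).decode(encoding)  # encoding is an assumption
    return list(csv.reader(io.StringIO(text)))


# Made-up sample standing in for a real download.
sample = gzip.compress(b"Keyword,Competition\nblue widgets,0.5\n")
rows = parse_gzipped_csv(sample)
```

If the real download uses a different text encoding, pass it through the `encoding` argument.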

This is the POST string for keywords, and it is the data in question:
__u=$u&__c=$c&format=$format&selector=5|1|37|https://adwords.google.com/o/Targeting/|0BC91872D430CA67A5237BB62909C258|106|17t|110|en_US|ul|12g|11f|17k|11i|KEYWORD|COMPETITION|GLOBAL_MONTHLY_SEARCHES|AVERAGE_TARGETED_MONTHLY_SEARCHES|TARGETED_MONTHLY_SEARCHES|IDEA_TYPE|AD_SHARE|EXTRACTED_FROM_WEBPAGE|SEARCH_SHARE|KEYWORD_CATEGORY|NGRAM_GROUP|12x|19z|hh|$county|13a|hu|$language|139|17q|bm|13h|16x|17o|bl|$keyword|1|2|3|4|18|5|0|6|7|50|0|8|0|0|9|0|10|9|11|12|11|13|11|14|11|15|11|16|11|17|11|18|11|19|11|20|10|3|-8|11|21|11|22|10|4|23|24|1|25|26|0|27|24|1|28|29|30|31|32|2|33|34|35|36|-29|37|0|0|0|&token=$token
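
Because the placeholders in the POST string use the `$name` form, Python's `string.Template` can fill them in directly. The sketch below uses a heavily shortened stand-in for the selector (not the real one) and made-up values; `build_post_body` is a hypothetical helper name.

```python
from string import Template

# Shortened stand-in for the full POST string; only the placeholders matter here.
SHORT_TEMPLATE = Template(
    "__u=$u&__c=$c&format=$format"
    "&selector=hh|$county|hu|$language|bl|$keyword|1|2|&token=$token"
)


def build_post_body(template, **values):
    """Fill the placeholders; raises KeyError if one is missing."""
    # Values are inserted verbatim, so a space in the keyword stays a space.
    return template.substitute(**values)


body = build_post_body(
    SHORT_TEMPLATE,
    u="1654564568", c="8794964564", format="CSV",
    county="US", language="en", keyword="blue widgets", token="abc123",
)
# Encode to bytes before sending, e.g. body.encode("utf-8").
```

Nothing here URL-encodes the keyword, which matches the instruction to leave spaces untouched.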
It has nothing to do with the token, as I have checked that. It seems to be these values that are changing:
1:
0BC91872D430CA67A5237BB62909C258


2:
|106|17t|110|en_US|ul|12g|11f|17k|11i|

I cannot find where these values come from. They seem to be set via an AJAX call, but I can't pinpoint how or when.

If anyone out there can give me some hints or tips on this method of scraping the new keyword tool, or on other methods if there are any, that would be great.
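
One way to narrow down which parts of the selector change is to capture it twice (before and after it expires) and compare the pipe-delimited fields position by position. A rough sketch follows; the "new" values are invented for the example and `changed_fields` is a hypothetical helper.

```python
from itertools import zip_longest


def changed_fields(old_selector, new_selector):
    """Return (index, old, new) for each pipe-delimited field that differs."""
    pairs = zip_longest(old_selector.split("|"), new_selector.split("|"))
    return [(i, a, b) for i, (a, b) in enumerate(pairs) if a != b]


old = ("5|1|37|https://adwords.google.com/o/Targeting/"
       "|0BC91872D430CA67A5237BB62909C258|106|17t|110|en_US")
# Made-up second capture: the hash and one short code differ.
new = ("5|1|37|https://adwords.google.com/o/Targeting/"
       "|AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA|107|17t|110|en_US")

diff = changed_fields(old, new)
```

Once the changing positions are known, the same values can be searched for in the other sniffed responses to see which request delivers them.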

thanks.
 


What method are you using to scrape the data? Are you pulling it from the keyword file download or scraping it from the keyword results?

Can you please explain how you're doing this?

thanks
 
I'll post in this topic later. The HTTP GETs and POSTs remain more or less the same; the only major change is that today they switched to reCAPTCHA. Goodbye Google captcha, sweet uncrackable above 6% princess :(
 
Yes, I see they changed the captcha, but I'm using the login method, so I don't use the captcha anyway.

I would like to see the HTTP POST and GET requests that you are using.

thanks.