Need PHP snippet to detect bot

Status
Not open for further replies.

guerilla

All we do is win
Aug 18, 2007
I want to change a CSS attribute based on bot detection. It's inline CSS.

So if my code currently is

Code:
<div style="display:none;">Stuff</div>
I want bots to see

Code:
<div>Stuff</div>
Basically, I want to do a poor man's job of cloaking by user agent on a paragraph of keyword rich text.

Hoping it is something like

Code:
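<?php
// Bots get the visible div; everyone else gets it hidden.
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
if (preg_match('/slurp|msnbot|googlebot/i', $ua)) { ?>
<div>
<?php } else { ?>
<div style="display:none;">
<?php } ?>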
Rep for help. Thanks in advance.
 


Try something like

Code:
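function is_bot()
{
    // Bots identify themselves in the User-Agent header, not the referer.
    $ua = isset( $_SERVER['HTTP_USER_AGENT'] ) ? $_SERVER['HTTP_USER_AGENT'] : '';

    // "bot" matches Googlebot and msnbot; "slurp" matches Yahoo! Slurp.
    return (bool) preg_match( '/(bot|slurp)/i', $ua );
}


// ... and in the code

if( is_bot() ) { echo '<div>'; }                        // bots see the text
else           { echo '<div style="display:none;">'; }  // visitors don't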
This should catch Googlebot, MSN (msnbot) and Yahoo (Slurp).
 
You can check the IP too. I'd check the IP and the user agent to see if it's a search bot.

Code:
<?php

// Load known crawler IPs, one per line; lines starting with # are comments.
$botIps = array();

$botIps = array_merge($botIps, array_filter(array_map('trim', file('google.txt')), 'filter_botip_array' ));
$botIps = array_merge($botIps, array_filter(array_map('trim', file('yahoo.txt')), 'filter_botip_array' ));
// more here


// True if the visitor's IP is in one of the lists.
function isBotIP() {

  global $botIps;

  return in_array($_SERVER['REMOTE_ADDR'], $botIps);
}


// Keep only real entries: drop blank lines and # comments.
function filter_botip_array($input)
{
  return $input !== '' && !preg_match('/^#/', $input);
}


?>
IP Addresses of Search Engine Spiders

I can't vouch for the accuracy of that database though.
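
A rough sketch of checking both at once, since the snippet above only looks at the IP. The function name is_search_bot() and its parameters are made up for illustration, and the addresses come from the reserved documentation range, not from any real crawler list:

```php
<?php
// Sketch only: is_search_bot() and its parameters are hypothetical names.

/**
 * Returns true only when the user agent looks like a crawler AND the
 * IP appears in the supplied list of known crawler addresses.
 */
function is_search_bot($userAgent, $ip, array $knownBotIps)
{
    // "bot" covers Googlebot and msnbot; "slurp" covers Yahoo! Slurp.
    $uaLooksLikeBot = (bool) preg_match('/(bot|slurp)/i', $userAgent);

    // Strict comparison so only exact string matches count.
    $ipIsListed = in_array($ip, $knownBotIps, true);

    return $uaLooksLikeBot && $ipIsListed;
}

// Placeholder addresses from the documentation range 192.0.2.0/24.
$knownBotIps = array('192.0.2.10', '192.0.2.11');

$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$ip = isset($_SERVER['REMOTE_ADDR']) ? $_SERVER['REMOTE_ADDR'] : '';

// Bots get the visible div, everyone else gets it hidden.
echo is_search_bot($ua, $ip, $knownBotIps) ? '<div>' : '<div style="display:none;">';
```

Requiring both means a visitor who just fakes a Googlebot user agent still gets the hidden version, at the cost of missing crawlers whose IPs aren't in your list yet.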
 
Thanks guys. logic, I gotta wait another 12 hours until I can rep again. I will pay up.
 
why not simply apply an id to the div and use css to mask (bots don't render html)

<div id="z">stuff</div>

#z{display:none}
 
Wait, your bots don't render HTML? Fucking extra mile.

Yes, I could do it with CSS. But unless someone can confirm that Google can't parse CSS and has no intentions to parse CSS, then I would rather cloak.

Measure twice, cut once, as my great grand pappy would say.
 
Google *has* been known to parse inline CSS to stop this type of stuff, but I can't say about linked CSS. So cloak... or be a champ and add an entry for the CSS file to robots.txt to stop it accessing it.
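
For reference, the robots.txt entry would look roughly like this, assuming the stylesheet lives at /css/style.css (a made-up path, use wherever your CSS file actually is). It only keeps out crawlers that choose to obey robots.txt:

```
User-agent: *
Disallow: /css/style.css
```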
 
When I used to run TGPs and free sites, we would put an invisible link at the top of the page; only hitbots would click it.
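
Something along these lines would do the bookkeeping for that trick. This is only a sketch: the function names, trap.php, and the trapped_ips.txt log file are all invented for the example:

```php
<?php
// Sketch only: the function names and the log file location are hypothetical.

/** Append an IP to the trap log unless it is already recorded. */
function record_trap_hit($ip, $logFile)
{
    if (!is_trapped($ip, $logFile)) {
        file_put_contents($logFile, $ip . "\n", FILE_APPEND | LOCK_EX);
    }
}

/** True if the IP has previously followed the hidden link. */
function is_trapped($ip, $logFile)
{
    if (!is_file($logFile)) {
        return false;
    }
    // One IP per line; skip blank lines and strip the newlines.
    $ips = file($logFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    return in_array($ip, $ips, true);
}

// In the (hypothetical) trap.php that the invisible link points to:
// record_trap_hit($_SERVER['REMOTE_ADDR'], __DIR__ . '/trapped_ips.txt');
```

Normal pages can then call is_trapped() with the visitor's IP to decide how to treat them.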
 