php - Is Google proxy a fake crawler? For example: google-proxy-66-249-81-131.google.com
*Edit: the solution is posted below the question, because it was not possible to post an answer after people decided to close the question.*
I recently discovered that variants of a Google proxy visit my sites. I doubt these are legitimate Google crawlers, because they are not behind a proxy (as the hostname suggests) and they identify themselves as a browser. The hostname is formatted like the Googlebot hostname, with the string 'proxy' added to it.
My PHP blocking class blocks these crawlers. Is it correct to block them, and are they really from Google or are they fake?
Here is the info on one of these crawlers:
blockedip notifier report - ip:66.249.81.131:: has been blocked
ticket id : {evnt_136877_2013040520130402_33147_10348}
event type : access blocked
event date : 04/05/2013 - 19:17:47 (server date-time)
event counter : first occurring
processed url : http://streambutler.net/
url : http://www.google.com/search
domain : streambutler.net
domain ip : 95.170.70.213
visitor ip : 66.249.81.131
proxy ip : 66.249.81.131
critical : yes
action required : no additional information
problem : bad proxy - via 66.249.81.131
hostname : google-proxy-66-249-81-131.google.com
block : yes
refferer : http://www.google.com/search
agentstring : mozilla/5.0 (x11; linux x86_64) applewebkit/537.4 (khtml, g...
browser : chrome 22.0.1229
platform : linux
robot : no
mobile : no
tablet : no
console : no
crawler : no
agent_type : browser
agent_name : chrome
agent_version : 22.0.1229
os_type : linux
os_name : linux
agent_languagetag : en
status : ok
request : 66.249.81.131
languagecode :
country : united states
region : california
city : mountain view
zipcode : 94043
latitude : 37.406
longitude : -122.079
timezone : -07:00
available : \'http
areacode : 0
dmacode : 0
continentcode : na
currencycode : usd
currencysymbol : $
currencysymbol_utf8 : $
currencyconverter : 1
extended : 1
organization : null
Other variants found:
- google-proxy-66-249-81-131.google.com (identifies as Firefox 6.0 ???)
- google-proxy-66-249-81-148.google.com (tries to access a JavaScript file)
- google-proxy-66-249-81-131.google.com
- google-proxy-66-249-81-111.google.com (tries to access a JavaScript file)
- google-proxy-66-249-81-164.google.com
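The hostnames in this list all appear to embed the visiting IP address in the name. As a cheap sanity check (not a real verification), a blocking class could extract that IP and compare it with the visitor IP. This is only a sketch: the naming pattern is inferred from the hostnames in this post, not from any Google documentation, and `ipFromGoogleProxyHost()` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: the pattern below is an assumption based on the
// hostnames listed in this question, not on a documented Google format.
function ipFromGoogleProxyHost(string $hostname): ?string
{
    $pattern = '/^google-proxy-(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})\.google\.com$/i';
    if (!preg_match($pattern, $hostname, $m)) {
        return null; // not in the google-proxy-a-b-c-d.google.com form
    }

    // Rebuild the dotted IPv4 address from the four captured groups.
    $ip = implode('.', array_slice($m, 1, 4));

    // Reject impossible octets such as 999.
    return filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false ? $ip : null;
}
```

A hostname whose embedded IP does not match the visitor IP would be suspicious. A match proves nothing by itself, since the name still has to be confirmed through DNS.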
Edit: the next one is a weird one: Firefox 6.0 on Windows 7, from the same IP as the example above, but not reported as a proxy in the next log. If this is a mobile proxy, is that weird or not?
ticket id : {evnt_164838_2013040520130402_33147_10348}
event type : access blocked
event date : 04/05/2013 - 19:19:07 (server date-time)
event counter : first occurring
processed url : http://streambutler.net/
url : unknown or direct link
domain : streambutler.net
domain ip : 95.170.70.213
visitor ip : 66.249.81.131
proxy ip : (not present)
critical : yes
action required : no additional information
problem : blocked server ip address (analysis) - 66.249.81.131
hostname : google-proxy-66-249-81-131.google.com
block : yes
refferer : (direct access)
agentstring : mozilla/5.0 (windows nt 6.1; rv:6.0) gecko/20110814 firefox/6.0 ...
browser : firefox 6.0
platform : windows 7
robot : no
mobile : no
tablet : no
console : no
crawler : no
agent_type : browser
agent_name : firefox
agent_version : 6.0
os_type : windows
os_name : windows 7
agent_languagetag : en
status : ok
request : 66.249.81.131
languagecode :
country : united states
region : california
city : mountain view
zipcode : 94043
latitude : 37.406
longitude : -122.079
timezone : -07:00
available : \'http
areacode : 0
dmacode : 0
continentcode : na
currencycode : usd
currencysymbol : $
currencysymbol_utf8 : $
currencyconverter : 1
extended : 1
organization : null
Edit: solution:
Got it! These 'crawlers' are not crawlers but part of the live website preview used in the Google search engine.
I tried this by showing one of my websites in the preview and, yes, there it was: I received a BlockedIP message.
If you want users to be able to view a preview of your website, you have to accept these 'crawlers'.
Like others said: "the root domain of the URL is google.com and that can't be spoofed".
Conclusion: you can trust these bots or crawlers; they are used to show the preview in Google search.
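One caveat on the "can't be spoofed" point: a reverse lookup alone is not proof, because whoever controls the reverse DNS zone for an IP can put any name in its PTR record. The usual way to make the check trustworthy is a forward-confirmed reverse DNS lookup: resolve the IP to a hostname, check the domain, then resolve that hostname back and confirm it returns the same IP. Below is a minimal sketch of that idea. `isGoogleHostname()` and `isVerifiedGoogleIp()` are hypothetical helper names, the accepted suffixes are an assumption, and the two resolver parameters exist only so the logic can be exercised without real DNS.

```php
<?php
// Hypothetical helpers; not part of the blocking class from the question.

// True when the hostname ends in one of the accepted Google domains.
// The suffix list is an assumption; adjust it to what you want to allow.
function isGoogleHostname(string $hostname): bool
{
    $hostname = strtolower(rtrim($hostname, '.'));
    foreach (['.google.com', '.googlebot.com'] as $suffix) {
        if (substr($hostname, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}

// Forward-confirmed reverse DNS check for an IPv4 address.
// $reverse and $forward default to PHP's own resolvers and can be
// replaced with stubs for testing.
function isVerifiedGoogleIp(string $ip, ?callable $reverse = null, ?callable $forward = null): bool
{
    $reverse = $reverse ?? 'gethostbyaddr';   // IP -> hostname
    $forward = $forward ?? 'gethostbynamel';  // hostname -> list of IPv4 addresses

    $hostname = $reverse($ip);
    // gethostbyaddr() returns the unmodified IP on failure, false on bad input.
    if (!is_string($hostname) || $hostname === $ip || !isGoogleHostname($hostname)) {
        return false;
    }

    // The hostname must resolve back to the same IP, otherwise the PTR is untrusted.
    $addresses = $forward($hostname);
    return is_array($addresses) && in_array($ip, $addresses, true);
}
```

In a blocking class this could run before the block decision, for example `if (isVerifiedGoogleIp($_SERVER['REMOTE_ADDR'])) { /* do not block */ }`. DNS lookups are slow, so caching the result per IP is probably worth it.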
I haven't confirmed this, but I suspect these IPs may be associated with Google's data compression proxy for Google Chrome mobile:
https://developers.google.com/chrome/mobile/docs/data-compression
If that is the case, blocking them will cause your site to display incorrectly for innocent mobile users.
They may also be associated with the Google+ crawler used to grab snippets from pages using the Google +1 button:
https://code.google.com/p/google-plus-platform/issues/detail?id=178
The bottom line is that these IPs are used for web requests kicked off by internal Google stuff. They are not public web proxies.