php - Google proxy is a fake crawler? For example: google-proxy-66-249-81-131.google.com


*Edit: the solution is posted below the question because it was not possible to post an answer; people decided to close the question.*

I recently discovered that variants of a Google proxy are visiting my sites. I doubt these are legitimate Google crawlers, because Google's crawlers are not behind a proxy (as the hostname describes) and these identify themselves as a browser. The hostname is formatted like a Googlebot hostname, but with the string 'proxy' added to it.

My PHP blocking class blocks these crawlers. Is it correct to block them, and are they really from Google or are they fake?

Here is the info for one of these crawlers:

blockedip notifier report - ip:66.249.81.131:: has been blocked
ticket id : {evnt_136877_2013040520130402_33147_10348}
event type : access blocked
event date : 04/05/2013 - 19:17:47 (server date-time)
event counter : first occurring
processed url : http://streambutler.net/
url : http://www.google.com/search
domain : streambutler.net
domain ip : 95.170.70.213
visitor ip : 66.249.81.131
proxy ip : 66.249.81.131
critical : yes
action required : no
additional information problem : bad proxy - via 66.249.81.131
hostname : google-proxy-66-249-81-131.google.com
block : yes
refferer : http://www.google.com/search
agentstring : mozilla/5.0 (x11; linux x86_64) applewebkit/537.4 (khtml, g...
browser : chrome 22.0.1229
platform : linux
robot : no
mobile : no
tablet : no
console : no
crawler : no
agent_type : browser
agent_name : chrome
agent_version : 22.0.1229
os_type : linux
os_name : linux
agent_languagetag : en
status : ok
request : 66.249.81.131
languagecode :
country : united states
region : california
city : mountain view
zipcode : 94043
latitude : 37.406
longitude : -122.079
timezone : -07:00
available : \'http
areacode : 0
dmacode : 0
continentcode : na
currencycode : usd
currencysymbol : $
currencysymbol_utf8 : $
currencyconverter : 1
extended : 1
organization : null

Other variants found:

  • google-proxy-66-249-81-131.google.com (identifies as Firefox 6.0 ???)
  • google-proxy-66-249-81-148.google.com (tries to access a JavaScript file)
  • google-proxy-66-249-81-131.google.com
  • google-proxy-66-249-81-111.google.com (tries to access a JavaScript file)
  • google-proxy-66-249-81-164.google.com
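
All of the hostnames above follow the same pattern: `google-proxy-`, the four octets of the visitor IP joined by dashes, then `.google.com`. As a rough sketch, a blocking class could pull the embedded IP out of the hostname and compare it with the visitor IP. The function name `ipFromGoogleProxyHost` is hypothetical and not part of any existing class. A matching name alone proves nothing about who owns the IP; it only recognizes the naming scheme.

```php
<?php
/**
 * Hypothetical helper: extract the IPv4 address embedded in a
 * "google-proxy-A-B-C-D.google.com" hostname.
 * Returns null when the name does not follow that pattern.
 */
function ipFromGoogleProxyHost(string $hostname): ?string
{
    // Anchored at both ends so "....google.com.evil.net" cannot match.
    $pattern = '/^google-proxy-(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})\.google\.com$/i';

    // Drop a trailing dot that some resolvers append to fully qualified names.
    if (!preg_match($pattern, rtrim($hostname, '.'), $m)) {
        return null;
    }

    $ip = "{$m[1]}.{$m[2]}.{$m[3]}.{$m[4]}";

    // filter_var() rejects impossible octets such as 999.
    return filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false ? $ip : null;
}
```

For the log above, the embedded IP and the `visitor ip` field should be the same value if the hostname is consistent with the request.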

Edit: the next one is a weird one: Firefox 6.0 on Windows 7, from the same IP as the example above, yet no proxy is reported in the next log? If this is a mobile proxy, that is weird, isn't it?

ticket id : {evnt_164838_2013040520130402_33147_10348}
event type : access blocked
event date : 04/05/2013 - 19:19:07 (server date-time)
event counter : first occurring
processed url : http://streambutler.net/
url : unknown or direct link
domain : streambutler.net
domain ip : 95.170.70.213
visitor ip : 66.249.81.131
proxy ip : (not present)
critical : yes
action required : no
additional information problem : blocked server ip address (analysis) - 66.249.81.131
hostname : google-proxy-66-249-81-131.google.com
block : yes
refferer : (direct access)
agentstring : mozilla/5.0 (windows nt 6.1; rv:6.0) gecko/20110814 firefox/6.0 ...
browser : firefox 6.0
platform : windows 7
robot : no
mobile : no
tablet : no
console : no
crawler : no
agent_type : browser
agent_name : firefox
agent_version : 6.0
os_type : windows
os_name : windows 7
agent_languagetag : en
status : ok
request : 66.249.81.131
languagecode :
country : united states
region : california
city : mountain view
zipcode : 94043
latitude : 37.406
longitude : -122.079
timezone : -07:00
available : \'http
areacode : 0
dmacode : 0
continentcode : na
currencycode : usd
currencysymbol : $
currencysymbol_utf8 : $
currencyconverter : 1
extended : 1
organization : null

Edit: solution:

Got it! These 'crawlers' are not crawlers; they are part of the live website preview used in the Google search engine.

I have tried this: I showed one of my websites in the preview and yes, there it was: I received a blockedip message.

If you want users to be able to view a preview of your website, you have to accept these 'crawlers'.

Like others said: "the root domain of the URL is google.com and that can't be spoofed".
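
One caveat on that quote: a reverse DNS (PTR) name is set by whoever controls the reverse zone for the IP, so the hostname is only trustworthy once a forward lookup of that name returns the same IP. Below is a minimal sketch of such a forward-confirmed reverse DNS check, assuming IPv4 only. The function name `isForwardConfirmedGoogleHost` and its injectable resolver parameters are hypothetical (they exist here to make the logic testable without network access); by default it uses PHP's built-in `gethostbyaddr()` and `gethostbynamel()`.

```php
<?php
/**
 * Hypothetical helper: forward-confirmed reverse DNS check (IPv4 only).
 * Returns true only if the PTR name of $ip ends in $suffix AND that
 * name resolves back to $ip.
 */
function isForwardConfirmedGoogleHost(
    string $ip,
    ?callable $reverse = null,   // defaults to gethostbyaddr()
    ?callable $forward = null,   // defaults to gethostbynamel()
    string $suffix = '.google.com'
): bool {
    $reverse = $reverse ?? 'gethostbyaddr';
    $forward = $forward ?? 'gethostbynamel';

    $host = $reverse($ip);

    // gethostbyaddr() returns the unmodified IP on failure
    // and false on malformed input.
    if (!is_string($host) || $host === $ip) {
        return false;
    }

    $host = strtolower(rtrim($host, '.'));

    // The name must END with the suffix; a substring match could be
    // fooled by "google.com.example.net".
    if (substr($host, -strlen($suffix)) !== $suffix) {
        return false;
    }

    // gethostbynamel() returns a list of IPv4 addresses or false.
    $addresses = $forward($host);

    return is_array($addresses) && in_array($ip, $addresses, true);
}
```

In a blocking class this would be called as `isForwardConfirmedGoogleHost($visitorIp)` before deciding to block. Both lookups can be slow, so caching the result per IP is probably worthwhile.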

Conclusion: you can trust these bots or crawlers; they are used to show the preview in Google search.

I haven't confirmed this, but I suspect these IPs may be associated with Google's data compression proxy for Google Chrome mobile:

https://developers.google.com/chrome/mobile/docs/data-compression

If that is the case, blocking them will cause your site to display incorrectly for innocent mobile users.


They may also be associated with the Google+ crawler that is used to grab snippets from pages using the Google +1 button:

https://code.google.com/p/google-plus-platform/issues/detail?id=178

Bottom line is, these IPs are used for web requests kicked off by internal Google stuff. They are not public web proxies.

