Solved

Search spider requests are being logged (Yandex)

  • 11 February 2022
  • 2 replies
  • 706 views

I’ve noticed some strange behaviour: we’ve started receiving a lot of events (around 40K per day) from IPs belonging to Yandex (a Russian search engine). It looks like their spiders run JavaScript code and trigger the corresponding events, which is ruining our analytics :(

This happens only with the Yandex spider. Do you have any idea why this is happening and how we can fix it?

 

PS: I know that I can block events coming from certain IPs, but I think that ignoring search bots should be done automatically.

 

Best answer by sydney.koh 16 February 2022, 00:49

2 replies

Hi!


Currently, we only block Google’s web crawler, which identifies itself in the User-Agent string with “Googlebot”. We don’t block anything else automatically. Since we do not filter or block traffic from Yandex at this time, I recommend blocking and filtering by IP address. More information on how to block/filter data can be found here: https://help.amplitude.com/hc/en-us/articles/360016338212#h_88c3bdf1-84fd-4e14-8d00-c540d1596569

 

There are some other options that could help you but would involve changing your implementation:
 

  • Sending data into Amplitude server-side is the most direct alternative. 
  • Some customers have also routed their data through a proxy server before sending the data to Amplitude. Some documentation on how to set that up can be found here. Note that this will remove some end-user data (originating IP address, location, user ID, etc.).
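
If you go the server-side or proxy route, the bot check could look roughly like the sketch below. This is only an illustration, not how Amplitude filters traffic internally: the function names, the User-Agent tokens, and the IP range are all assumptions (the range shown is a reserved documentation range, not a real Yandex range), and it assumes the crawler identifies itself in the User-Agent, which you would need to confirm against your own logs.

```python
import ipaddress
from typing import Optional

# Hypothetical values for illustration only: replace them with the
# User-Agent tokens and IP ranges you actually observe in your traffic.
BOT_UA_TOKENS = ("googlebot", "yandexbot")
# 203.0.113.0/24 is a reserved documentation range, NOT a real Yandex range.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_bot_request(user_agent: Optional[str], client_ip: str) -> bool:
    """Return True if the request looks like it comes from a crawler."""
    ua = (user_agent or "").lower()
    # Case-insensitive substring match on known crawler tokens.
    if any(token in ua for token in BOT_UA_TOKENS):
        return True
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Unparseable IP: do not treat the request as a bot.
        return False
    return any(addr in net for net in BLOCKED_NETWORKS)


def filter_events(events, user_agent, client_ip):
    """Drop the whole batch for bot requests, otherwise pass it through."""
    return [] if is_bot_request(user_agent, client_ip) else list(events)
```

Your proxy would call `filter_events` on each incoming batch and forward only the returned events to Amplitude. Matching on both User-Agent and IP range is a deliberate choice here, since a crawler that executes JavaScript may not always announce itself in the User-Agent.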

Best,

Sydney

I had the same issue with the Russian search engine on this website, https://apkomex.com/, but sydney.koh’s reply solved it. Thanks!
