
I’ve noticed some strange behaviour: we’ve started to receive lots of events (around 40K per day) from IPs belonging to Yandex (a Russian search engine). It looks like their spiders run our JavaScript code and trigger the corresponding events, and it’s ruining our analytics :(

This happens only with the Yandex spider. Do you have any idea why this is happening and how we can fix it?

 

PS: I know that I can ban events coming from certain IPs, but I think that ignoring search bots should be done automatically.

 

Hi!


Currently, we only block Google’s web crawler, which identifies itself in the User-Agent string with “Googlebot”. We don’t block anything else automatically. Since we do not filter or block the Yandex crawler at this time, I recommend blocking and filtering by IP address. More information on how to block/filter data can be found here: https://help.amplitude.com/hc/en-us/articles/360016338212#h_88c3bdf1-84fd-4e14-8d00-c540d1596569
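
If it helps to work out which addresses belong in a block filter, here is a minimal Python sketch that checks an IP against a list of CIDR ranges. The ranges below are placeholders taken from the reserved documentation blocks, not Yandex’s real ranges, and `is_blocked_ip` is a hypothetical helper name, not part of any Amplitude tooling.

```python
import ipaddress

# Placeholder CIDR blocks from the reserved documentation ranges (RFC 5737).
# These are NOT real crawler ranges; replace them with the ranges you
# actually observe in your own event data.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked_ip(ip: str) -> bool:
    """Return True if `ip` falls inside any of the blocked networks."""
    try:
        address = ipaddress.ip_address(ip)
    except ValueError:
        # Not a valid IP string, so it cannot match a network.
        return False
    return any(address in network for network in BLOCKED_NETWORKS)
```

You could run this over the IPs in an event export to confirm the ranges cover the unwanted traffic before entering them in the filter settings.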

 

There are some other options that could help you but would involve changing your implementation:
 

  • Sending data into Amplitude server-side is the most direct alternative. 
  • Some customers have also routed their data through a proxy server before sending the data to Amplitude. Some documentation on how to set that up can be found here. Note that this removes some end-user data (originating IP address, location, userID, etc.).
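
As a rough illustration of the proxy option, the sketch below shows the kind of check a proxy or server-side sender could run before forwarding an event. It is only a sketch under assumptions: `BOT_TOKENS`, `is_crawler` and `should_forward` are hypothetical names, and matching on “YandexBot” assumes the crawler announces itself in its User-Agent string the way Googlebot does, which may not hold for every Yandex bot.

```python
from typing import Optional

# Substrings that mark a request as coming from a crawler.
# "googlebot" is the token mentioned above; "yandexbot" is an assumption
# about how the Yandex crawler identifies itself.
BOT_TOKENS = ("googlebot", "yandexbot")


def is_crawler(user_agent: Optional[str]) -> bool:
    """Return True if the User-Agent contains a known crawler token."""
    if not user_agent:
        return False
    lowered = user_agent.lower()
    return any(token in lowered for token in BOT_TOKENS)


def should_forward(user_agent: Optional[str]) -> bool:
    """Decide whether the proxy should pass this event on."""
    return not is_crawler(user_agent)
```

A proxy would call `should_forward` with the incoming request’s User-Agent header and silently drop the event when it returns `False`. If the crawler does not identify itself in the User-Agent, filtering by IP address remains the fallback.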

Best,

Sydney


I have the same issue with the Russian search engine on this website, https://apkomex.com/, but sydney.koh’s reply solved it. Thanks!


Thank you, this solved my problem. I had also been looking for a fix for a long time. It’s a great problem-solving post.

