I’ve noticed some strange behaviour: we’ve started receiving a lot of events (around 40K per day) from IPs belonging to Yandex (a Russian search engine). It looks like their spiders run our JavaScript code and trigger the corresponding events, and it’s ruining our analytics :(
This happens only with the Yandex spider. Do you have any idea why this is happening and how we can fix it?
PS: I know that I can block events coming from certain IPs, but I think ignoring search bots should be done automatically.
Best answer by sydney.koh
Hi!
Currently, we only block Google’s web crawler, which identifies itself in the User-Agent string with “Googlebot”. We don’t block anything else automatically. Since we do not filter or block Yandex’s crawler at this time, I recommend blocking and filtering by IP address. More information on how to block/filter data can be found here: https://help.amplitude.com/hc/en-us/articles/360016338212#h_88c3bdf1-84fd-4e14-8d00-c540d1596569
There are some other options that could help you but would involve changing your implementation:
Sending data into Amplitude server-side is the most direct alternative.
Some customers have also routed their data through a proxy server before sending it to Amplitude. Some documentation on how to set that up can be found here. Note that this will remove some end-user data (originating IP address, location, userID, etc.).
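If you do end up changing your implementation, one more thing you could try is skipping tracking calls on the client when the User-Agent looks like a crawler. Below is a minimal sketch of that idea, not an Amplitude feature. It assumes the spider identifies itself in the User-Agent string (as Googlebot does), which may not hold for all of the Yandex traffic you are seeing. `BOT_TOKENS`, `isLikelyBot` and `guardTracker` are hypothetical names made up for this example, and the token list is only a starting point.

```typescript
// Hypothetical token list (an assumption): extend it with whatever
// crawler User-Agent tokens actually show up in your own data.
const BOT_TOKENS: readonly string[] = ["googlebot", "yandexbot", "bingbot"];

// Returns true when the User-Agent string contains a known crawler token.
function isLikelyBot(userAgent: string): boolean {
  const ua = userAgent.toLowerCase();
  return BOT_TOKENS.some((token) => ua.includes(token));
}

// Wraps any tracking function so that calls are silently dropped
// when the given User-Agent looks like a crawler.
function guardTracker<A extends unknown[]>(
  track: (...args: A) => void,
  userAgent: string,
): (...args: A) => void {
  const blocked = isLikelyBot(userAgent);
  return (...args: A) => {
    if (!blocked) {
      track(...args);
    }
  };
}
```

In the browser you would wrap your own event-sending function once, passing `navigator.userAgent` as the second argument, and then call the returned function everywhere instead of the original one. Crawlers that send a regular browser User-Agent will still get through, so IP filtering remains the more reliable option.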