Solved

Amplitude not filtering out search crawler (Petalbot/petalsearch)

  • 12 January 2023
  • 3 replies
  • 248 views

Hi there,

 

I noticed that our Amplitude data shows a large number of events, particularly in December, from what appears to be a search bot. A reverse lookup of the IP pointed to PetalBot from petalsearch.com, in the IP range 114.119.***.

Has anyone else seen this or figured out how to fix it? I’m wondering if this is something that should be filtered out at the Amplitude library level, or whether it has already been fixed in a newer version.

Thanks!

 


Best answer by Saish Redkar 13 January 2023, 05:32


3 replies


Hey @jacobsimon 

I’m not aware of any configuration setting at the Amplitude SDK level that blocks a specific IP.


The most common solution is to set up a block filter in the Data/Govern section of Amplitude for the specific IP address.

This works well if the number of IP addresses to be blocked is at a manageable level.

If you are sending data via the HTTP API, a middle layer could check the captured IP and skip forwarding those events to Amplitude. That lets you manage the blocking on your own end.
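
A rough sketch of what such a middle layer could look like in Python, in case it helps. It assumes each event is a dict carrying the client address in an `ip` field (as HTTP API events can) and that you run this before forwarding the batch. The names `BLOCKED_NETWORKS`, `is_blocked` and `filter_events` are made up for illustration, and `203.0.113.0/24` is only a documentation placeholder range; swap in the crawler ranges you actually see in your data.

```python
import ipaddress

# Hypothetical blocklist. 203.0.113.0/24 is a documentation-only range
# (RFC 5737) standing in for the crawler ranges observed in your own data.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_blocked(ip, networks=BLOCKED_NETWORKS):
    """Return True if ip parses and falls inside any blocked network."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # Missing or malformed IP: do not block, let the event through.
        return False
    return any(addr in net for net in networks)


def filter_events(events, networks=BLOCKED_NETWORKS):
    """Return only the events whose "ip" field is not on the blocklist."""
    return [e for e in events if not is_blocked(e.get("ip", ""), networks)]


if __name__ == "__main__":
    batch = [
        {"event_type": "page_view", "ip": "203.0.113.7"},   # blocked range
        {"event_type": "page_view", "ip": "198.51.100.4"},  # kept
        {"event_type": "page_view"},                         # no IP, kept
    ]
    # Forward only the remaining events to Amplitude from here.
    print(filter_events(batch))
```

Matching on whole networks rather than single addresses keeps the list short when a crawler rotates through a range, which is the case where a per-IP block filter gets hard to maintain.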

 

Hope this helps.

@Saish Redkar Thanks Saish! I’ll take a look at blocking these particular IPs, that’s a good idea.

 

I’m curious if Amplitude would look into this on their end too - presumably, they are already filtering out known web crawlers and bots at some level and would just need to add this to their list of ones to check for.

 

 


@jacobsimon 
Found this post which could be helpful

 
