Solved

Does Amplitude post process data?

  • September 6, 2023
  • 3 replies
  • 326 views

I’m attempting to run a segmentation report for the total count of a specific event for the month of August. I noticed that the number jumped the day after I initially ran the report on 9/4. My question is: does Amplitude do some sort of post-processing or QA/QC? That might explain the different tally I’m getting.

Best answer by JennRu


JennRu
Community Manager
  • 75 replies
  • Answer
  • September 6, 2023

Hey @Mario Federis, a few follow-up questions:

  • If you group the event by the Library Amplitude user property, which source is the data being ingested from? Some upstream vendors, such as Segment, offer automatic retries that queue up the data and send it to the destination in batches.
  • Did any filters (event filters or segment filters) change between the first report on 9/4 and the one on 9/5?
  • Did your org backfill any data for the month of August? To check, duplicate the event on your event segmentation chart and set the following conditions on the duplicate, then see whether its total count matches the first event:
    • set the Server upload time Amplitude user property to ≥ 2023-08-01 00:00:00
    • and set Server upload time to < 2023-09-01 00:00:00

This will show the total count of August events that were also uploaded to Amplitude in August.
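For anyone who wants to sanity-check the same comparison outside the chart UI, here is a minimal sketch in Python. It assumes you have exported raw events as records with `event_time` and `server_upload_time` fields; the sample records, the "Purchase" event name, and the `in_window` helper are made up for illustration and are not taken from the original question.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"
START = datetime(2023, 8, 1)   # inclusive lower bound
END = datetime(2023, 9, 1)     # exclusive upper bound

# Hypothetical exported events (illustrative data only).
events = [
    {"event_type": "Purchase", "event_time": "2023-08-03 10:15:00",
     "server_upload_time": "2023-08-03 10:15:02"},
    {"event_type": "Purchase", "event_time": "2023-08-31 23:59:30",
     "server_upload_time": "2023-09-01 00:00:05"},   # uploaded just after month end
    {"event_type": "Purchase", "event_time": "2023-08-12 08:00:00",
     "server_upload_time": "2023-09-05 14:30:00"},   # looks like a backfill
    {"event_type": "Purchase", "event_time": "2023-09-01 09:00:00",
     "server_upload_time": "2023-09-01 09:00:01"},   # not an August event
]

def in_window(timestamp: str) -> bool:
    """True if the timestamp falls in [START, END)."""
    return START <= datetime.strptime(timestamp, FMT) < END

# Events timestamped in August (what the plain chart counts).
by_event_time = [e for e in events if in_window(e["event_time"])]

# August events that were also uploaded in August (the duplicated event).
also_uploaded = [e for e in by_event_time if in_window(e["server_upload_time"])]

print(len(by_event_time))   # 3
print(len(also_uploaded))   # 1
```

If the two counts differ, the gap is the set of August-timestamped events that arrived after the month ended, which is one way a report can grow between two runs.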


  • Author
  • New Member
  • 2 replies
  • September 13, 2023

From what I can tell, the filters were probably changed, which is why the numbers were off the following day.

But you brought something to my attention that I had never considered. If the data source was Segment, would it have been better to use the ‘library’ and ‘server upload time’ user properties for a more accurate event count? I was unaware that a source could retry events that were unsuccessful.


JennRu
Community Manager
  • 75 replies
  • September 14, 2023

Hey @Mario Federis, thanks so much for confirming that there may have been filter changes across the two charts, which resulted in the different counts.

I took a look at Segment’s doc on retries to Destinations, and it states the following:

“Segment’s internal systems retry failed destination API calls for up to 4 hours”

Given it’s just a 4-hour window, I don’t think it’s critical to filter on server upload time when pulling event counts, but the retries could certainly add some latency to data ingestion.

I generally include server upload time in event queries when the company is backfilling data into the project, to differentiate between data ingested during a certain time window and data timestamped during that window.
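To make the distinction concrete, here is a rough Python sketch that splits events by how long they took to reach the server. It assumes exported records with `event_time` and `server_upload_time` fields; the `ingestion_lag` and `split_by_lag` helpers and the sample data are hypothetical, and the 4-hour threshold simply mirrors the Segment retry window quoted above. A lag beyond the threshold only suggests a backfill or other late upload; it does not prove one.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"
RETRY_WINDOW = timedelta(hours=4)  # assumption: matches the quoted retry window

def ingestion_lag(event: dict) -> timedelta:
    """Time between when the event happened and when it was uploaded."""
    uploaded = datetime.strptime(event["server_upload_time"], FMT)
    occurred = datetime.strptime(event["event_time"], FMT)
    return uploaded - occurred

def split_by_lag(events, threshold=RETRY_WINDOW):
    """Return (within_threshold, beyond_threshold) lists of events."""
    within, beyond = [], []
    for event in events:
        (within if ingestion_lag(event) <= threshold else beyond).append(event)
    return within, beyond

# Hypothetical sample data (illustrative only).
sample = [
    {"event_time": "2023-08-31 22:00:00", "server_upload_time": "2023-08-31 22:00:03"},
    {"event_time": "2023-08-31 21:30:00", "server_upload_time": "2023-09-01 00:45:00"},
    {"event_time": "2023-08-12 08:00:00", "server_upload_time": "2023-09-05 14:30:00"},
]

within, beyond = split_by_lag(sample)
print(len(within), len(beyond))  # 2 1
```

Events in the second group are the ones worth a closer look when a count for a closed month keeps changing.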

 

I hope this helped!   

