Solved

Total of UTM medium visitors not equal to total visitors


Badge

Hi there, 

When I try to track where my page visitors are coming from by grouping by utm_medium, I find that summing the number of unique visitors across all mediums (including (none)) gives a total which is far below the total number of unique visitors to the site.

Please can you help explain if I have done something incorrectly or why this is?

Thank you!


Best answer by Yuanyuan Zhang 6 June 2023, 10:49


11 replies

Userlevel 5
Badge +9

This is completely expected behaviour, @B Analytics. It’s perfectly possible for a single unique user to have multiple utm_medium values (the same goes for other user properties). For example, I could go to my website from a PPC ad, then come back again later from an affiliate link. When you create a table with the utm_medium values as rows, that single user will count in multiple rows, so the sum of the rows won’t add up to the (de-duplicated) grand total.
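To make the double-counting concrete, here is a minimal Python sketch. The visit log, user IDs, and function names are made up for illustration; this is not Amplitude data or an Amplitude API, just the counting logic.

```python
# Hypothetical visit log: (user_id, utm_medium). Illustrative values only.
visits = [
    ("u1", "cpc"),        # u1 arrives from a PPC ad...
    ("u1", "affiliate"),  # ...and later from an affiliate link
    ("u2", "(none)"),
    ("u3", "cpc"),
]

def uniques_by_medium(rows):
    """Count distinct users within each utm_medium value."""
    users_per_medium = {}
    for user, medium in rows:
        users_per_medium.setdefault(medium, set()).add(user)
    return {medium: len(users) for medium, users in users_per_medium.items()}

def total_uniques(rows):
    """Count distinct users across all rows (the de-duplicated grand total)."""
    return len({user for user, _ in rows})

per_medium = uniques_by_medium(visits)
row_sum = sum(per_medium.values())   # u1 is counted under both cpc and affiliate
grand_total = total_uniques(visits)  # u1 is counted once
```

Here the rows sum to 4 while the grand total is 3. As long as every visit falls into some row, overlap like this can only push the row sum above the de-duplicated total.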

Dan.

Userlevel 5
Badge +9

Apologies, ignore that. I’ve just noticed you said the sum of the rows is below the grand total. The comment I just made applies the other way around 😀

Userlevel 5
Badge +9

Funnily enough, I’m seeing the same behaviour. I’ve used a breakdown by Platform in full knowledge that we currently only push “Web” into Amplitude.
 

Same as you, you can clearly see the sum of the rows (only one row in this instance) is less than the grand total. That makes zero sense here; they should be the same.

Hopefully someone from Amplitude can clarify…@Denis Holmes perhaps?

Badge

Thanks @dangrainger , agree I’m very puzzled by this! @Denis Holmes - would love any help you’re able to provide!

Userlevel 5
Badge +5

@B Analytics @dangrainger Let me clarify this with the engineer and update you here!

 

Badge

Thank you @Yuanyuan Zhang - that would be much appreciated!

Userlevel 5
Badge +5

Hi @B Analytics @dangrainger

The engineer replied that this is due to caching. The overall result and the group-by results are different queries, so they might be cached a bit differently. Another piece of technical detail is that for the last 24 hours of data, we hit our real-time service (in addition to the regular batch service). This can also lead to additional discrepancies depending on how each part of the query is cached.

For example, you will notice that the results are different if the end day is set to today, but if you update the end day to yesterday, then the results would be consistent.
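As a rough illustration of "end day set to yesterday", the sketch below computes a reporting window that stops one day short of today, so it excludes the current day. The helper name `range_ending_yesterday` is hypothetical and this is plain date arithmetic, not an Amplitude API; in the chart itself you would set the same dates in the date picker.

```python
from datetime import date, timedelta

def range_ending_yesterday(days, today=None):
    """Return (start, end) for a window of `days` full days that ends yesterday."""
    today = today or date.today()
    end = today - timedelta(days=1)          # one-day offset: skip today
    start = end - timedelta(days=days - 1)   # inclusive window of `days` days
    return start, end

# Example: a 7-day window as seen on 6 June 2023.
start, end = range_ending_yesterday(7, today=date(2023, 6, 6))
```

With those inputs the window runs from 30 May 2023 to 5 June 2023 inclusive.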

I hope this clarifies!

 

Badge

Hi @Yuanyuan Zhang. Thank you for asking the engineer. However, the discrepancies are vast, and I don’t think they can be explained just by data being stored slightly differently. For example, against 50+ views overall, the UTM medium rows (including (none)) account for fewer than 10 views in total. Am I creating my chart incorrectly?

Many thanks!

Userlevel 5
Badge +9

Thanks @Yuanyuan Zhang. It does clarify things…but then it also raises a follow-up! Discrepancies like this between totals and rows inevitably lead to only one thing when people see data in the tool…cries of “it’s wrong”, or worse still, “the tool must be broken”. Which is of serious negative consequence when building a business’ trust in data. Is there a way to avoid this occurring? Is there an item on the roadmap to prevent it happening in the future?

Userlevel 5
Badge +5

Hi @B Analytics, could you please check whether all values are included by clicking on the funnel icon in the overall cell?

Hi @dangrainger, I fully agree this is not ideal. There is currently no way to avoid this other than setting an offset of 1 day. As far as I know, there is no plan to improve it. I will submit a feature improvement request to the Product team on your behalf.

As always, I really appreciate your feedback!

Userlevel 5
Badge +9

Appreciated @Yuanyuan Zhang ...last thing I would want to see is a tool as good as this lose credibility due to such things! :-)
