[Retention] How is Retention Calculated? (N-Day Retention)

  • 18 January 2021

The data points in the Retention chart are deduplicated “weighted averages” of all Day N retention values, and they should match the percentages in the top row of the breakdown table. Retention data in Amplitude is often easier to digest in the expanded breakdown table below the chart. In the breakdown table, an asterisk means that the day is still incomplete for that cohort, so the percentage may change as more data comes in. Percentages with asterisks are excluded from the average unless they are the only data point for that day.

For each retention percentage in the data table, the numerator is the number of unique users who performed the Return event on their Day/Week/Month N. The denominator for every row except the top "All Users" row is the user count in the "Users" column, which is the number of users in that cohort who performed the Starting event that day. Because users can perform the Starting event multiple times within the chart's date range, some users may be counted in multiple rows.

To explain how the top row of retention rates is calculated, let’s use an example. Suppose the following data table is produced when we select the same event for the Starting and Return events (the Starting event is not '[Amplitude] New User'):

[Image: example Retention data table]

  • Jan 6: Users A and G fire start event of 'Play Song', 24 hours later only User A fires return event of 'Play Song'
  • Jan 5: Users A and F fire start event of 'Play Song', 24 hours later only User A fires return event of 'Play Song'
  • Jan 4: Users A and E fire start event of 'Play Song', 24 hours later only User A fires return event of 'Play Song'
  • Jan 3: Users A and D fire start event of 'Play Song', 24 hours later only User A fires return event of 'Play Song'
  • Jan 2: Users A and C fire start event of 'Play Song', 24 hours later only User A fires return event of 'Play Song'
  • Jan 1: Users A and B fire start event of 'Play Song', 24 hours later only User A fires return event of 'Play Song'

There are a total of 2 unique users in each row (cohort). For each cohort’s Day 1, only User A fires the 'Play Song' event while Users B, C, D, E, F, and G churn and do not perform the return event of 'Play Song'. Thus, 50% of the users in each cohort are retained at Day 1.
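
The per-cohort arithmetic above can be sketched in Python. This only illustrates the calculation described in this post and is not Amplitude's implementation; the `cohorts` and `returned_day1` dictionaries and the `cohort_retention` helper are hypothetical names made up for the example, and every cohort is assumed to have a complete Day 1.

```python
# Hypothetical data mirroring the example: each cohort (start date) maps to
# the set of users who fired the Starting event 'Play Song' on that day.
cohorts = {
    "Jan 1": {"A", "B"},
    "Jan 2": {"A", "C"},
    "Jan 3": {"A", "D"},
    "Jan 4": {"A", "E"},
    "Jan 5": {"A", "F"},
    "Jan 6": {"A", "G"},
}

# Users in each cohort who fired the Return event on their Day 1.
# In the example, only User A returns in every cohort.
returned_day1 = {day: {"A"} for day in cohorts}


def cohort_retention(starters, returners):
    """Share of a cohort's unique users who performed the Return event."""
    if not starters:
        return 0.0
    # Only users who are actually in the cohort can count as retained.
    return len(starters & returners) / len(starters)


for day, starters in cohorts.items():
    rate = cohort_retention(starters, returned_day1[day])
    print(f"{day}: {rate:.0%}")
```

Each row's denominator here is the size of that cohort's own user set, which is why User A can contribute to six different rows.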

Now if we look at the top row (All Users), we are looking at all the unique users between Jan 1 and Jan 6. This is 7 unique users (A, B, C, D, E, F, and G). So, for Jan 1 - Jan 6, the overall retention for Day 1 is 1/7, or about 14%, because User A is deduplicated: A appears in all six cohorts but is counted only once in both the numerator and the denominator.
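
As a rough sketch of the All Users calculation, the deduplication can be modeled with set unions in Python. The variable names below are hypothetical, all six cohorts are assumed to have a complete Day 1 (no asterisks), and this mirrors the arithmetic in this post rather than Amplitude's actual implementation.

```python
# Hypothetical data mirroring the example: users who fired the Starting
# event 'Play Song' on each day of the chart's date range.
cohorts = {
    "Jan 1": {"A", "B"},
    "Jan 2": {"A", "C"},
    "Jan 3": {"A", "D"},
    "Jan 4": {"A", "E"},
    "Jan 5": {"A", "F"},
    "Jan 6": {"A", "G"},
}

# Users in each cohort who fired the Return event on their Day 1.
returned_day1 = {day: {"A"} for day in cohorts}

# Deduplicate across cohorts: a user who appears in several rows counts once.
all_starters = set().union(*cohorts.values())
all_returners = set().union(*returned_day1.values()) & all_starters

overall = len(all_returners) / len(all_starters)

# For contrast, a plain average of the per-cohort rates would give 50%.
per_cohort = [len(cohorts[d] & returned_day1[d]) / len(cohorts[d]) for d in cohorts]
naive_average = sum(per_cohort) / len(per_cohort)

print(f"Unique users: {len(all_starters)}")               # 7
print(f"All Users Day 1 retention: {overall:.0%}")        # 14%
print(f"Plain average of cohort rows: {naive_average:.0%}")  # 50%
```

The gap between 14% and 50% comes entirely from User A being counted once in the top row instead of once per cohort.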

In summary, the rolled-up Retention percentage is calculated as the deduplicated count of users who performed the Return event divided by the deduplicated count of users who have completed Day N.

