
Hi, I have a question about an issue we noticed: a user remains identified (personal data such as email, name, surname and phone number is visible under the userId) after choosing to opt out of analytics (data collection). Under GDPR, the user must be able to choose whether to opt in to or out of analytics.

When a user registers and opts out of analytics, an anonymousId is generated and the user is not identified. If the user later opts in, a userId is generated and the user is identified (we can see their personal data, such as email, name, surname and phone number). If the user then opts out again some time later, their personal data is still visible. I am guessing that is because the amplitudeIds and deviceIds are being merged under one userId.

Is there a workaround so that the user's personal data is hidden or deleted after the user opts out of data collection?

Hi @Mateja Tomek , I found the same challenge with the way Amplitude merges users based on deviceID and userID. The solution we are going with is to apply an anonymous userID when a user opts out.  You could…

  1. set a new value for a user’s userID each time a user opts out.
    1. Amplitude will recognize the different userID and assign a different Amplitude ID, which is what it uses to count unique users in charts.
    2. any user properties associated with the new userID will not be merged with the user properties associated with the previous userID.
    3. unique users can still be counted, but will likely be overcounted since the user will accumulate multiple anonymous userIDs over time.
    4. activity from one opt-out period will not be tied to activity from a later one if they choose to opt out again, making them truly anonymous.
  2. set the new value for a user’s userID to a single anonymous user
    1. all anonymous activity from all users who have opted out will be aggregated under one user and so any single person’s activity will be difficult to follow.
    2. this one user can be easily removed from charts where you want to understand behavior of your users who have opted in.
    3. these users will not be individually counted as part of unique user counts, so unique user counts will be underestimated.
    4. building cohorts based on behavior will be difficult since one user under the anonymous userID might have done an activity that ends up bucketing all users under that ID.
  3. set a new value for a user’s userID one-way hashed from the device ID or some other unique property of the user.
    1. provides a unique user ID that remains consistent over time
    2. the “opt-out” userID is not and cannot be associated with the user’s “opt-in” activity.
    3. the user will technically have multiple user IDs, so counts of unique users will be inflated by the number of users who have both opted in and opted out.
    4. you can understand user behavior of the opt-out users more easily and even compare that to the behavior of the opt-in users.
    5. you are still technically collecting data on this person’s activity even if it cannot be tied back to them, so be mindful not to capture anything that could be PII even if the userID is anonymized.
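
To make the three options concrete, here is a minimal sketch of how each opt-out userID could be generated. Everything in it is an assumption for illustration: the function names, the `anon-` prefix and the shared ID value are hypothetical and not part of any Amplitude SDK. For Option 3 I am assuming a keyed hash (HMAC-SHA256 with a secret key) rather than a plain hash, since a plain hash of the device ID could be recomputed by anyone who knows the device ID.

```python
import hashlib
import hmac
import uuid

# Hypothetical value; any constant string would do for Option 2.
SHARED_ANONYMOUS_ID = "opted-out-user"


def fresh_opt_out_id() -> str:
    """Option 1: a brand-new random userID on every opt-out."""
    return f"anon-{uuid.uuid4().hex}"


def shared_opt_out_id() -> str:
    """Option 2: a single userID shared by everyone who opts out."""
    return SHARED_ANONYMOUS_ID


def hashed_opt_out_id(device_id: str, secret_key: bytes) -> str:
    """Option 3: a keyed one-way hash of the device ID.

    The same device always maps to the same opt-out userID, but the
    userID cannot be reversed into the device ID without the key.
    """
    digest = hmac.new(secret_key, device_id.encode("utf-8"), hashlib.sha256)
    return f"anon-{digest.hexdigest()[:32]}"
```

Whichever value you pick would then be passed to the SDK as the userID in place of the real one when the user opts out. With Option 3 the secret key needs to be stored somewhere stable, otherwise the opt-out userID changes and you are effectively back to Option 1.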

I like Option 3, personally.


Thank you @CCrowley for this detailed overview to help resolve this. We appreciate your contributions to this community! 😃


Hi @CCrowley,

Thank you for the detailed overview of the possible solutions, it is much appreciated. :)

We are currently brainstorming which approach could work for us. Most likely we will go with option 3 as well.

I am a bit confused by the two points below: you say that a user will have a unique user ID, but then you say that the user will have multiple user IDs.

  1. provides a unique user ID that remains consistent over time
  2. the user will technically have multiple user IDs, so counts of unique users will be inflated by the number of users who have both opted in and opted out.

When you say a unique user ID, you mean that a user will have one unique user ID when he is opted-in and another unique user ID when he is opted-out, is my understanding correct? 

Also, I am a bit worried about the user count, as it will be inflated. Did you find any way to count a user who has both opted in and opted out as a single user?

Thanks!
Mateja 


When you say a unique user ID, you mean that a user will have one unique user ID when he is opted-in and another unique user ID when he is opted-out, is my understanding correct? 

@Mateja Tomek , you are correct. With Option 3 a user would have a unique user ID (one that is not associated with any other person), but because the intention is not to tie their opt-in and opt-out activity together, that user would need two separate userIDs. This does mean that all users who use your product in both states (opt-in and opt-out) will be counted twice.

Unfortunately I don’t see a way around this, because if you were to know they are the same person in both states then the opt-out is not really anonymous and goes completely against the goal of GDPR.

You may want to consider how necessary it is to have a precise count of unique users. Once you have a process in place for defining users, and as long as it remains consistent, you can still use unique user counts to see general trends. The assumption is that the frequency of new users having both an opt-in and an opt-out userID does not change over time.

You may be able to use DeviceID to estimate unique user count (but this assumes that users only use one device to interact with your product).
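
As a rough illustration of the DeviceID idea, the sketch below counts distinct values over a handful of made-up event rows. The row layout and the `user_id` / `device_id` field names are hypothetical and are not meant to match Amplitude's export format.

```python
# Hypothetical event rows: one person used device-A both opted in and opted out.
events = [
    {"user_id": "user-123", "device_id": "device-A"},   # opted in
    {"user_id": "anon-9f2c", "device_id": "device-A"},  # same person, opted out
    {"user_id": "user-456", "device_id": "device-B"},   # a different person
]


def count_distinct(rows, field):
    """Count how many different values of `field` appear in the rows."""
    return len({row[field] for row in rows})


users_by_user_id = count_distinct(events, "user_id")      # 3, double-counts device-A's owner
users_by_device_id = count_distinct(events, "device_id")  # 2
```

The device-based count is only closer to the truth if each person really uses one device. Someone on a phone and a laptop would be counted twice this way as well.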

If you do feel you need to know the portion of users who have userIDs for both states, it may be possible to capture a user property on the opt-out userID indicating that the user had opted in at some point. It should be a plain boolean value that cannot be tied to the specific userID the user had in the opted-in state.
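
A minimal sketch of that flag, assuming you attach it as a user property on the opt-out userID. The property name `had_opted_in_before` and both helper functions are hypothetical.

```python
from typing import Dict, List

# Hypothetical property name; it carries no link to the opt-in userID.
OPTED_IN_BEFORE = "had_opted_in_before"


def opt_out_user_properties(had_opted_in_before: bool) -> Dict[str, bool]:
    """User properties to attach to the anonymous opt-out userID."""
    return {OPTED_IN_BEFORE: had_opted_in_before}


def share_with_both_states(opt_out_profiles: List[Dict[str, bool]]) -> float:
    """Fraction of opt-out userIDs whose owner had opted in at some point."""
    if not opt_out_profiles:
        return 0.0
    flagged = sum(1 for profile in opt_out_profiles if profile.get(OPTED_IN_BEFORE))
    return flagged / len(opt_out_profiles)
```

If each flagged user has exactly one opt-in userID and one opt-out userID (an assumption that only holds with a stable opt-out ID such as Option 3), subtracting the number of flagged opt-out userIDs from the total userID count gives a rough deduplicated estimate.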

 

I hope this helps!

