Question

Identity Resolution of Off-platform Events

  • 8 March 2024
  • 3 replies

Amplitude is a CDP. As a CDP, Amplitude is responsible for ensuring its platform can consume multiple data sources to form an accurate picture of the customer. The primary function of a CDP is identity resolution across these sources. There is a very common use case that I can’t seem to figure out with Amplitude. Consider this example:

A company decides to hold a webinar, hoping it will create engagement that increases conversion. Leads sign up for the webinar on a separate platform, e.g. Zoom or RingCentral. After the webinar, the company wants to know how attending the webinar impacted conversion. They are able to export a CSV of the attendee list with each attendee’s name, phone number, and email address. They upload the CSV to S3 to use the native Amplitude S3 integration.

Upon setting up S3 as a source, there’s a problem: the CSV contains no device ID or user ID, since these leads didn’t sign up on a platform where Amplitude could collect them.

I’ve tested this example without success. I generated a random device ID for each row in the CSV and assumed that Amplitude would perform identity resolution on the supplied email and phone user properties. Amplitude did not recognize phone or email as a unique user property and will not merge users based on these fields. Can you please help me understand how to properly identify these leads in Amplitude by their email address or phone number?

Note: I have already reviewed the following guides:

https://help.amplitude.com/hc/en-us/articles/115003135607-Track-unique-users

https://amplitude.com/blog/identity-resolution-insights

https://www.docs.developers.amplitude.com/data/sources/amazon-s3/#create-amazon-s3-import-source

https://www.docs.developers.amplitude.com/analytics/apis/aliasing-api/


3 replies


Hi @dargelies 

Identity resolution in Amplitude is currently based only on the device ID and user ID. I don’t think it will work based on email and phone alone. If these are user properties stored on a user in Amplitude, I would:

  • reconcile the email and phone user properties against your existing users and associate the user_id where one exists.
    • if all attendees of this webinar use your app with an assigned user ID, you should have 100% coverage here. If not, the non-user attendees will be anonymous users.
  • create an event for the webinar attendance with the appropriate ingest schema, attributed to the correct user_id.
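
A rough sketch of those two steps, in case it helps. It assumes you can export an email-to-user_id lookup from your own database; `build_attendance_events`, `user_ids_by_email`, the "Webinar Attended" event name, and the `webinar_name` property are all made up for illustration. The event keys (`user_id`, `event_type`, `time` in milliseconds, `event_properties`) are what I understand Amplitude's event format to be, so please check them against the S3 import docs before relying on this.

```python
from datetime import datetime, timezone


def normalize_email(email: str) -> str:
    """Trim and lowercase so the same address always matches."""
    return email.strip().lower()


def build_attendance_events(attendees, user_ids_by_email, webinar_name, attended_at):
    """Split attendees into events for known users and rows with no user_id.

    attendees: list of dicts with at least an "email" key (hypothetical CSV rows).
    user_ids_by_email: hypothetical lookup of normalized email -> user_id.
    """
    time_ms = int(attended_at.timestamp() * 1000)  # milliseconds since epoch
    matched, unmatched = [], []
    for row in attendees:
        user_id = user_ids_by_email.get(normalize_email(row["email"]))
        if user_id is None:
            # No known user: these are the leads that stay unresolved.
            unmatched.append(row)
            continue
        matched.append({
            "user_id": user_id,
            "event_type": "Webinar Attended",  # illustrative event name
            "time": time_ms,
            "event_properties": {"webinar_name": webinar_name},
        })
    return matched, unmatched


if __name__ == "__main__":
    rows = [{"email": " Ana@Example.com "}, {"email": "lead@example.com"}]
    events, leftovers = build_attendance_events(
        rows,
        {"ana@example.com": "u-1"},
        "Intro Webinar",
        datetime(2024, 3, 1, tzinfo=timezone.utc),
    )
    print(len(events), "matched;", len(leftovers), "without a user_id")
```

The `unmatched` list is the part this approach cannot attribute: attendees who have no user_id yet.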

Hi Saish,

Thank you for the reply. Yes, I’ve tested this and Amplitude does not reconcile based on just email and phone number.

The users are not in our system and do not have a user id. They are leads. We don’t believe it’s a good practice to create a user in your database for every lead that happens to show some interest in your product (consider every person that lands on your website). We create them after they’ve followed through with the intent to become a customer. Then it makes sense to create a user in our database and assign an ID to them.

From what I understand, the non-user attendees are not created as anonymous users in Amplitude. They cannot be ingested at all, because there is no device ID or user ID associated with them, and the entire S3 import fails. I believe they should be created as anonymous users in Amplitude and then reconciled when they become identified users at the conversion event, based on their deterministic matching identifiers: email and phone number.

Am I understanding correctly that Amplitude is not capable of reconciling users when given deterministic matching identifiers? If so, this seems like a pretty large gap in the functionality of a CDP, especially since this blog post discusses how Amplitude uses deterministic identifiers for reconciliation.

Hi @dargelies, apologies for the late reply, but I wanted to make sure you got an answer. Amplitude does indeed use deterministic identifiers for resolution. You may have noticed in this article that we merge on user ID only. Unfortunately, that means that unless you are willing to set the phone number or email as the “user ID”, users will remain anonymous (and unmerged). However, if you do not want to set the email as the user ID, you can hash the email and set it as the device ID, and keep the phone number and email as user properties. This will require you to add another column to your CSV but should solve your problem.

A consistent hash will keep a lead’s anonymous behavior merged across events (say, if they attend three different webinars), because the same email always produces the same device ID. This approach also lets you stay consistent with your internal systems and hold off on assigning a true user ID until they follow through.
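
As a minimal sketch of adding that column, assuming the attendee export has an `email` header and that a SHA-256 hex digest is acceptable as a device ID (the choice of hash, the `device_id` column name, and the helper names are my assumptions, not something Amplitude prescribes):

```python
import csv
import hashlib
import io


def email_to_device_id(email: str) -> str:
    """Hash a normalized email so the same lead always gets the same ID."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def add_device_id_column(src, dst, email_column="email"):
    """Copy CSV rows from src to dst, appending a device_id column."""
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames or []) + ["device_id"]
    writer = csv.DictWriter(dst, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        row["device_id"] = email_to_device_id(row[email_column])
        writer.writerow(row)


if __name__ == "__main__":
    # Illustrative data only; in practice src and dst would be open files.
    src = io.StringIO(
        "name,phone,email\n"
        "Ana,555-0100,Ana@Example.com\n"
        "Ana,555-0100,ana@example.com \n"
    )
    dst = io.StringIO()
    add_device_id_column(src, dst)
    print(dst.getvalue())
```

Normalizing before hashing matters here: without the trim and lowercase step, "Ana@Example.com" and "ana@example.com" would hash to different device IDs and the same lead would show up as two anonymous users.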

 

Hope this helps!
