
Hi, I have an integration between Segment and Amplitude.

One problem I am seeing: let's say we make two calls, “Identify” and then “Track”. We expect the track call to pick up the user properties set by identify. However, this is not happening for all events; some events do not carry any of the user properties we set earlier via identify.

When we analyze the “Track” events grouped by a user property, “country” for example, we see AU, US, etc., and also None. None should NOT be there. (Edit: to clarify, we set country info on all our users.)

Any idea why this is happening and how we can fix this? 

 

Regards,

Raymond

Thanks for posting here @raymondag. This similar thread should help as you investigate here. Please keep us posted if you need additional support. 

 


Thank you @Jeremie Gluckman. Those links are useful. From my investigation, it looks like Segment does not preserve the call order of `identify` and `track`, even though we call `identify` first on our end.

 

Any suggestions/advice from your end on how I could remediate this issue?

 


Thanks for circling back. I’ll escalate this to our support team, who can take a closer look.


Hi @raymondag

It sounds like you’re running into a classic ordering and timing issue that customers face when sending data from Segment to Amplitude. The calls need to reach us in a specific order so that we can ingest them in that order and apply the identify and group calls to the events accordingly. Historical data cannot be updated retroactively, so user properties and group properties are only applied to events that arrive after the identify and group calls.

Please allow me to share some information about this --

Here’s a statement about how we order and process events:
 

We do something more/better than eventual consistency. We guarantee in-order processing on a per-device-id basis.

For 1 specific device id, if we receive an identify event and then an event, we guarantee that the user properties on the identify event will show up on the event. The fact that they’re not seeing that means mParticle is not sending the events to us in order. The customer might be triggering identify and events in the order that they want, but depending on how mParticle built their integration with us and sends data to us, it might not be in order.
This is the same issue customers have if they use Segment.

If they use our SDKs, our SDKs always send calls in order on the device, so that won’t be an issue.

 

I don’t think I have a way of verifying this but the simplest explanation does seem to be that we’re just getting that identify after the event.

We guarantee ordering of events and identifies for each device_id from the point we receive it. (see above.) There’s not much we can do if someone in between us and the customer (such as Segment) aren’t providing the same guarantees.
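
To make the per-device ordering concrete, here is a toy model of the behaviour described above. It is only a sketch and not Amplitude's actual implementation: the `ToyIngest` class, the call dictionaries, and the event name are made up for illustration. It shows that an event only carries the user properties that were received before it.

```python
from collections import defaultdict


class ToyIngest:
    """Toy model of per-device, in-order processing (hypothetical, for illustration)."""

    def __init__(self):
        # Latest known user properties, keyed by device id.
        self.user_props = defaultdict(dict)
        # Processed events, each with a snapshot of the properties at receive time.
        self.events = []

    def receive(self, call):
        device = call["device_id"]
        if call["type"] == "identify":
            self.user_props[device].update(call["traits"])
        else:
            # Snapshot: later identifies do not change events already processed.
            self.events.append({
                "event": call["event"],
                "device_id": device,
                "user_props": dict(self.user_props[device]),
            })


identify = {"type": "identify", "device_id": "d1", "traits": {"country": "AU"}}
track = {"type": "track", "device_id": "d1", "event": "Order Completed"}

in_order = ToyIngest()
for call in (identify, track):
    in_order.receive(call)

out_of_order = ToyIngest()
for call in (track, identify):
    out_of_order.receive(call)

print(in_order.events[0]["user_props"])      # {'country': 'AU'}
print(out_of_order.events[0]["user_props"])  # {}
```

The second case is what shows up as "None" when grouping by country: the event was received before the identify, so it has no country attached.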


The engineer behind the first quote gave some further context about Segment's situation:

Segment is not guaranteed to send us events in order. This is a known issue. I’ve heard customers say that Segment’s workaround is to tell the customer to first send the identify event / set the user properties, wait like 10 seconds (in code), and then log the event, to guarantee a higher success rate of logging the stuff in order.
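
A minimal sketch of that workaround, assuming a hypothetical `client` object that exposes `identify(user_id, traits)` and `track(user_id, event, properties)`; substitute whatever wrapper you have around your Segment SDK. The 10-second default comes from the anecdote above and is not a documented value.

```python
import time


def identify_then_track(client, user_id, traits, event, properties=None,
                        delay_seconds=10.0, sleep=time.sleep):
    """Send identify, wait, then send the event.

    `client` is a hypothetical stand-in for your Segment SDK wrapper.
    The delay only raises the odds that identify arrives first; it is
    not a guarantee of ordering.
    """
    client.identify(user_id, traits)
    # Give the identify call time to be delivered before the event is sent.
    sleep(delay_seconds)
    client.track(user_id, event, properties or {})
```

Note that this blocks the caller for the whole delay, so in practice it probably belongs on a background thread or queue rather than in the request path.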

The reason their events are not guaranteed in order is simple. Their events might be stored in order, like this: [identify1, event1, event2], however they may parallelize the uploading of events to Amplitude.
Like this: worker1: [identify1], worker2: [event1, event2]. And worker 2 for whatever reason might succeed in sending the data to us before worker 1. And so as long as they are parallelizing the upload to us this way (which they don’t have any plans to change), this issue will remain.
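
The race described here can be sketched with a small deterministic simulation. This is not Segment's real upload code: the `arrival_order` helper, the worker names, and the latencies are all invented for illustration. All workers start at the same moment, so whichever batch has the lower upload latency lands first.

```python
def arrival_order(batches):
    """Return calls in the order they reach the receiver.

    Each batch is (worker_name, upload_latency_seconds, calls). Workers
    start simultaneously, so the batch with the lowest latency lands first.
    Calls inside one batch keep their original order.
    """
    landed = sorted(batches, key=lambda batch: batch[1])
    return [call for _, _, calls in landed for call in calls]


# Stored in order as [identify1, event1, event2], but split across two workers.
batches = [
    ("worker1", 0.30, ["identify1"]),         # slower upload
    ("worker2", 0.05, ["event1", "event2"]),  # faster upload
]

print(arrival_order(batches))  # ['event1', 'event2', 'identify1']
```

With a single worker carrying all three calls, the order would be preserved, which is why the split across parallel workers is the source of the problem.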


Hope this helps! Please let me know if you have any questions. 


Thank you @jmagg for the detailed explanation! That helps.

