Solved

Avoid duplicate event ingestion from Snowflake

  • 4 January 2024
  • 3 replies
  • 102 views

I’m ingesting data from Snowflake. During the POC, some duplicate events got loaded because we didn’t have the right warehouse size and loads were retried. I believe it was mentioned that if we provided a unique identifier for each event during ingestion, we could prevent this on future retries. However, I don’t see an option in the ingestion documentation for passing an event ID to Amplitude. Any ideas on how to add that?

 


Best answer by Saish Redkar 4 January 2024, 18:19


3 replies


Hi @Haniel Eliab López 

You’ll have to send an insert_id field for each record. You can find the additional supported fields in the HTTP docs - https://www.docs.developers.amplitude.com/analytics/apis/http-v2-api/#keys-for-the-event-argument
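For reference, in the HTTP V2 API each event object carries `insert_id` at the top level, alongside `user_id` and `event_type`. A minimal sketch of the request body (the API key, IDs, and values here are placeholders, not real ones):

```json
{
  "api_key": "YOUR_API_KEY",
  "events": [
    {
      "user_id": "user-123",
      "event_type": "purchase",
      "time": 1704326400000,
      "insert_id": "order-9876-purchase"
    }
  ]
}
```

The key point is that `insert_id` should be deterministic for a given logical event (e.g. derived from your source system's primary key), so a retried load produces the same ID and Amplitude can deduplicate it.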

Thanks @Saish Redkar ,

Just to validate: does this insert_id also apply to the Snowflake ingestion? We are not using the HTTP V2 API directly.
Thanks!


Yes, Amplitude’s Snowflake ingestion is based on the HTTP API underneath, so the same event fields apply.

We send the insert_id along with the other requisite fields/columns in our Snowflake Ingest config SQL.
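As an illustration, the ingestion SQL just needs to expose a column named `insert_id` along with the other event fields. A minimal sketch, assuming a hypothetical `analytics.events` table whose column names are placeholders for your own schema:

```sql
-- Hypothetical source table/columns; rename to match your schema.
SELECT
    user_id       AS "user_id",
    event_name    AS "event_type",
    event_ts_ms   AS "time",             -- epoch milliseconds
    event_pk      AS "insert_id",        -- stable unique key: dedupes retried loads
    event_payload AS "event_properties"  -- JSON/VARIANT column
FROM analytics.events;
```

Using the source table's primary key as `insert_id` means a retried ingestion run emits the same IDs, so duplicates are dropped instead of double-counted.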
