Avoid duplicate event ingestion from Snowflake

  • 4 January 2024
  • 3 replies

I’m ingesting data from Snowflake. During the POC we ran into duplicate events being loaded, caused by an undersized warehouse combined with load retries. I believe it was mentioned that if we provided a unique identifier for each event during ingestion, we could prevent this on any future retries. However, I don’t see an option in the ingestion documentation for providing an event ID to Amplitude. Any ideas on how to add that?



Best answer by Saish Redkar 4 January 2024, 18:19




Hi @Haniel Eliab López 

You’ll have to send an insert_id field with each record. You can find the additional supported fields in the HTTP API docs.

Thanks @Saish Redkar,

Just to validate: does this insert_id also apply to the Snowflake ingestion? We are not using the HTTP V2 API.


Yes, their Snowflake ingestion is based on the HTTP API underneath.

We send the insert_id along with the other required fields/columns in our Snowflake Ingest config SQL.
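
As a rough illustration, the ingestion SQL might look something like the sketch below. This is an assumption, not Amplitude's exact required schema: the table name (app_events) and column names are placeholders, and the set of output columns should be checked against the Snowflake ingestion docs for your setup.

```sql
-- Hypothetical Snowflake ingestion SELECT for Amplitude.
-- The key point is the insert_id column: a stable, unique value per
-- event row, so a retried load sends the same insert_id and Amplitude
-- can deduplicate rather than create a second event.
SELECT
    user_id          AS "user_id",
    event_name       AS "event_type",
    event_ts         AS "time",
    event_uuid       AS "insert_id",        -- stable per-event unique ID (placeholder column)
    event_properties AS "event_properties"  -- JSON/VARIANT column of event properties
FROM app_events;
```

Note that insert_id must be deterministic for a given event (e.g. a UUID stored with the row, not one generated at query time), otherwise each retry would produce a new ID and deduplication would not kick in.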