Solved

BigQuery ingestion issue



Hi, 

We recently integrated Amplitude in our app. We have a lot of older event data that we also wanted to ingest to avoid a cold start. We linked our BigQuery source and, after running the query, got success feedback with no errors. However, we can’t see any events in our dashboard. Why might this be happening and how do we fix this?


Best answer by eddie.gaona 13 July 2023, 21:09


14 replies


@Saish Redkar @Yuanyuan Zhang @timothy-permutable @wlaubernds hey, can any of you help with this?


Hi @Aman Kedia 

I haven’t used the BigQuery data source, but here’s how I debug my Snowflake data source ingestion:

  • The ingestion job log history is a good place to start. If you see a failed job, then you know where to look in your pipeline. But since your screenshot says the recent jobs have been successful, I would look further into the “General” tab of your BigQuery source.
  • Under the “General” tab, look at “Daily Events Uploaded by Source (in UTC)”. Ingestion jobs report success as long as all the rows the query returns get ingested, so if the source table itself has fewer rows than you expect, you’ll see a drop in the daily events uploaded chart (see the query sketch at the end of this reply).
  • If all of the above looks good, check your events under the tracking plan. Verify whether the newly ingested events appear under “Unexpected”. If they’re listed there, you need to add them to your tracking plan. It would also be worthwhile to check whether the newly ingested events are on your blocked event list, just in case.

    Once you make the required changes (if needed), you should see your ingested events in your chart dropdowns.

Hope this helps.
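
In case it’s useful, here’s the kind of sanity check I’d run directly in BigQuery before digging deeper into the connector. This is only a sketch under assumptions: `my-project.analytics.amplitude_events` and `server_upload_time` are placeholder names for your source table and its timestamp column.

```sql
-- Sanity check (placeholder table/column names): confirm the source table
-- actually has rows in the window you expect Amplitude to ingest.
SELECT
  DATE(server_upload_time) AS upload_day,
  COUNT(*) AS row_count
FROM `my-project.analytics.amplitude_events`
WHERE server_upload_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY upload_day
ORDER BY upload_day;
```

If this comes back empty (or much smaller than expected), the connector has nothing to pick up even though its jobs report success.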


Hey @Saish Redkar, thank you for responding. However, I checked all of this and nothing seems off.

 

  1. The chart in the General tab is a flat line, so the ingestion pipeline hasn’t worked at all.
  2. There are a few events under the “Unexpected” tab.
    1. These are all from a different source (app).
    2. Can you share some resources so I can read up more on what these unexpected events are?

But as things stand, our BigQuery pipeline isn’t ingesting any data into Amplitude.


Hi @Aman Kedia 

  1. Are there no events ingested from BigQuery at all, or only very few?
  2. Are you using full sync or time-based import?
  3. As Saish said, you should also check your tracking plan. Did you set the schema to reject unplanned events?
  4. Are you under the 2,000 event type limit? (A quick query sketch for checking this is below.)
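
For point 4, a rough way to check straight from BigQuery (placeholder table and column names; assumes your source table has an `event_type` column):

```sql
-- Count distinct event types in the source table (placeholder names),
-- to compare against Amplitude's event type limit.
SELECT COUNT(DISTINCT event_type) AS event_type_count
FROM `my-project.analytics.amplitude_events`;
```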

Hey

  1. No events ingested at all (screenshot attached).
  2. Time-based import. The other option isn’t applicable for this use case (we can’t even select it; it’s disabled).
  3. How do I check that? And had that been the case, they would show up in the “Review Unexpected” section, right?
  4. Yes, only 441 events. 

     


@Aman Kedia 

If the ingestion was successful, these new events would appear as unexpected in your tracking plan, unless the “reject unplanned events” action is selected. But it looks like the daily events uploaded chart for this source is empty.

If you had that setting enabled, you would have received a notification that the events were rejected.

 

On the pipeline side, can you verify that your SQL is using the correct schema and table from BigQuery?

As mentioned earlier, if the source table has no rows to return but the connection query still runs, your logs will always indicate success.

Linking the BQ import docs here just in case: https://www.docs.developers.amplitude.com/data/sources/bigquery/?h=bigqu
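
For reference, the shape of import SQL I’d double-check is roughly the following. This is only a sketch: the project/dataset/table and column names are placeholders, and the exact fields Amplitude expects are listed in the BigQuery import docs above, so treat those as the source of truth.

```sql
-- Rough sketch of an import query (placeholder names throughout).
-- Verify the project, dataset, and table match what the Amplitude
-- connector was pointed at, and that the selected columns line up
-- with the fields described in the BigQuery import docs.
SELECT
  user_id,
  event_type,
  event_properties,    -- assumed JSON string column
  user_properties,     -- assumed JSON string column
  server_upload_time   -- real stored TIMESTAMP column
FROM `my-project.analytics.amplitude_events`;
```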


Wouldn’t the upload show an error if there were a schema issue? Before finishing the setup, it shows a sample of 100 rows, which looked fine (refer to point 7 in the link you shared). So I’m not sure the schema is incorrect. And per the doc, that should have been the final check, and ingestion should happen if everything up to that step was fine.


You might want to raise a support ticket with the Amplitude team, since the most common items in the debugging checklist seem to be checked off. The Support Team will be able to analyze your ingestion logs in detail and pinpoint the exact cause.

@Esther Trapadoux can help convert this into a support ticket.


@Esther Trapadoux Please help us with this. 


@Aman Kedia sorry about the long delay!! Can you actually file the bug request here? That way the team can properly track and follow up with you. 🙏


@Aman Kedia Hope all is well!

We were able to look into your setup, and you’re using `CURRENT_TIMESTAMP` for the time-based import. That’s why there isn’t any data in your project.

We advise the following within the UI when setting up the import:

Timestamp column (required for time-based import only): usually a server upload time or database insertion time, and it must be a TIMESTAMP. In other words, provide a valid created_at timestamp column to use time-based import. CURRENT_TIMESTAMP or NOW is not a valid value.
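
To illustrate the difference, here is a sketch with placeholder table and column names. The first query derives its timestamp at query time, which presumably leaves the import nothing stable per row to track progress against; the second selects a real stored TIMESTAMP column, which is what the time-based import needs:

```sql
-- Not valid for time-based import: CURRENT_TIMESTAMP() is computed at
-- query time rather than stored on each row.
SELECT user_id, event_type, CURRENT_TIMESTAMP() AS created_at
FROM `my-project.analytics.amplitude_events`;

-- Valid shape: a real stored TIMESTAMP column (e.g. server upload time or
-- database insertion time), referenced as the timestamp column in the
-- import setup. Names here are placeholders.
SELECT user_id, event_type, server_upload_time AS created_at
FROM `my-project.analytics.amplitude_events`;
```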

 

The documentation on time-based imports can be found here.


Thanks for jumping in, @eddie.gaona!


Hi, 

@eddie.gaona thanks for helping out! Followed your advice and was able to resolve the issue. We can ingest events now. Thanks a lot!
