Hi @bhill
I’m interpreting “labels” and “values” in your question as referring to event names, their properties, and property values.
Having a consistent nomenclature for your events and properties is one of the most important pillars of a data governance strategy. It’s a luxury to have that nailed down before data collection starts, but that isn’t the case with every implementation.
The most common reason is that the different applications tracked in a codebase usually have distinct functionalities/experiences and unaligned implementation timelines, which causes the naming conventions to be managed in a siloed and chaotic manner.
Since ingested data in Amplitude is immutable, the usual fix is renaming the events according to the newly agreed-upon naming conventions at the UI (Data) level. This won’t change the raw data and has to be done on each individual project.
The Snowflake data export seems like a good approach if you are planning to clean up your event/property naming by transforming the data in your Snowflake warehouse and then reingesting those events (you can set up a Snowflake import source or use the Batch Event Upload API).
I’m assuming your team is okay with starting reporting fresh in the new projects where this cleaned data will be ingested, if you take the Snowflake export/import approach.
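As a rough illustration of the transform-and-reingest step, here is a minimal sketch that renames warehouse rows to a canonical taxonomy and posts them to the Batch Event Upload API. The event names, mapping, and row fields are hypothetical placeholders — your real mapping would come out of the audit described below — and only the `https://api2.amplitude.com/batch` endpoint and payload shape come from Amplitude’s docs:

```python
import json
import urllib.request

# Hypothetical mapping from legacy event names to the agreed-upon taxonomy.
EVENT_NAME_MAP = {
    "btnClick_signup": "Sign Up Clicked",
    "signup-button-pressed": "Sign Up Clicked",
}

def rename_event(row):
    """Turn one exported warehouse row into a Batch Event Upload entry
    with the canonical event name (unknown names pass through unchanged)."""
    return {
        "user_id": row["user_id"],
        "event_type": EVENT_NAME_MAP.get(row["event_type"], row["event_type"]),
        "time": row["event_time_ms"],  # epoch milliseconds
        "event_properties": row.get("event_properties", {}),
        "insert_id": row["insert_id"],  # preserved so Amplitude can deduplicate
    }

def reingest(rows, api_key):
    """POST the renamed events to the Batch Event Upload API."""
    payload = json.dumps(
        {"api_key": api_key, "events": [rename_event(r) for r in rows]}
    ).encode()
    req = urllib.request.Request(
        "https://api2.amplitude.com/batch",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Keeping the original `insert_id` is deliberate: if the reingest job is rerun, Amplitude treats the duplicates as the same event rather than double-counting them.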
Here are a few steps to keep in mind while taking the above approach:
- Audit your existing events and their naming across all your projects and see where you can bridge the gaps. This can be done by grouping together events which point to the same functionality and experience in your product(s) and using event properties smartly.
- If each of your projects is a distinct app in your product portfolio, introduce a property which can attribute the events to that app.
- Amplitude offers a Cross Project Analysis view as a paid feature if you want to analyze all your projects in one place. Point #2 could prove helpful if you decide to try this feature out. Otherwise, there is always a projectID property attached to your events in cross-project views.
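The first two steps above can be sketched in code. This is only an illustration — the canonical names, the `"app"` property, and the helper names are assumptions, not anything Amplitude prescribes — showing how an audit can surface naming drift and how an app-attribution property (point #2) can be stamped onto events:

```python
from collections import defaultdict

# Hypothetical audit table: variant names that describe the same user action,
# collapsed onto one canonical event name.
CANONICAL = {
    "checkout_started": "Checkout Started",
    "beginCheckout": "Checkout Started",
    "start-checkout": "Checkout Started",
}

def audit(events):
    """Group raw event names under their canonical name so you can see
    which functionality is tracked under multiple spellings."""
    groups = defaultdict(set)
    for e in events:
        canon = CANONICAL.get(e["event_type"], e["event_type"])
        groups[canon].add(e["event_type"])
    return {canon: sorted(raw) for canon, raw in groups.items()}

def tag_app(event, app_name):
    """Add an 'app' event property (point #2) without clobbering
    any existing event properties."""
    props = dict(event.get("event_properties", {}))
    props["app"] = app_name
    return {**event, "event_properties": props}
```

A canonical name mapped to more than one raw spelling in the audit output is exactly the gap to bridge before reingesting.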
You can also read through this taxonomy guide for best practices.
Hope this helps.
@bhill Just to confirm, we are taking a similar approach. As @Saish Redkar indicated, we have a set of standards that all teams must conform to in a development project before their work is released from QA to a production project. This approach allows us to validate conformance before they begin streaming data that will eventually hit our warehouse.
This way, we really just have to update keys once we’re through the QA process without adding more time to developers’ plates. The big addition we’re after now is better tooling to route schema validation issues back to the developers who are ultimately responsible for the fixes. QA’ing data is still a fairly manual process even with the Govern/Data features.
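For what it’s worth, the routing tooling described above could start as something quite small. This is a hypothetical sketch — the schemas, team names, and required properties are all invented for illustration — that checks events against a lightweight per-event schema and buckets violations by the owning team instead of leaving QA as a manual pass over the data:

```python
from collections import defaultdict

# Hypothetical tracking plan: each planned event lists its owning team
# and the event properties it must carry.
SCHEMAS = {
    "Sign Up Clicked": {"owner": "growth-team", "required": {"plan", "referrer"}},
    "Checkout Started": {"owner": "payments-team", "required": {"cart_value"}},
}

def route_violations(events):
    """Return {team: [issue, ...]} for events that miss required properties
    or that don't appear in the tracking plan at all."""
    issues = defaultdict(list)
    for e in events:
        schema = SCHEMAS.get(e["event_type"])
        if schema is None:
            # No owner on record: flag for the governance team to triage.
            issues["governance-team"].append(f"unplanned event: {e['event_type']}")
            continue
        missing = schema["required"] - set(e.get("event_properties", {}))
        if missing:
            issues[schema["owner"]].append(
                f"{e['event_type']} missing {sorted(missing)}"
            )
    return dict(issues)
```

The per-team buckets are what make this routable: each list can be dropped straight into that team’s ticket queue or Slack channel rather than a shared QA report.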