Alex Magnusson is a leading expert in the field of product analytics, specializing in helping companies leverage the power of Amplitude to make data-driven decisions. He is the co-founder and CEO of Magnusson Analytica, a consultancy that provides Amplitude implementation, training, and strategic guidance to clients worldwide. If you want 1:1 help and a free chat, book time with Alex here!

In this Amplitude Messy Data Cleanup Hour, we covered the essential steps for maintaining clean, reliable, and actionable data in Amplitude, including common challenges such as overwhelming event lists, inconsistent naming conventions, data validation, and the ongoing task of aligning tracking with evolving business objectives.

 

For more help, and to chat with Alex and other users, join our Slack community!

 


 

Amplitude Data Cleanup Checklist

Here's a handy checklist to help you audit, clean, and govern your Amplitude data:

I. Establish a Data-Driven Mindset:

II. Conduct a Thorough Data Audit:

III. Implement Robust Data Governance:

IV. Clean Up and Optimize Event and Property Structure:

  • Consolidate Redundant Events: Merge events that capture very similar actions, using event properties to distinguish variations instead of creating separate event types.
    • For example, instead of having separate events for "View Product Page," "View Homepage," and "View Checkout Page," use a single "View Page" event with a "page_name" property to specify the page being viewed.
  • Merge Inconsistent Properties: Use Amplitude's Transform feature to merge properties that have different names but represent the same data point. This keeps data consistent and simplifies analysis. Once the new, consolidated property has accumulated enough historical data for your reporting needs, delete the old, inconsistently named properties.
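
As a rough illustration of both points above, here is a minimal Python sketch that could run on events before they are sent to Amplitude, or offline on an export. It is not Amplitude's Transform feature or any Amplitude API. The mapping table, the function names `to_snake_case` and `consolidate`, and the "product"/"homepage"/"checkout" values are hypothetical and only show the idea.

```python
import re

# Hypothetical mapping: legacy event name -> value for the "page_name" property.
LEGACY_PAGE_EVENTS = {
    "View Product Page": "product",
    "View Homepage": "homepage",
    "View Checkout Page": "checkout",
}


def to_snake_case(name):
    """Normalize camelCase or space/hyphen-separated names to snake_case."""
    # Insert "_" between a lowercase letter or digit and a following uppercase letter.
    s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    # Replace runs of spaces or hyphens with a single underscore.
    s = re.sub(r"[\s\-]+", "_", s)
    return s.lower()


def consolidate(event_type, properties):
    """Return (event_type, properties) with one "View Page" event and snake_case keys.

    Note: if an event carries both "productId" and "product_id",
    the key that appears later in the dict wins.
    """
    props = {to_snake_case(key): value for key, value in properties.items()}
    if event_type in LEGACY_PAGE_EVENTS:
        props["page_name"] = LEGACY_PAGE_EVENTS[event_type]
        return "View Page", props
    return event_type, props
```

For example, `consolidate("View Product Page", {"productId": 42})` yields a "View Page" event whose properties contain `product_id` and `page_name`. Events that are not in the mapping keep their name and only have their property keys normalized.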

V. Prioritize the Amplitude Tracking Plan:

  • Collaborate with Developers Early: Work closely with developers before they start writing tracking code to make sure they understand your naming conventions, expected data types, and other requirements outlined in your tracking plan. 
  • Enforce Data Type Consistency: Be explicit about the data type (e.g., string, integer, boolean) expected for each property in your tracking plan. Communicate these requirements clearly to your developers to prevent errors that can arise from mismatched data types.
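
One way to make the data type requirement concrete is to keep the tracking plan in a machine-readable form and check events against it in tests or in a development environment. The sketch below is an assumption about how that could look, not an Amplitude feature: the `TRACKING_PLAN` structure, the property names, and `validate_event` are all hypothetical.

```python
# Hypothetical tracking plan: event name -> {property name: expected Python type}.
TRACKING_PLAN = {
    "View Page": {
        "page_name": str,
        "is_logged_in": bool,
        "item_count": int,
    },
}


def validate_event(event_type, properties, plan=TRACKING_PLAN):
    """Return a list of human-readable problems; an empty list means the event matches the plan."""
    expected = plan.get(event_type)
    if expected is None:
        return [f"unplanned event: {event_type}"]

    errors = []
    for name, value in properties.items():
        if name not in expected:
            errors.append(f"unplanned property: {name}")
            continue
        want = expected[name]
        # Compare exact types: in Python, bool is a subclass of int, so
        # isinstance() would let True pass as an integer and hide a mismatch.
        if type(value) is not want:
            errors.append(
                f"{name}: expected {want.__name__}, got {type(value).__name__}"
            )
    return errors
```

A check like this flags the case where the same property arrives as a boolean in one place and a number in another, for example `validate_event("View Page", {"is_logged_in": 1})` reports a type mismatch for `is_logged_in`.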

VI. Test New Tracking in a Development Environment:

 

Timestamped Version:

  • 0:00 - Introductions and Session Overview
  • 2:00 - The Importance of Regular Data Cleanup
  • 3:45 - Too Many Events: Streamlining Your Event List
  • 10:45 - Inconsistent Naming: Standardizing Conventions and Enforcing Them
  • 15:50 - Finding Incorrect Data and What to Do About It
  • 18:30 - Alignment with Objectives (plus taxonomy spreadsheet)
  • 20:38 - Bugs and Errors: Identifying and Fixing Tracking Issues
  • 22:35 - Get 1:1 help or help from the user community
  • 24:58 - Question: I have this page view event, but it doesn't contain enough detail, so I end up adding another event for a specific page view, such as a product page view. Is this correct?
  • 26:06 - Question: Should I keep the page view event?
  • 28:43 - Question: How do you fix errors when properties are passed through in different formats (e.g. same property gets values passed through boolean and number)?
  • 32:13 - Question: How do you get devs to follow a specific naming convention?
  • 36:46 - Question: Can you show how to easily fix that when you have Boolean versus string values, or platform values like "Android" with a capital A and a lowercase a, and so on? And is it possible to generalize it?
  • 40:02 - Question: How do you fix raw event data? Do you recommend exporting it all, fixing it manually, and then importing the fixed data into Amplitude?
  • 46:50 - Question: We have inconsistent naming for properties and would like to merge all camelCase and snake_case properties to be considered snake_case. With >100 properties, is there an easy way to do this?
  • 49:25 - Question: Is there a way to build a formula or calculation between two events that have the same property, for example to calculate the conversion rate between two events while holding one property constant?
  • 54:30 - Question: Would you recommend the same pattern you explained for handling Forms? So Forms Started, Forms Submitted and with a property defining the Form Name. The target would be to build a funnel and group by Form Name to look at the conversions of different forms
  • 57:11 - Question: How do you find a good balance between maintaining this kind of clean event architecture and still being able to derive insights that may not be the focus for the quarter?