Ask Me Anything about CDI with Arpit Choudhury, Founder at astorik

  • 28 February 2022
  • 25 replies
  • 240 views

Join us on Tuesday, March 15th, 2022 for an AMA with Arpit all about CDI. 

Getting customer data infrastructure (CDI) right is key to a successful product-led growth strategy. Product and growth professionals need to understand CDI to derive insights from data efficiently and act upon those insights effectively.

To that end, Arpit Choudhury, the founder of astorik (a community for folks exploring the modern data landscape), is answering your questions about CDI! Please send in your questions for Arpit by posting them in the comments. See you on March 15th at 9 am PT, when @arpitc will be back here to answer your questions live!

Here are a few articles to review before the event. 

 

RSVP here! 

 


Looking forward to this!

Same here, @arpitc! :grinning:

Hello! Before we get started with this event later this morning, I'd like to acknowledge the challenges faced by those currently impacted by violence, war, and the ongoing COVID-19 pandemic. It is so important to be gracious to ourselves and others during this time, and I appreciate everyone taking the time to be a part of this AMA conversation today.

We’ve gathered your questions and the AMA will be in Q1, A1 format. Folks are welcome to ask additional questions too. See you later! 

 

 

We’re going to get started with our first question in 5 min! :grinning:

Here we go! Q1: What is CDI and why is it relevant?

Thanks, Jeremie, for the acknowledgment above; I’m totally with you. It’s such a strange time, and I’ve personally been struggling to stay productive. I’ve just been glued to the news, hoping things get better soon.

In any case, I’m glad to be doing this and look forward to providing unbiased answers to people’s questions. 

A1: A CDI (customer data infrastructure) is a purpose-built solution for collecting behavioral data from primary, or first-party, data sources. Your core product (websites and apps), powered by proprietary code, is a primary or first-party data source.

CDIs are extremely relevant today as more and more companies are trying to understand how users interact with their product and build personalized customer experiences, both of which can only be done if behavioral data is collected and then synced to analysis and activation tools.
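
For readers who think in code, here is a minimal Python sketch of the idea: an event is captured inside your own product and then fanned out to an analysis tool and an activation tool. Every name in it (`BehavioralEvent`, `TinyCollector`, `track`, the two inbox lists) is hypothetical and made up for illustration; it is not the API of Segment, Snowplow, or any other vendor.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable


@dataclass
class BehavioralEvent:
    """One user action captured inside your own product (a first-party source)."""
    user_id: str
    name: str
    properties: dict[str, Any] = field(default_factory=dict)
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class TinyCollector:
    """Hypothetical stand-in for a CDI: collect events, then fan them out."""

    def __init__(self) -> None:
        self.destinations: list[Callable[[BehavioralEvent], None]] = []

    def add_destination(self, send: Callable[[BehavioralEvent], None]) -> None:
        self.destinations.append(send)

    def track(self, user_id: str, name: str, **properties: Any) -> BehavioralEvent:
        event = BehavioralEvent(user_id=user_id, name=name, properties=properties)
        # Sync the same event to every registered destination.
        for send in self.destinations:
            send(event)
        return event


analytics_inbox: list[BehavioralEvent] = []   # stands in for an analysis tool
activation_inbox: list[BehavioralEvent] = []  # stands in for an activation tool

collector = TinyCollector()
collector.add_destination(analytics_inbox.append)
collector.add_destination(activation_inbox.append)

event = collector.track("user_123", "Report Exported", format="csv")
```

A real CDI adds much more on top of this (batching, retries, schema validation, governance), which is the part that is hard to build yourself.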

Q2: How is a CDI different from a CDP (customer data platform)?

A2: A CDP is essentially a layer on top of a CDI that comes with a visual interface for combining data from multiple sources and building audiences to sync with downstream tools. Additionally, CDPs store a copy of the data they collect and offer out-of-the-box identity resolution.

Here's a simple formula to understand how CDI and CDP are related: CDP = CDI + Identity Resolution + Data Storage + Visual Audience Builder + Integrations to sync data to third-party destinations.

Here’s an interesting discussion on this exact topic.

Q3: Can a CDI be used to collect data from third-party tools?

A3: While not a core functionality, some CDIs do offer source integrations with a handful of third-party tools. It's good to keep in mind that even when two separate CDIs claim to support a particular data source, their capabilities might differ and the best way to understand what's possible is to read the docs.

Here is a list of third-party tools that Segment supports as data sources and here are all the integrations offered by Snowplow (only about 10 third-party tools as sources). 

Q4: What is the difference between a CDI and an ELT tool?

A4: I get this question a lot, especially because there are vendors that offer both CDI and ELT capabilities.

Here’s how I like to explain the core difference: 

Your core product, powered by proprietary code, is a primary or first-party data source, and CDIs are purpose-built to collect behavioral data from first-party data sources.

ELT tools, on the other hand, are purpose-built to extract all types of data from third-party tools (secondary data sources) and load that data into a cloud data warehouse, where it is eventually transformed and modeled for analysis and activation.

Q5: What are the benefits of using a CDI as compared to a custom tracking solution?

A5: My answer to this question is slightly opinionated.

Firstly, I'd say all the build-vs-buy arguments apply here, and with the progress CDIs have made over the last couple of years, I'm definitely in the buy camp.

However, based on my own experience, the biggest benefit of using a CDI lies in the monitoring and data governance capabilities, which are hard to replicate with a custom solution. I’d say that unless a company has really large data volumes, where the cost of a ready-made solution might be prohibitive, there's no good reason to build and maintain a custom tracking solution that is bound to break at some point and create data quality issues.

Q6: Can a CDI be used for data activation purposes?

A6: They certainly can. CDIs that support third-party tools as destinations can be used to activate the data they collect, and in real time at that.

Segment's CDI offering, Connections, can be used to sync and activate data in a broad range of third-party destinations. Most other CDIs primarily support data warehouses as destinations, with a limited number of third-party integrations.

Q7: Do CDIs also store the data they collect?

A7: Some do and some don't. CDI vendors that also offer CDP capabilities as add-ons (Segment and mParticle) generally store a copy of the data whereas CDIs focused on data warehouses as destinations don't.

Q8: If we use Segment Connections, which offers third-party destinations, do we still need a Reverse ETL tool?

A8: You need a Reverse ETL tool only if you want to build data models in the data warehouse before syncing those models to third-party destinations. Essentially, if you want to sync raw events to third-party tools for activation purposes (ideal for real-time use cases), you don't need a Reverse ETL tool.
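
To make the two paths concrete, here is a rough Python sketch under simplifying assumptions. The sample events, the function names `forward_raw` and `build_model`, and the "two or more exports" rule are all invented for illustration; a real data model would typically be written in SQL inside the warehouse.

```python
from collections import Counter

# Hypothetical raw behavioral events, as a CDI might collect them.
raw_events = [
    {"user_id": "u1", "event": "Report Exported"},
    {"user_id": "u2", "event": "Report Exported"},
    {"user_id": "u1", "event": "Report Exported"},
]


def forward_raw(events):
    """Path 1: pass each raw event straight to a third-party destination."""
    return [dict(e) for e in events]


def build_model(events):
    """Path 2: aggregate raw events into one modeled row per user.

    Syncing rows like these from the warehouse to a third-party tool
    is the job a Reverse ETL tool does.
    """
    counts = Counter(e["user_id"] for e in events)
    return [
        {"user_id": user_id, "exports": n, "power_user": n >= 2}
        for user_id, n in sorted(counts.items())
    ]


sent_raw = forward_raw(raw_events)      # three events, unchanged
modeled_rows = build_model(raw_events)  # one row per user
```

If the destination only needs what `forward_raw` produces, the CDI alone covers it; once it needs something shaped like `modeled_rows`, the modeling and the sync back out of the warehouse come into play.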

Q9: How is collecting data using a product analytics tool like Amplitude different from using a CDI?

A9: The process of collecting data is pretty similar. The primary difference lies in the destinations: a CDI can send data to many of them, whereas a product analytics tool is itself a destination and doesn't offer as many integrations with external tools. Pricing is another factor to consider.

Thank you @arpitc for all of these insights. That’s all of the questions we have today.

This has been such a great conversation. Everyone, please be sure to explore Arpit’s latest writing on the Amplitude blog! :grinning:

Thank you, @Jeremie Gluckman, for having me; this was fun! I love answering questions about data and will be happy to answer future questions too.

Also, folks who have questions about other data tools and technologies can get expert answers on the astorik Q&A community.

Cheers!
