Presently, the schema validation errors feature has limited utility, as there are only three ways to see this data:
- in the event plan
- via email
- via Slack
The biggest issue for my use cases is that the developers responsible for resolving these errors are rarely in the Amplitude event plan, as they typically handle their work in sprints where Amplitude work is blended with other development needs. It’s not helpful for me to route the email notifications to them and blow up their inboxes for weeks on end until the Amplitude fixes can be scheduled within a sprint.
Lastly, the Slack integration is flatly blocked in our organization because non-org domains are not allowed to send email-based alerts to channels.
At scale, we understand that some level of schema misalignment is likely, but viewing this information intuitively across multiple projects is a challenge. What to do with the violation data becomes heavily personal, and I realize there is no one-size-fits-all implementation for notification settings. As such, we would find it extremely valuable to have a stream of events flow out of the platform based on the validation errors that are generated from schema violations.
Either webhooks or an S3/data lake stream, much like what is already available for event data, would be ideal.
While this could be a high volume of data, it would allow us to capture and route this information to anywhere we deemed appropriate and track trends on error rates for production projects.
Long term, we would love to send this data stream to a platform such as https://prometheus.io/ or a similar tool, so that we can capture schema error rates at a point in time and over time. This is very similar to application performance monitoring tools such as Datadog or New Relic, and would greatly extend our capability to capture and detect systemic issues at scale, across dozens, potentially hundreds, of projects.
I realize that there is some overlap between this request and the custom monitors feature, but the reality is that most of the rules I want to track are already in the Event Plan Schema, and the current process of rebuilding these same rules within a chart context to get granular alert notifications is simply not sustainable.