Healthcare

Healthcare Data Platform: 10 Million Events Per Day

A Canadian health analytics company replaced failing 14-hour batch jobs with a real-time streaming platform processing 10 million health events per day under PIPEDA data…

🏢 HealthMetrics 📍 Canada ⏱ 5 months
  • 10M events/day
  • 400ms P99 query time
  • Zero downtime in 6 months

""We went from yesterday's data to data we can act on today. That changes clinical decisions." — Dr. Michael Okafor, Chief Medical Officer"

The Challenge

HealthMetrics relied on nightly batch jobs that took 14 hours to complete, so by the time clinicians received insights, the data was almost a day old. The pipeline also failed regularly mid-run with no alerting; the team discovered issues only when clinicians complained about missing dashboards.

Our Approach

We built a streaming-first architecture on Apache Kafka, with Canadian data residency (PIPEDA) factored into every data-flow decision.

  • Apache Kafka replacing batch jobs with real-time event streaming at 10M events/day
  • Apache Flink for stateful stream processing with exactly-once processing semantics
  • TimescaleDB for time-series storage with automatic data retention policies
  • All infrastructure deployed exclusively in AWS ca-central-1 for PIPEDA compliance
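
For a rough sense of scale, the 10M events/day figure can be converted into a per-second rate. The sketch below shows only that arithmetic. The function names are hypothetical, and the 5x peak factor is an illustrative assumption, not a number from this project; real traffic is rarely spread evenly across the day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_events_per_second(events_per_day: int) -> float:
    """Average rate if events were spread evenly across the day."""
    return events_per_day / SECONDS_PER_DAY


def peak_events_per_second(events_per_day: int, peak_factor: float) -> float:
    """Rate at a busy moment, given an assumed peak-to-average ratio."""
    return average_events_per_second(events_per_day) * peak_factor


daily_events = 10_000_000  # figure quoted in this case study
assumed_peak_factor = 5.0  # hypothetical; not taken from the project

print(f"average: {average_events_per_second(daily_events):.1f} events/s")
print(f"assumed peak: {peak_events_per_second(daily_events, assumed_peak_factor):.1f} events/s")
```

On average that is about 116 events per second, so provisioning is likely driven by burst handling and processing guarantees more than by the daily total.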

Results

Pipeline latency dropped from 14 hours to under 5 minutes. P99 query time is 400ms. The platform had zero unplanned downtime in the first 6 months of production operation.
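
For readers unfamiliar with the metric, a P99 query time of 400ms means that 99% of queries finished in 400ms or less. The sketch below shows one common way to compute a percentile, the nearest-rank method. The sample latencies are synthetic and the function name is hypothetical; the case study does not say how the team measured its P99.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Synthetic latencies in milliseconds: 98 fast queries and 2 slow ones.
latencies_ms = [120] * 98 + [400] * 2

print(percentile_nearest_rank(latencies_ms, 50))  # median: 120
print(percentile_nearest_rank(latencies_ms, 99))  # P99: 400
```

The median hides the slow queries entirely, which is why tail percentiles such as P99 are usually reported for interactive dashboards.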

Technologies Used

Python, Apache Kafka, Apache Flink, TimescaleDB, PostgreSQL, Grafana, AWS