Case Study: Real-time Data Streaming & Analytics with Apache Kafka & Apache Druid

Real-time Data Streaming & Analytics with Kafka & Druid

The customer is a global manufacturer of IoT gateways, controllers, and distributed energy and sub-metering products.

This project collected and analysed various types of real-time events from the customer's IoT devices installed on telecom towers across multiple locations.

challenge

The customer has a wide range of smart dataloggers, gateways and controllers deployed as IoT devices on telecom towers. They wanted to measure and analyse real-time energy consumption data, including faults, for each telecom tower.

The events and signals generated in real time are consolidated and sent to multiple adapters over HTTP/HTTPS. This requires immediate, seamless communication between the event processing system and a downstream real-time analytics database that delivers sub-second queries on streaming data.

solution

  • We integrated Apache Kafka with third-party systems through their APIs to collect the real-time events that the IoT devices installed on telecom towers generate and consolidate continuously. We also integrated a schema registry to validate data in real time whenever it is published from the adapters.
  • We developed a producer that fetches data from the adapters and publishes it continuously to multi-cluster brokers.
  • We developed the Kafka supervisor spec, which Apache Druid requires for streaming ingestion from Kafka.
  • We configured Druid with the Kafka brokers to fetch published events continuously for analysis with Druid's SQL query engine.
  • We configured Apache Druid with Amazon S3 and HDFS as deep storage, retaining events for future analysis.
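To make the supervisor spec step more concrete, here is a minimal sketch in Python of what such a spec might look like. It is not the spec used in the project. The datasource name, topic name, broker addresses and column names (tower_id, device_id, event_type, fault_code, energy_kwh) are all assumptions chosen for illustration. The overall layout (dataSchema, ioConfig, tuningConfig) follows Druid's documented Kafka ingestion format, and a spec like this is normally submitted as JSON to the Overlord's /druid/indexer/v1/supervisor endpoint.

```python
import json


def build_kafka_supervisor_spec(datasource, topic, bootstrap_servers, dimensions):
    """Return a Druid Kafka supervisor spec as a plain dict.

    All argument values used below are hypothetical; only the overall
    structure follows Druid's Kafka ingestion spec format.
    """
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # Assumes each event carries an ISO-8601 "timestamp" field.
                "timestampSpec": {"column": "timestamp", "format": "iso"},
                "dimensionsSpec": {"dimensions": dimensions},
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "HOUR",
                    "queryGranularity": "NONE",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
                "useEarliestOffset": True,
                "taskCount": 1,
                "replicas": 1,
                "taskDuration": "PT1H",
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


spec = build_kafka_supervisor_spec(
    datasource="tower_energy_events",               # hypothetical name
    topic="tower-energy-events",                    # hypothetical topic
    bootstrap_servers="kafka-1:9092,kafka-2:9092",  # hypothetical brokers
    dimensions=[
        "tower_id",
        "device_id",
        "event_type",
        "fault_code",
        {"type": "double", "name": "energy_kwh"},   # numeric reading column
    ],
)

spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Building the spec as a dict and serialising it once keeps the topic and broker list in one place, which helps when the same spec is reused across environments.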

outcome

  • Kafka installation and configuration
  • Consumption of data/events from the adapters and ingestion into Kafka brokers
  • Apache Druid installation, configuration and integration with Kafka brokers
  • Kafka supervisor spec developed for Druid
  • Querying and analysis of all real-time data/events for decision making
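As an illustration of the querying step, the sketch below builds a request body for Druid's SQL API (a JSON document posted to /druid/v2/sql). It is a hypothetical example rather than a query from the project: the datasource tower_energy_events and the columns tower_id and energy_kwh are assumed names. The query totals energy consumption per tower over a recent time window.

```python
import json

# Path of Druid's SQL API; the host and port depend on the deployment.
DRUID_SQL_PATH = "/druid/v2/sql"


def build_energy_query(datasource, hours=1):
    """Build a Druid SQL request body summing energy per tower.

    The column names (tower_id, energy_kwh) are assumptions made
    for this sketch.
    """
    sql = (
        "SELECT tower_id, SUM(energy_kwh) AS total_kwh, COUNT(*) AS events "
        f'FROM "{datasource}" '
        f"WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '{int(hours)}' HOUR "
        "GROUP BY tower_id ORDER BY total_kwh DESC"
    )
    # "object" asks Druid to return one JSON object per result row.
    return {"query": sql, "resultFormat": "object"}


payload = build_energy_query("tower_energy_events", hours=1)
body = json.dumps(payload)
print(body)
```

The resulting body can be sent with any HTTP client using a Content-Type of application/json. Filtering on __time lets Druid prune segments outside the window, which is a large part of what keeps such queries fast.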