Use Case: Building a Fraud Detection & Analysis System for Credit Card Transactions using Kafka-Flink-Druid

This is an in-house project that Irisidea executed to validate fraud detection with Apache Flink's DataStream API, with subsequent analysis in Apache Druid, over credit card transactions. The initial use case has been published on the official Apache Flink website as "Credit Card Fraud Detection Alert and Analysis using Kafka-Flink-Druid".

The goal of this use case was to build a fraud detection system that alerts on suspicious credit card transactions by collecting and analyzing real-time transaction events using the open-source data streaming technologies Apache Kafka, Apache Flink, and Apache Druid.

Challenge

  • Produce a real-time credit card transaction stream every second (configurable) using a third-party fake data generator API.

    Note: Irisidea developed a simulator to produce a real-time credit card transaction stream every second (configurable), supplying the data required for this project. This was necessary because no financial entity would share its real-time transactional data for testing stream processing in the Flink engine, owing to financial security concerns and government data privacy/protection laws. A minimal sketch of such a producer follows.
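
To make the simulator concrete, here is a minimal sketch of such a producer in Java. It is illustrative only: the topic name cc-transactions, the JSON field names, and the broker addresses are placeholder assumptions, not the project's actual values.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;
import java.util.Random;
import java.util.UUID;

/** Illustrative stand-in for the transaction simulator: emits one fake
 *  credit card transaction as JSON to a Kafka topic every second. */
public class TransactionSimulator {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Multi-broker bootstrap list; hosts and ports are placeholders.
        props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");

        Random rnd = new Random();
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            while (true) {
                String cardId = "card-" + rnd.nextInt(1000);
                // Hand-rolled JSON keeps the sketch dependency-free;
                // a real simulator would use a schema or a JSON library.
                String event = String.format(
                    "{\"txnId\":\"%s\",\"cardId\":\"%s\",\"amount\":%.2f,\"ts\":%d}",
                    UUID.randomUUID(), cardId,
                    rnd.nextDouble() * 2000, System.currentTimeMillis());
                producer.send(new ProducerRecord<>("cc-transactions", cardId, event));
                Thread.sleep(1000L); // the configurable emission interval
            }
        }
    }
}
```

Keying each record by card ID keeps all of a card's events in the same partition, which preserves per-card ordering for the downstream windowed processing in Flink.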

Solution

  • Integrated a multi-broker Apache Kafka cluster with the credit card transaction data simulator to publish a real-time event every second.
  • A multi-node Apache Kafka cluster was used to ingest the real-time data stream.
  • Developed code that reads JSON data from the brokers, applies simple window grouping for processing, and continuously publishes the results back to different topics across the multi-broker cluster.
  • Designed and developed a data pipeline that triggers an email alert on fraudulent transactions. The Flink engine executes its fraud detection business logic against the credit card transaction stream and filters out the fraudulent transactions, which ultimately land in a dedicated Kafka topic (a sketch of this job appears after this list).
  • Developed the Kafka supervisor specification, which is mandatory for streaming data consumption by Apache Druid (see the submission sketch below).
  • Configured Apache Druid against the Kafka brokers to continuously fetch the published events/data for analysis with Druid's SQL query engine.
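
A minimal sketch of the Flink leg follows, reusing the placeholder names above. The fraud rule here is deliberately simplified to a flat amount threshold; the actual business logic, like the official Flink fraud-detection walkthrough, evaluates per-card patterns over time windows.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

/** Reads transaction JSON from one Kafka topic, filters suspected fraud,
 *  and writes the fraud stream to a separate topic. */
public class FraudDetectionJob {
    private static final ObjectMapper MAPPER = new ObjectMapper();
    private static final String BROKERS = "broker1:9092,broker2:9092,broker3:9092";

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        KafkaSource<String> source = KafkaSource.<String>builder()
            .setBootstrapServers(BROKERS)
            .setTopics("cc-transactions")
            .setGroupId("fraud-detector")
            .setStartingOffsets(OffsetsInitializer.latest())
            .setValueOnlyDeserializer(new SimpleStringSchema())
            .build();

        KafkaSink<String> fraudSink = KafkaSink.<String>builder()
            .setBootstrapServers(BROKERS)
            .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                .setTopic("cc-fraud-alerts")
                .setValueSerializationSchema(new SimpleStringSchema())
                .build())
            .build();

        DataStream<String> txns =
            env.fromSource(source, WatermarkStrategy.noWatermarks(), "cc-transactions");

        // Placeholder rule: flag any transaction above a flat amount threshold.
        DataStream<String> fraud = txns.filter(
            json -> MAPPER.readTree(json).path("amount").asDouble() > 1500.0);

        fraud.sinkTo(fraudSink);
        env.execute("credit-card-fraud-detection");
    }
}
```

With the fraud stream isolated in its own topic (cc-fraud-alerts here, a placeholder), the email-alert pipeline can run as an independent consumer of that topic, decoupled from detection.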

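The supervisor specification itself is a JSON document submitted to the Druid Overlord's supervisor endpoint. The sketch below (Java 15+ for the text block) posts a pared-down Kafka ingestion spec; the datasource name, dimensions, and host addresses are assumptions consistent with the placeholders above.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Submits a minimal Kafka supervisor spec to the Druid Overlord. */
public class SubmitSupervisor {
    public static void main(String[] args) throws Exception {
        // Pared-down spec; all field values are placeholders.
        String spec = """
            { "type": "kafka",
              "spec": {
                "dataSchema": {
                  "dataSource": "cc_transactions",
                  "timestampSpec": { "column": "ts", "format": "millis" },
                  "dimensionsSpec": { "dimensions": ["txnId", "cardId", "amount"] }
                },
                "ioConfig": {
                  "topic": "cc-transactions",
                  "inputFormat": { "type": "json" },
                  "consumerProperties": { "bootstrap.servers": "broker1:9092" }
                }
              }
            }""";
        HttpRequest req = HttpRequest.newBuilder()
            .uri(URI.create("http://overlord:8081/druid/indexer/v1/supervisor"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(spec))
            .build();
        HttpResponse<String> resp = HttpClient.newHttpClient()
            .send(req, HttpResponse.BodyHandlers.ofString());
        System.out.println(resp.statusCode() + " " + resp.body());
    }
}
```
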
Outcome

  • Flink, Kafka, and Druid installation & configuration
  • Data/event consumption from the simulator and ingestion into the Kafka brokers
  • Integration of the multi-node Kafka brokers with Apache Flink
  • Developed Java code to identify fraudulent credit card transactions and write the fraudulent transaction stream back to a separate Kafka topic, splitting it from the normal flow over a specific time window
  • Executed a data pipeline to trigger email alerts for fraudulent transactions after consuming data from the Kafka topic
  • Developed the Kafka supervisor specification for Apache Druid
  • Queried and analyzed all real-time data/events for statistics on successful and fraudulent transactions (a sample query follows)
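
As a sample of the analysis step, Druid's SQL endpoint on the Broker accepts queries over HTTP. The sketch below counts transactions per hour in the placeholder datasource; running the same query against a datasource fed from the fraud topic gives the fraud-versus-successful comparison.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Sends one Druid SQL query over HTTP and prints the JSON result rows. */
public class FraudStatsQuery {
    public static void main(String[] args) throws Exception {
        // Hourly transaction counts; datasource and host are placeholders.
        String body = "{\"query\": \"SELECT TIME_FLOOR(__time, 'PT1H') AS hr, "
                    + "COUNT(*) AS txns FROM cc_transactions GROUP BY 1 ORDER BY 1\"}";
        HttpRequest req = HttpRequest.newBuilder()
            .uri(URI.create("http://druid-broker:8082/druid/v2/sql"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
        HttpResponse<String> resp = HttpClient.newHttpClient()
            .send(req, HttpResponse.BodyHandlers.ofString());
        System.out.println(resp.body()); // JSON array of result rows
    }
}
```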