Druid is a database built for data in motion
Sub-second queries at any scale
Execute OLAP queries on high-dimensional, high-cardinality data sets with billions to trillions of rows in milliseconds without pre-defining or caching queries.
Maximum concurrency at the lowest cost
Build real-time analytics apps that deliver consistent performance at 100 to 100,000 queries per second, using a highly efficient architecture that requires less infrastructure than other databases.
Real-time and historical insights
Druid’s native integration with Apache Kafka and Amazon Kinesis lets you exploit the full potential of streaming data: it supports query-on-arrival at millions of events per second, low-latency ingestion, and guaranteed consistency.
Druid is a high-performance, real-time analytics database that processes queries on streaming and batch data in less than a second, at scale and under heavy load.
Real-time ingestion by Druid is as scalable as Kafka!
No connectors needed
Event-triggered ingestion
Real-time streaming with Druid
Interactive Query Engine
Druid uses scatter/gather for high-speed queries, preloading data into RAM or local storage to avoid data movement and network latency.
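The scatter/gather pattern can be illustrated with a minimal Python sketch. It is not Druid's implementation: the in-memory `SEGMENTS` list, the `scan_segment` helper, and the thread pool are all hypothetical stand-ins for data nodes that already hold their segments locally. Each worker computes a partial aggregate over its own segment (scatter), and the partial results are then merged (gather).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical segments: each list stands in for data preloaded on one node.
SEGMENTS = [
    [{"page": "/home", "views": 3}, {"page": "/docs", "views": 1}],
    [{"page": "/home", "views": 2}, {"page": "/blog", "views": 5}],
    [{"page": "/docs", "views": 4}],
]


def scan_segment(rows):
    """Scatter step: compute a partial sum of views per page for one segment."""
    partial = Counter()
    for row in rows:
        partial[row["page"]] += row["views"]
    return partial


def scatter_gather(segments):
    """Fan the query out to every segment, then merge the partial results."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(scan_segment, segments))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Gather step: add partial sums key by key.
    return dict(merged)


totals = scatter_gather(SEGMENTS)
```

Only the small partial aggregates travel back to the merging step; the raw rows never leave the worker that holds them, which is the point of the pattern.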
Tiering & QoS
Configurable tiering with Quality of Service enables an optimal price-performance ratio for mixed workloads, guarantees priority, and avoids resource conflicts.
Optimized Data Format
Imported data is automatically columnarized, time-indexed, dictionary-encoded, bitmap-indexed, and type-compressed.
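To make two of these terms concrete, here is a small Python sketch of dictionary encoding and bitmap indexing on a single string column. It is a simplified illustration, not Druid's storage format: plain Python integers stand in for compressed bitmaps, and the helper names are hypothetical.

```python
def dictionary_encode(values):
    """Replace each string with a small integer ID from a sorted dictionary."""
    dictionary = sorted(set(values))
    ids = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [ids[v] for v in values]


def build_bitmaps(encoded, dictionary_size):
    """Build one bitmap per ID; bit r is set when row r holds that ID."""
    bitmaps = [0] * dictionary_size
    for row, value_id in enumerate(encoded):
        bitmaps[value_id] |= 1 << row
    return bitmaps


def rows_in(bitmap):
    """List the row numbers whose bits are set in the bitmap."""
    rows, row = [], 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


country = ["US", "DE", "US", "FR", "DE", "US"]
dictionary, encoded = dictionary_encode(country)
bitmaps = build_bitmaps(encoded, len(dictionary))

# A filter such as country = 'US' OR country = 'FR' becomes a bitwise OR.
us_or_fr = bitmaps[dictionary.index("US")] | bitmaps[dictionary.index("FR")]
matching_rows = rows_in(us_or_fr)
```

The filter is answered from the index alone, without scanning the column values, which is why this layout suits high-cardinality filtering and counting.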
Loosely coupled ingestion, query, and orchestration components, combined with a deep storage layer, enable easy and fast scale-up and scale-out.
True Stream Ingestion
A connector-free integration with streaming platforms enables query-on-arrival, high scalability, low latency, and guaranteed consistency.
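As a rough illustration of what connector-free ingestion looks like in practice, the Python sketch below assembles a Kafka supervisor spec as a plain dictionary. The field names follow the general shape of Druid's Kafka supervisor spec as I understand it and should be checked against the documentation for your Druid version; the datasource, topic, broker address, and column names are made-up placeholders, and the helper `kafka_supervisor_spec` is hypothetical.

```python
import json


def kafka_supervisor_spec(datasource, topic, bootstrap_servers, dimensions):
    """Assemble a supervisor spec dict (assumed shape; verify against the docs)."""
    return {
        "type": "kafka",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # Assumes each event carries an ISO-8601 "timestamp" field.
                "timestampSpec": {"column": "timestamp", "format": "iso"},
                "dimensionsSpec": {"dimensions": dimensions},
                "granularitySpec": {
                    "segmentGranularity": "hour",
                    "queryGranularity": "none",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "topic": topic,
                "inputFormat": {"type": "json"},
                "consumerProperties": {"bootstrap.servers": bootstrap_servers},
                "useEarliestOffset": True,
            },
            "tuningConfig": {"type": "kafka"},
        },
    }


# Placeholder values for illustration only.
spec = kafka_supervisor_spec(
    datasource="clickstream",
    topic="clicks",
    bootstrap_servers="localhost:9092",
    dimensions=["page", "user_id", "country"],
)
payload = json.dumps(spec, indent=2)
```

The resulting JSON would typically be submitted to the supervisor endpoint of the Druid cluster; the sketch stops at building the payload so that it runs without a cluster.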
Automated data services, including continuous backup, automatic recovery, and multi-node replication, ensure high availability and durability.
When do you need Druid?
- Insertion rates are very high, but updates are less frequent.
- Most of your queries are aggregation and reporting queries. For example, “group by” queries. You may also have search and scan queries.
- You aim for query latencies of 100 ms to a few seconds.
- Your data has a time component. Druid includes optimizations and design decisions specific to time.
- You may have more than one table, but each query hits only one large distributed table. Queries can potentially access more than one smaller “lookup” table.
- You have columns of data with high cardinality, e.g., URLs, user IDs, and you need fast counting and ranking across those columns.
- You want to load data from Kafka, HDFS, flat files, or object stores such as Amazon S3.
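A typical query matching the criteria above is a time-bucketed "group by" with ranking over a high-cardinality column. The Python sketch below builds such a query as a request body for Druid's SQL HTTP API. The datasource name `events`, the `page` column, and the helper `top_pages_query` are hypothetical, and the sketch only constructs the payload rather than contacting a cluster.

```python
import json


def top_pages_query(datasource, hours, limit):
    """Build a SQL request body: hourly event counts per page, ranked by count."""
    sql = (
        "SELECT TIME_FLOOR(__time, 'PT1H') AS hour_bucket, "
        "page, COUNT(*) AS events "
        f'FROM "{datasource}" '
        f"WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '{int(hours)}' HOUR "
        "GROUP BY 1, 2 "
        "ORDER BY events DESC "
        f"LIMIT {int(limit)}"
    )
    return {"query": sql}


# Hypothetical datasource; in a real setup this body would be POSTed as JSON.
body = top_pages_query("events", hours=24, limit=10)
request_json = json.dumps(body)
```

The query touches one large table, filters on the time column, and aggregates, which is the access pattern the list describes.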