Amazon Kinesis: What It Is and When to Use It
Definition
Amazon Kinesis is a platform of four managed services for collecting, processing, and analyzing real-time streaming data at any scale. Whether you need to ingest millions of clickstream events per second, stream video from IoT cameras, or run continuous SQL analytics on log data, Kinesis provides a purpose-built service for each layer of the streaming pipeline. The four services are Kinesis Data Streams, Amazon Data Firehose (formerly Kinesis Data Firehose), Amazon Kinesis Video Streams, and Amazon Managed Service for Apache Flink (formerly Kinesis Data Analytics).
How It Works
Each Kinesis service addresses a different part of the streaming architecture:
- Kinesis Data Streams (KDS) — a durable, ordered, real-time data streaming service. Producers (applications, IoT devices, agents) put records into a stream. The stream is divided into shards, each providing 1 MB/s ingress and 2 MB/s egress. Consumers (Lambda, KCL applications, Managed Flink) read records in order within each shard. Data is retained for 24 hours by default, extendable up to 365 days. On-demand mode automatically manages shard count based on throughput, eliminating capacity planning.
- Amazon Data Firehose — a fully managed delivery service that captures streaming data and loads it into destinations like S3, Redshift, Amazon OpenSearch Service, Splunk, HTTP endpoints, and third-party SaaS (Datadog, MongoDB, etc.). Firehose handles batching, compression, encryption, and optional transformation via Lambda — with zero administration. It can receive data directly from producers or from a Kinesis Data Stream.
- Kinesis Video Streams — ingests and stores video, audio, and time-serialized data from cameras, depth sensors, and RADAR devices. Provides APIs for playback, HLS streaming, and integration with Amazon Rekognition Video for ML analysis.
- Amazon Managed Service for Apache Flink — runs Apache Flink applications for real-time stream processing. Supports both SQL and Java/Python Flink applications. Reads from KDS, MSK (Kafka), or S3 and writes to KDS, Firehose, S3, DynamoDB, and more.
A typical architecture: IoT devices push telemetry into KDS, a Managed Flink application performs windowed aggregations and anomaly detection in real-time, and Firehose delivers raw and enriched data to S3 for batch analytics in Athena.
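To make the shard model above more concrete, here is a minimal sketch of how a partition key can be mapped to a shard. Kinesis hashes each partition key with MD5 into a 128-bit integer and assigns the record to the shard whose hash key range contains that value. The function names (`hash_key`, `shard_ranges`, `shard_for_key`) and the evenly split ranges are assumptions made for illustration; a real stream's ranges change after shards are split or merged.

```python
import hashlib
from typing import List, Tuple

HASH_SPACE = 2 ** 128  # an MD5 digest is a 128-bit integer


def hash_key(partition_key: str) -> int:
    """Return the 128-bit integer MD5 hash of a partition key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_ranges(shard_count: int) -> List[Tuple[int, int]]:
    """Split the hash space into contiguous inclusive ranges, one per shard."""
    size = HASH_SPACE // shard_count
    ranges = []
    for i in range(shard_count):
        start = i * size
        # The last shard absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == shard_count - 1 else start + size - 1
        ranges.append((start, end))
    return ranges


def shard_for_key(partition_key: str, ranges: List[Tuple[int, int]]) -> int:
    """Return the index of the shard whose range contains the key's hash."""
    h = hash_key(partition_key)
    for index, (start, end) in enumerate(ranges):
        if start <= h <= end:
            return index
    raise ValueError("hash value outside all shard ranges")


if __name__ == "__main__":
    ranges = shard_ranges(4)
    for key in ("device-1", "device-2", "device-1"):
        print(key, "-> shard", shard_for_key(key, ranges))
```

Because the mapping depends only on the key, every record for a given device lands on the same shard, which is what makes per-key ordering possible. A few very hot keys can still overload a single shard.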
Key Features and Limits
- Ordering — KDS preserves order within a shard. Records that share a partition key are routed to the same shard, so consumers read them in the order they were written; there is no ordering guarantee across shards.
- Enhanced fan-out — dedicated 2 MB/s throughput per consumer per shard using HTTP/2 push, reducing latency to ~70 ms.
- On-demand vs provisioned — on-demand mode auto-scales write throughput up to 200 MB/s by default (a quota that can be raised on request); provisioned mode lets you control the exact shard count.
- Server-side encryption — KMS encryption at rest for KDS and Firehose.
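- Firehose buffering — configurable buffer size (1-128 MB) and buffer interval (up to 900 seconds); data is delivered when either threshold is met. The long-standing 60-second minimum interval no longer applies where zero buffering is available, so check the current range for your destination.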
- Firehose dynamic partitioning — automatically partition delivered data in S3 by keys in the record (e.g., customer_id), enabling efficient downstream queries.
- Lambda integration — KDS can invoke Lambda as a stream consumer, and Firehose can call Lambda to transform records before delivery.
- KDS retention — 24 hours default, up to 365 days with extended retention.
- Shard limits — default 500 shards per account per Region (adjustable). Each shard: 1 MB/s in, 2 MB/s out, 1,000 records/s in.
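The Firehose buffering rule in the list above (deliver when either the size or the time threshold is met) can be pictured with a small model. This is a simplified sketch, not Firehose's actual implementation: the `DeliveryBuffer` class and its method names are hypothetical, and the caller passes the current time explicitly so the behavior is easy to follow.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DeliveryBuffer:
    """Hypothetical size-or-time buffer; illustrative only."""
    max_bytes: int
    max_seconds: float
    records: List[bytes] = field(default_factory=list)
    size: int = 0
    opened_at: Optional[float] = None  # arrival time of the first buffered record

    def add(self, record: bytes, now: float) -> Optional[List[bytes]]:
        """Buffer a record and return a batch if a threshold has been reached."""
        if self.opened_at is None:
            self.opened_at = now
        self.records.append(record)
        self.size += len(record)
        return self.flush_if_due(now)

    def flush_if_due(self, now: float) -> Optional[List[bytes]]:
        """Return and clear the batch when the size or the age limit is met."""
        if not self.records:
            return None
        too_big = self.size >= self.max_bytes
        too_old = now - self.opened_at >= self.max_seconds
        if too_big or too_old:
            batch = self.records
            self.records, self.size, self.opened_at = [], 0, None
            return batch
        return None
```

In this model a busy stream flushes on size and a quiet stream flushes on age. That is why a larger buffer produces fewer, bigger objects in S3 at the cost of higher delivery latency.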
Common Use Cases
- Clickstream analytics — capture user interactions in real-time for personalization, A/B testing, and funnel analysis.
- IoT telemetry — ingest sensor data from thousands of devices, process with Flink, archive to S3.
- Log and event aggregation — stream application logs via Firehose to S3 or OpenSearch for search and alerting.
- Real-time dashboards — aggregate metrics with Managed Flink and push to DynamoDB or ElastiCache for sub-second dashboard updates.
- Video analytics — stream camera feeds through Video Streams for Rekognition-powered object detection.
- Fraud detection — analyze transaction streams with Flink to flag suspicious patterns within seconds.
- Data lake ingestion — Firehose as the ingestion pipe into an S3-based data lake in Parquet format.
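For the log aggregation and data lake use cases above, Firehose can call a Lambda function to transform records before delivery. The sketch below follows the transformation contract, in which each incoming record carries a `recordId` and base64-encoded `data`, and each returned record reports a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The payload fields `level` and `processed` and the rule of dropping DEBUG lines are assumptions chosen for illustration.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform a batch of Firehose records: drop DEBUG logs, tag the rest."""
    output = []
    for record in event["records"]:
        record_id = record["recordId"]
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Undecodable input is returned unchanged and marked as failed.
            output.append({"recordId": record_id,
                           "result": "ProcessingFailed",
                           "data": record["data"]})
            continue

        if payload.get("level") == "DEBUG":  # hypothetical filter rule
            output.append({"recordId": record_id,
                           "result": "Dropped",
                           "data": record["data"]})
            continue

        payload["processed"] = True  # hypothetical enrichment field
        # A trailing newline keeps delivered objects line-delimited.
        data = (json.dumps(payload) + "\n").encode("utf-8")
        output.append({"recordId": record_id,
                       "result": "Ok",
                       "data": base64.b64encode(data).decode("utf-8")})
    return {"records": output}
```

Every input `recordId` appears exactly once in the response, which is how Firehose matches transformed records back to the originals.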
Pricing Model
Each Kinesis service has its own pricing model:
- KDS Provisioned — per shard-hour (~$0.015/shard-hour) plus per PUT payload unit ($0.014 per million). Extended retention and enhanced fan-out add surcharges.
- KDS On-Demand — per stream-hour plus per GB ingested (~$0.08/GB). Simpler but pricier at sustained high throughput.
- Amazon Data Firehose — per GB ingested (~$0.029/GB for S3 in us-east-1). Format conversion and dynamic partitioning add surcharges.
- Video Streams — per GB ingested plus per GB consumed/stored.
- Managed Flink — per KPU-hour (1 vCPU, 4 GB memory). Application storage charged separately.
- Free Tier — none for KDS or Firehose.
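As a rough illustration of how the two KDS pricing models compare, the sketch below estimates a shard count and a monthly bill from the approximate figures quoted above. Several inputs are assumptions: a 730-hour month, per-shard limits of 1 MB/s and 1,000 records/s in and 2 MB/s out, one PUT payload unit per 25 KB of record size, and a placeholder on-demand stream-hour rate (`stream_hour`) that this article does not state. Data retrieval, extended retention, and enhanced fan-out charges are left out, so treat the output as an order-of-magnitude comparison only.

```python
import math

HOURS_PER_MONTH = 730  # assumption: average month length


def shards_needed(ingress_mb_s: float, records_s: float, egress_mb_s: float) -> int:
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,
        math.ceil(ingress_mb_s / 1.0),   # 1 MB/s write per shard
        math.ceil(records_s / 1000),     # 1,000 records/s write per shard
        math.ceil(egress_mb_s / 2.0),    # 2 MB/s shared read per shard
    )


def provisioned_monthly(shards: int, records_s: float, avg_record_kb: float,
                        shard_hour: float = 0.015,
                        per_million_units: float = 0.014) -> float:
    """Estimated monthly cost in USD for provisioned mode."""
    units_per_record = math.ceil(avg_record_kb / 25)  # 25 KB per payload unit
    units = records_s * units_per_record * 3600 * HOURS_PER_MONTH
    return shards * shard_hour * HOURS_PER_MONTH + units / 1e6 * per_million_units


def on_demand_monthly(ingress_mb_s: float, per_gb: float = 0.08,
                      stream_hour: float = 0.04) -> float:
    """Estimated monthly cost in USD for on-demand mode (ingest only)."""
    gb_ingested = ingress_mb_s * 3600 * HOURS_PER_MONTH / 1024
    return stream_hour * HOURS_PER_MONTH + gb_ingested * per_gb


if __name__ == "__main__":
    n = shards_needed(ingress_mb_s=5, records_s=3000, egress_mb_s=4)
    print("shards:", n)
    print("provisioned:", round(provisioned_monthly(n, 3000, 1), 2))
    print("on-demand:", round(on_demand_monthly(5), 2))
```

Under these assumptions a steady workload comes out cheaper in provisioned mode. The calculation deliberately ignores the over-provisioning that bursty traffic would require in that mode.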
Pros and Cons
Pros
- Purpose-built services for every layer of the streaming stack — ingest, deliver, process, and store.
- KDS provides durable, ordered, replayable streams — consumers can reprocess data.
- Firehose is truly zero-admin: no capacity planning, no consumer code, automatic delivery.
- On-demand mode eliminates shard management for unpredictable workloads.
- Deep AWS integration: Lambda triggers, S3, Redshift, OpenSearch, CloudWatch.
Cons
- KDS provisioned mode requires shard management and split/merge operations.
- Per-shard pricing can become expensive at very high throughput compared to self-managed Kafka.
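- Firehose is a buffered, near-real-time service. The traditional minimum buffer interval is 60 seconds, and even with a lower interval delivery takes seconds, so it is not suitable for sub-second delivery.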
- Managed Flink has a learning curve; Flink SQL helps but complex stateful processing requires Java/Python expertise.
- No native cross-Region replication for KDS (you must build it with Lambda or a consumer application).
Comparison with Alternatives
| Aspect | Kinesis Data Streams | Amazon MSK (Kafka) | SQS |
| --- | --- | --- | --- |
| Model | Managed sharded stream | Managed Kafka brokers | Managed message queue |
| Ordering | Per shard (partition key) | Per partition | FIFO queues only |
| Retention | Up to 365 days | Unlimited (disk-based) | Up to 14 days |
| Replay | Yes (by sequence number or timestamp) | Yes (consumer offset) | No (message deleted after processing) |
| Throughput | 1 MB/s write per shard (or on-demand) | Scales with broker count | Nearly unlimited |
| Best for | Real-time streaming + AWS integration | Kafka ecosystem compatibility | Decoupled microservices |
Exam Relevance
- Cloud Practitioner (CLF-C02) — know Kinesis is for real-time data streaming; Firehose delivers data to S3/Redshift.
- Solutions Architect Associate (SAA-C03) — KDS vs SQS vs SNS, Firehose for log delivery to S3, on-demand vs provisioned, Lambda as consumer.
- Developer Associate (DVA-C02) — KCL (Kinesis Client Library) concepts, partition keys and ordering, enhanced fan-out, Lambda event source mapping for KDS.
- Data Engineer Associate (DEA-C01) — deep coverage: KDS + Flink for real-time analytics, Firehose dynamic partitioning, Firehose to Iceberg tables, KDS vs MSK decision criteria.
- Solutions Architect Professional (SAP-C02) — cross-Region streaming architectures, KDS capacity planning, Firehose + Lambda for ETL-in-flight.
Frequently Asked Questions
Q: What is the difference between Kinesis Data Streams and Firehose?
A: Kinesis Data Streams (KDS) is a durable, ordered stream where you control consumers, retention (up to 365 days), and processing logic — you write the consumer code (or use Lambda/Flink). Firehose is a zero-admin delivery pipe that automatically batches, compresses, and loads data into destinations like S3, Redshift, and OpenSearch — no consumer code required. Use KDS when you need custom processing, replay, or multiple consumers; use Firehose when you just need reliable delivery to a supported destination.
Q: When should I choose Kinesis over SQS?
A: Choose Kinesis when you need ordered, replayable, real-time streaming with multiple consumers reading the same data (fan-out). Choose SQS when you need a simple message queue for decoupling microservices where each message is processed once and ordering is less critical (or use SQS FIFO for strict ordering at lower throughput). Kinesis is for streaming analytics; SQS is for task distribution.
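The difference in consumption semantics described in this answer can be sketched with two toy classes. This is a deliberately simplified model, not either service's API: `Stream` and `Queue` are hypothetical names, and shards, retention expiry, and SQS visibility timeouts are all ignored.

```python
from collections import deque


class Stream:
    """Append-only log: records stay put and each consumer tracks its own position."""

    def __init__(self):
        self._records = []

    def put(self, record):
        self._records.append(record)

    def read(self, position, limit=10):
        """Return up to `limit` records from `position` and the next position."""
        batch = self._records[position:position + limit]
        return batch, position + len(batch)


class Queue:
    """Work queue: a message is removed once a consumer has processed it."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive_and_delete(self):
        """Return the oldest message, or None when the queue is empty."""
        return self._messages.popleft() if self._messages else None


if __name__ == "__main__":
    stream = Stream()
    for event in ("e1", "e2", "e3"):
        stream.put(event)
    analytics, _ = stream.read(0)   # first consumer reads everything
    archiver, _ = stream.read(0)    # second consumer reads the same data
    print(analytics == archiver)

    queue = Queue()
    queue.send("job-1")
    print(queue.receive_and_delete(), queue.receive_and_delete())
```

Reading the stream does not change it, so any number of consumers can process or replay the same records. Taking a message from the queue removes it for everyone else.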
Q: How do I choose between KDS provisioned and on-demand mode?
A: Use on-demand mode when your traffic is unpredictable or bursty, or when you want no capacity management: it auto-scales up to 200 MB/s of write throughput by default and you pay per GB ingested. Use provisioned mode when you have stable, predictable throughput and want to optimize cost, because per-shard-hour pricing is typically cheaper at sustained high volume. You can switch between modes twice per 24-hour period.
This article reflects AWS features and pricing as of 2026. AWS services evolve rapidly — always verify against the official Amazon Kinesis documentation before making production decisions.