Amazon ElastiCache: What It Is and When to Use It
Definition
Amazon ElastiCache is a fully managed in-memory data store and cache service that runs Redis OSS, Valkey, and Memcached engines on AWS. It delivers sub-millisecond latency for read-heavy and compute-intensive workloads by keeping hot data in RAM instead of on disk. AWS handles provisioning, patching, monitoring, scaling, replication, and automatic failover — you pick a node type and a cluster topology, and ElastiCache takes care of the operational plumbing that would otherwise be required to run a production cache fleet.
How It Works
ElastiCache deploys cache nodes — EC2 instances running the chosen engine — inside your VPC. Nodes are grouped into clusters, and clusters can be organized into one of three topologies depending on the engine and scaling strategy:
- Memcached cluster — one or more independent nodes; the client library hashes keys across them to choose a node. No replication.
- Redis/Valkey cluster-mode-disabled — one primary and up to five read replicas in a single shard. Supports Multi-AZ with automatic failover.
- Redis/Valkey cluster-mode-enabled — data is sharded across up to 500 shards, each with a primary and up to five replicas. Scales horizontally to hundreds of terabytes of RAM.
Clients connect through the primary endpoint (writes), reader endpoint (read replicas in cluster-mode-disabled), or the configuration endpoint (cluster-mode-enabled, which resolves to all shards). ElastiCache integrates with VPC security groups, KMS for encryption at rest, TLS for in-transit encryption, Redis AUTH / Valkey AUTH or IAM authentication, and CloudWatch for metrics like CPUUtilization, CacheHits, CacheMisses, and Evictions.
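The CacheHits/CacheMisses metrics are typically combined into a hit rate when judging cache effectiveness. A minimal sketch in plain Python — the metric values are placeholders standing in for sums you would read from CloudWatch, not a real API call:

```python
def cache_hit_rate(hits: float, misses: float) -> float:
    """Hit rate over a period from CloudWatch CacheHits/CacheMisses sums.

    Returns 0.0 for an idle period so callers don't divide by zero.
    """
    total = hits + misses
    return hits / total if total else 0.0

# Hypothetical values, as if read from CloudWatch for a 5-minute period:
rate = cache_hit_rate(hits=9_500, misses=500)  # 0.95
```

A sustained drop in this ratio, especially alongside rising Evictions, usually means the working set no longer fits in memory.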
For Redis and Valkey, backups to S3 and point-in-time snapshots are supported; Memcached is ephemeral and has no native persistence.
Key Features and Limits
Engine choices
- Valkey — the open-source, Linux Foundation-backed fork of Redis created after Redis (formerly Redis Labs) changed its license. ElastiCache offers Valkey at roughly 33% lower cost than equivalent Redis OSS nodes, and AWS recommends it as the default engine for new workloads.
- Redis OSS — classic BSD-licensed Redis versions; rich data structures (strings, hashes, lists, sets, sorted sets, streams, HyperLogLog, Geo).
- Memcached — simple, multi-threaded key-value cache. No replication, no persistence, no pub/sub.
High availability
- Multi-AZ with automatic failover — Redis/Valkey clusters place replicas in different AZs; if the primary fails, a replica is promoted within roughly 15–30 seconds, and the DNS endpoint is updated automatically.
- Up to 5 read replicas per shard for read scaling and HA.
- Memcached has no built-in replication; use multiple nodes with consistent hashing on the client.
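Client-side consistent hashing, as recommended for Memcached above, can be sketched as a hash ring: each node gets many virtual points, so adding or removing a node moves only a fraction of the keys. This is a simplified stdlib illustration with hypothetical node names — production clients such as pymemcache implement this for you:

```python
import bisect
import hashlib

class HashRing:
    """Consistent-hash ring: each node contributes many virtual points,
    so keys redistribute minimally when nodes join or leave."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first virtual point at or after the key's hash.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]

# Hypothetical node names for illustration:
ring = HashRing(["cache-node-1", "cache-node-2", "cache-node-3"])
owner = ring.node_for("session:42")  # deterministic for a fixed ring
```

With naive modulo hashing, adding a node would remap nearly every key; with the ring, most keys keep their owner, which is why clients prefer it for cache fleets.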
ElastiCache Serverless
ElastiCache Serverless automatically scales storage and compute in response to workload, bills per GB-hour of data stored and per ElastiCache Processing Unit (ECPU) consumed, provides 99.99% availability across three AZs, and reaches hundreds of microseconds of p50 latency. It's available for Valkey, Redis OSS, and Memcached.
Limits
- Redis/Valkey: up to 500 shards per cluster, up to 340 nodes per cluster, up to 635 GB of memory per cache.r7g.16xlarge node.
- Memcached: up to 300 nodes per cluster (soft limit).
- Item size: 512 MB in Redis/Valkey strings; 1 MB default in Memcached.
Common Use Cases
- Database query caching — cache hot rows from RDS/Aurora/DynamoDB to reduce DB load and latency.
- Session stores — centralized HTTP or game session state shared across stateless application servers.
- Leaderboards and ranking — Redis sorted sets (ZADD/ZRANGE) are the canonical leaderboard primitive.
- Rate limiting and token buckets — atomic INCR/EXPIRE for per-user request throttling.
- Real-time analytics — counters, HyperLogLog cardinality estimation, sliding windows.
- Pub/sub and streams — Redis Streams for lightweight event distribution between microservices.
- Full-page and object caching — CMS and SaaS apps front databases with ElastiCache to cut p99 latency.
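The database-query-caching use case above is usually implemented as cache-aside (lazy loading): read the cache first, fall back to the database on a miss, then populate the cache with a TTL. A self-contained sketch using a dict-backed stand-in for the cache client — against a real cluster you would issue the equivalent get/setex calls through your Redis/Valkey client:

```python
import json
import time

class FakeCache:
    """Dict-backed stand-in for an ElastiCache client (get/set with TTL)."""
    def __init__(self):
        self._data = {}

    def get(self, key):
        value, expires_at = self._data.get(key, (None, 0.0))
        return value if time.monotonic() < expires_at else None

    def setex(self, key, ttl, value):
        self._data[key] = (value, time.monotonic() + ttl)

def get_user(cache, user_id, load_from_db, ttl=300):
    """Cache-aside: try the cache, fall back to the DB, then populate."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)            # cache hit
    row = load_from_db(user_id)              # cache miss: read source of truth
    cache.setex(key, ttl, json.dumps(row))   # TTL bounds how stale data can get
    return row
```

The TTL is the main staleness knob: shorter TTLs mean fresher data but more database load, which is the trade-off the "cache invalidation remains an application concern" caveat below refers to.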
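The rate-limiting bullet maps to a fixed-window counter: INCR a per-user key and set an EXPIRE when the window opens. A sketch of that logic in plain Python, mirroring the Redis semantics in-process — against a real cluster you would issue INCR/EXPIRE, ideally pipelined or in a Lua script so the pair is atomic:

```python
import time

class FixedWindowLimiter:
    """Fixed-window rate limiter mirroring Redis INCR + EXPIRE semantics."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window = window_seconds
        self._counters = {}  # key -> (count, window_expiry)

    def allow(self, user_id: str, now=None) -> bool:
        now = time.monotonic() if now is None else now
        count, expires_at = self._counters.get(user_id, (0, 0.0))
        if now >= expires_at:              # window elapsed: key "expired"
            count, expires_at = 0, now + self.window
        count += 1                         # INCR
        self._counters[user_id] = (count, expires_at)
        return count <= self.limit
```

Fixed windows allow short bursts at window boundaries; sliding-window or token-bucket variants smooth that out at the cost of slightly more state per key.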
Pricing Model
- Node-hour pricing — each cache node bills per hour based on node type (e.g., cache.t4g.micro, cache.r7g.large). Replicas are billed the same as primaries.
- Data transfer — free between nodes in the same AZ of the same cluster; standard AWS rates apply cross-AZ and out of the Region.
- Backup storage — free up to the size of your cluster; beyond that, per GB-month.
- Reserved Nodes — 1-year or 3-year commitments cut on-demand prices by up to ~55%.
- ElastiCache Serverless — per GB-hour of stored data plus per-million ECPU consumed, with no node management.
The free tier covers 750 hours/month of cache.t2.micro or cache.t3.micro for 12 months.
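A back-of-the-envelope monthly estimate for the two Serverless billing dimensions can be sketched as below. The rates used are placeholder assumptions purely for illustration, not real prices — check the ElastiCache pricing page for your Region:

```python
def serverless_monthly_cost(avg_gb_stored, ecpus_per_month,
                            rate_per_gb_hour, rate_per_million_ecpu,
                            hours=730):
    """Serverless bills on two dimensions: GB-hours of data stored
    plus ECPUs consumed (priced per million)."""
    storage = avg_gb_stored * hours * rate_per_gb_hour
    compute = (ecpus_per_month / 1_000_000) * rate_per_million_ecpu
    return storage + compute

# Hypothetical rates and usage, for illustration only:
estimate = serverless_monthly_cost(
    avg_gb_stored=10,
    ecpus_per_month=500_000_000,
    rate_per_gb_hour=0.125,
    rate_per_million_ecpu=0.0034,
)
```

Note that storage typically dominates for cache-heavy, low-traffic workloads, while ECPUs dominate for small-but-hot datasets — which is worth modeling before choosing Serverless over fixed nodes.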
Pros and Cons
Pros
- Sub-millisecond latency; much faster than disk-based databases.
- Fully managed Multi-AZ, replication, failover, backups, and patching.
- Valkey offers ~33% cost savings vs equivalent Redis OSS.
- Serverless option eliminates capacity planning for spiky workloads.
- Deep VPC, KMS, IAM, and CloudWatch integration.
Cons
- Data in RAM is expensive relative to disk.
- Memcached has no replication, persistence, or Multi-AZ.
- Node-type-hour pricing means idle clusters still cost money (unless using Serverless).
- Cross-AZ replica reads incur inter-AZ data transfer charges.
- Cache invalidation remains an application concern — ElastiCache doesn't solve stale-data problems for you.
Comparison with Alternatives
| | ElastiCache | DynamoDB + DAX | MemoryDB for Redis | Self-managed Redis on EC2 |
| --- | --- | --- | --- | --- |
| Durability | Snapshots (Redis/Valkey only) | Durable backing store + cache | Multi-AZ durable log | You build it |
| Latency | Sub-ms | µs (DAX) + ms (DDB) | Single-digit ms writes, µs reads | Sub-ms |
| Use as primary DB | No (ephemeral cache) | Yes | Yes | Possibly |
| Engines | Valkey / Redis / Memcached | DynamoDB API | Redis-compatible | Any |
| Pricing | Node-hour or Serverless | Per-request + DAX node-hour | Node-hour + write durability | EC2 + ops |
Pick ElastiCache when you need a fast cache in front of another database. Pick MemoryDB when you need Redis semantics as the database of record with multi-AZ durability. Pick DAX when the underlying store is DynamoDB and you want a DynamoDB-native in-cluster cache.
Exam Relevance
- Solutions Architect Associate (SAA-C03) — cache aside vs write-through patterns, Redis Multi-AZ automatic failover, Memcached for horizontal scaling, ElastiCache vs DynamoDB DAX.
- Developer Associate (DVA-C02) — lazy loading, write-through, TTL, cache stampede mitigation, Redis data structures.
- Database Specialty (DBS-C01) — choosing between ElastiCache, MemoryDB, and DAX; Multi-AZ vs cluster-mode; backup/restore; migration from self-managed Redis; Valkey cost positioning.
Classic exam trap: Memcached for simple multi-threaded key-value caching with horizontal scaling and no replication; Redis/Valkey when you need replication, Multi-AZ failover, persistence, pub/sub, or rich data structures.
Frequently Asked Questions
Q: Should I pick Redis, Valkey, or Memcached?
A: Start with Valkey for new workloads — it's API-compatible with Redis OSS, supports the same data structures, replication, Multi-AZ, and clustering, and costs roughly 33% less on ElastiCache. Pick Redis OSS if you need a specific Redis version or feature set you already depend on. Pick Memcached only when you want a pure, simple, multi-threaded key-value cache with horizontal scaling across nodes and do not need replication, persistence, or advanced data structures.
Q: What's the difference between ElastiCache and MemoryDB for Redis?
A: ElastiCache is designed as a cache in front of a durable database; Redis/Valkey snapshots provide crash recovery but the cache is treated as rebuildable. MemoryDB for Redis is designed as a primary database — writes are durably committed to a Multi-AZ transaction log before acknowledgment, giving 99.99% availability and no data loss on failover. MemoryDB costs more per node but eliminates the need for a separate durable store when Redis semantics are sufficient.
Q: How does Multi-AZ automatic failover work in ElastiCache Redis/Valkey?
A: You place the primary and at least one replica in different AZs. ElastiCache continuously monitors the primary; if it becomes unhealthy, the service promotes a replica to primary, updates the DNS endpoint, and begins replicating from the new primary to the remaining replicas. End-to-end failover typically completes in 15–30 seconds. Clients using the primary endpoint pick up the new writer on their next DNS lookup, so long-lived connections should be configured with retry and reconnect logic.
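The retry-and-reconnect advice can be as simple as jittered exponential backoff around cache commands, so clients ride out the 15–30 second failover window. A library-agnostic sketch — redis-py ships its own Retry/ExponentialBackoff support, so this only shows the shape of the logic:

```python
import random
import time

def with_retry(command, attempts=5, base_delay=0.5, max_delay=8.0,
               retryable=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Retry a cache command with jittered exponential backoff, so the
    client survives the DNS flip to the newly promoted primary."""
    for attempt in range(attempts):
        try:
            return command()
        except retryable:
            if attempt == attempts - 1:
                raise                      # out of attempts: surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))  # jitter avoids a thundering herd
```

Pairing this with a short DNS TTL on the client side ensures reconnects resolve to the new primary rather than the failed node.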
This article reflects AWS features and pricing as of 2026. AWS services evolve rapidly — always verify against the official Amazon ElastiCache documentation before making production decisions.