EC2 Spot Instances: What They Are and How to Use Them

Definition

Amazon EC2 Spot Instances let you run EC2 workloads on AWS's spare compute capacity at discounts of up to 90% off On-Demand pricing. In exchange for the discount, AWS can reclaim the capacity with a two-minute interruption notice whenever demand for On-Demand or Reserved capacity rises in that instance pool. Spot is the right economic answer for fault-tolerant, flexible workloads — batch jobs, CI/CD runners, stateless web tiers, Kubernetes worker nodes — and it is one of the single biggest cost-optimization levers in the AWS compute catalog.

How It Works

Spot capacity is organized into capacity pools, where a pool is a unique combination of instance type, Availability Zone, and (for some features) platform. When you request Spot capacity, EC2 looks for an available pool that satisfies your constraints (instance types, AZs, vCPU/memory requirements via attribute-based instance-type selection). If one exists, the instance launches. If AWS later needs that capacity back for On-Demand, the Spot instance is reclaimed, following the interruption behavior you configured:

  • Terminate (default): the instance is terminated, the root EBS volume is deleted (unless you override that default), and you are billed for the seconds used.
  • Stop: the EBS-backed instance is stopped and its volumes are preserved. When capacity returns, EC2 restarts the instance and work resumes from disk. Requires a persistent Spot request.
  • Hibernate: RAM is saved to the EBS root volume and restored on resume, for applications that cannot cheaply rebuild in-memory state.

AWS publishes a Spot interruption notice via the instance metadata service (/latest/meta-data/spot/instance-action) and EventBridge roughly two minutes before reclamation, and a rebalance recommendation earlier when interruption risk rises. Handlers should drain connections, checkpoint, deregister from target groups, and exit cleanly.
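The polling side of such a handler can be sketched in a few lines of standard-library Python. This is a minimal sketch, not a production agent: it assumes IMDSv2 is enabled, that the endpoint returns HTTP 404 until a notice exists, and that the notice body is JSON with "action" and "time" fields. The function names, the 5-second interval, and the injectable fetch/sleep parameters are illustrative choices.

```python
import json
import time
import urllib.error
import urllib.request

IMDS = "http://169.254.169.254"
ACTION_PATH = "/latest/meta-data/spot/instance-action"


def fetch_instance_action(timeout=2.0):
    """Return the raw notice body, or None while no interruption is scheduled."""
    # IMDSv2: request a short-lived session token first.
    token_req = urllib.request.Request(
        IMDS + "/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()

    req = urllib.request.Request(
        IMDS + ACTION_PATH, headers={"X-aws-ec2-metadata-token": token}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:  # assumed meaning: no notice has been issued yet
            return None
        raise


def parse_notice(body):
    """Turn the notice JSON into an (action, time) tuple."""
    doc = json.loads(body)
    return doc["action"], doc["time"]


def wait_for_interruption(fetch=fetch_instance_action, interval=5.0,
                          max_polls=None, sleep=time.sleep):
    """Poll until a notice appears; return (action, time), or None if max_polls runs out."""
    polls = 0
    while max_polls is None or polls < max_polls:
        body = fetch()
        if body is not None:
            return parse_notice(body)
        polls += 1
        sleep(interval)
    return None
```

On an instance, a shutdown agent would call wait_for_interruption() in a background thread and then run its drain, checkpoint, and deregister steps. Passing fetch and sleep as parameters keeps the loop testable off-instance.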

Request types

  • One-time request: a single Spot instance that is not relaunched after interruption.
  • Persistent request: automatically relaunches (or restarts) the instance when capacity frees up; combine with stop or hibernate to preserve state.
  • EC2 Fleet / Spot Fleet: launch a set of Spot (and optionally On-Demand) instances across many instance types and AZs, with a target capacity and an allocation strategy (price-capacity-optimized, capacity-optimized, lowest-price, diversified).

price-capacity-optimized is AWS's recommended default: it first identifies the pools with the most available capacity, then launches from the lowest-priced of those, balancing cost against interruption risk.
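As an illustration of how diversification and the allocation strategy fit together, the sketch below builds a request dictionary in the shape that, to the best of my understanding, boto3's ec2.create_fleet accepts. It only constructs the dictionary and does not call AWS. The launch template name, instance types, and zones are hypothetical, and the key names should be checked against the current API reference before use.

```python
def build_fleet_request(template_name, instance_types, zones, total,
                        on_demand_base=0,
                        strategy="price-capacity-optimized"):
    """Build an EC2 Fleet style request; each (type, zone) pair is one capacity pool."""
    if not 0 <= on_demand_base <= total:
        raise ValueError("on_demand_base must be between 0 and total")

    # One override per pool: more pools gives the allocator more room to choose.
    overrides = [
        {"InstanceType": itype, "AvailabilityZone": zone}
        for itype in instance_types
        for zone in zones
    ]
    return {
        "Type": "maintain",  # keep replacing interrupted capacity
        "LaunchTemplateConfigs": [
            {
                "LaunchTemplateSpecification": {
                    "LaunchTemplateName": template_name,  # hypothetical name
                    "Version": "$Latest",
                },
                "Overrides": overrides,
            }
        ],
        "TargetCapacitySpecification": {
            "TotalTargetCapacity": total,
            "OnDemandTargetCapacity": on_demand_base,
            "SpotTargetCapacity": total - on_demand_base,
            "DefaultTargetCapacityType": "spot",
        },
        "SpotOptions": {"AllocationStrategy": strategy},
    }


request = build_fleet_request(
    "my-batch-template",                          # hypothetical
    ["m5.xlarge", "m5a.xlarge", "m6i.xlarge"],
    ["us-east-1a", "us-east-1b"],
    total=10,
    on_demand_base=2,
)
```

Three instance types across two zones give the fleet six pools to draw from. With boto3 installed and credentials configured, the dictionary would be passed as keyword arguments to the create_fleet call.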

Key Features and Limits

  • Discount: up to 90% off On-Demand; typical is 60–80%.
  • Warning: 2-minute interruption notice via IMDS and EventBridge; rebalance recommendation issued earlier.
  • Spot Fleet / EC2 Fleet: mix Spot + On-Demand, define allocation strategy, attribute-based instance-type selection (e.g., "any >= 4 vCPU, 16 GB, x86_64").
  • Interruption behavior: terminate, stop, or hibernate (stop/hibernate require persistent request and EBS-backed AMI).
  • Spot Blocks (deprecated): fixed-duration Spot Instances (1 to 6 hours) were closed to new customers in 2021 and have since been discontinued; do not rely on them in exam answers or production designs.
  • Capacity Reservations + Spot: you can layer On-Demand Capacity Reservations to cover critical baseline capacity and use Spot for everything else.
  • Integration: Auto Scaling Groups (mixed instances policy), ECS capacity providers, EKS managed node groups, Karpenter, AWS Batch, EMR instance fleets, SageMaker managed Spot training.
  • Spot Instance Advisor: a dashboard showing the historical frequency of interruption and the average savings per instance type and Region, so you can pick resilient pools.
  • No bidding: the "bid price" model was retired; you simply pay the current Spot price (optionally capped by a MaxPrice).
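The idea behind attribute-based instance-type selection can be illustrated with a toy filter. This is not the EC2 API: the catalog is a small hand-written sample, and select_types is a hypothetical helper that applies the "any >= 4 vCPU, 16 GB, x86_64" rule from the list above locally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: int
    arch: str


# Hand-written sample for illustration; a real selection runs against EC2's full catalog.
SAMPLE_CATALOG = [
    InstanceType("t3.medium", 2, 4, "x86_64"),
    InstanceType("c5.xlarge", 4, 8, "x86_64"),
    InstanceType("m5.xlarge", 4, 16, "x86_64"),
    InstanceType("m6g.xlarge", 4, 16, "arm64"),
    InstanceType("r5.xlarge", 4, 32, "x86_64"),
    InstanceType("m5.2xlarge", 8, 32, "x86_64"),
]


def select_types(catalog, min_vcpus, min_memory_gib, arch):
    """Return the names of all types that meet the minimum attributes."""
    return [
        t.name
        for t in catalog
        if t.vcpus >= min_vcpus
        and t.memory_gib >= min_memory_gib
        and t.arch == arch
    ]


matches = select_types(SAMPLE_CATALOG, min_vcpus=4, min_memory_gib=16, arch="x86_64")
print(matches)  # ['m5.xlarge', 'r5.xlarge', 'm5.2xlarge']
```

Describing requirements instead of naming types means newly released instance types that satisfy the rule become eligible pools without a configuration change.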

Common Use Cases

  1. Batch and data-processing workloads: AWS Batch and EMR running on Spot can sharply cut costs for Monte Carlo simulations, ETL, genomics, and rendering.
  2. CI/CD runners: GitHub Actions, GitLab, and Jenkins build fleets running ephemeral jobs.
  3. Stateless web and API tiers: behind a load balancer, with an ASG that blends Spot with an On-Demand baseline.
  4. Kubernetes worker pools: node groups marked as Spot-tolerant with taints/tolerations and PodDisruptionBudgets; tools like Karpenter handle rebalancing.
  5. Machine-learning training: SageMaker Managed Spot Training checkpoints to S3 and resumes on interruption, cutting training bills by up to 90%.
  6. Dev/test environments: short-lived workloads where interruption is an inconvenience, not an outage.
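Several of these use cases depend on checkpointing, so a resumable job loop is worth sketching. The example below is a minimal illustration under stated assumptions: a local JSON file stands in for durable storage such as S3, the "work" is just summing numbers, and should_stop is a hypothetical hook that a real job would wire to its interruption-notice handler.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return saved progress, or a fresh state if no checkpoint exists."""
    if not os.path.exists(path):
        return {"next_index": 0, "total": 0}
    with open(path) as fh:
        return json.load(fh)


def save_checkpoint(path, state):
    """Write to a temp file and rename, so a crash never leaves a partial checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump(state, fh)
    os.replace(tmp_path, path)


def run_job(items, path, should_stop=lambda: False):
    """Process items from the last checkpoint; return (state, finished)."""
    state = load_checkpoint(path)
    for i in range(state["next_index"], len(items)):
        if should_stop():  # e.g. an interruption notice arrived
            return state, False
        state["total"] += items[i]   # stand-in for real work
        state["next_index"] = i + 1
        save_checkpoint(path, state)
    return state, True
```

If the instance is reclaimed partway through, the replacement instance calls run_job with the same checkpoint location and continues from next_index instead of starting over.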

Pricing Model

Spot prices float over time based on supply and demand in each pool. Billing is per second with a 60-second minimum for Linux and Windows instances; a few commercial operating systems are still billed by the hour, so check the pricing page for your AMI. You pay the current Spot price, which you can optionally cap with a MaxPrice; if the price exceeds your cap, the instance is interrupted. Most workloads leave MaxPrice unset and let the price float.

Because prices are per pool, spreading across many pools (different instance types and AZs) tends to both lower the effective price and reduce interruption frequency. Layer Compute Savings Plans over an On-Demand baseline and run the burst capacity on Spot to minimize total spend.
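A quick worked calculation shows how the blend affects the bill. The numbers are hypothetical (a $0.20/hour On-Demand price and a 70% Spot discount), and the helpers ignore Savings Plans and price movement for simplicity.

```python
def blended_hourly_cost(on_demand_price, spot_discount, total, on_demand_count):
    """Hourly cost of a fleet that runs on_demand_count instances On-Demand and the rest on Spot."""
    if not 0 <= on_demand_count <= total:
        raise ValueError("on_demand_count must be between 0 and total")
    spot_price = on_demand_price * (1 - spot_discount)
    spot_count = total - on_demand_count
    return on_demand_count * on_demand_price + spot_count * spot_price


def savings_fraction(on_demand_price, spot_discount, total, on_demand_count):
    """Fraction saved compared with running the whole fleet On-Demand."""
    all_on_demand = total * on_demand_price
    blended = blended_hourly_cost(on_demand_price, spot_discount, total, on_demand_count)
    return 1 - blended / all_on_demand


# Hypothetical fleet: 10 instances, 2 kept On-Demand as a baseline.
cost = blended_hourly_cost(0.20, 0.70, total=10, on_demand_count=2)
saved = savings_fraction(0.20, 0.70, total=10, on_demand_count=2)
print(f"${cost:.2f}/hour, {saved:.0%} saved")
```

By hand: 2 × $0.20 + 8 × $0.06 = $0.88 per hour against $2.00 for an all On-Demand fleet, a saving of 56%. The On-Demand baseline caps the achievable saving below the raw Spot discount.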

Pros and Cons

Pros

  • The deepest discount available for EC2 capacity — up to 90% off.
  • Seamless integration with ASG, ECS, EKS, Batch, EMR, and SageMaker.
  • Attribute-based instance selection frees you from hard-coding instance types.
  • Two-minute warning + rebalance recommendations enable graceful handling.

Cons

  • Interruptions are unavoidable — workloads must be fault-tolerant.
  • State preservation (stop/hibernate) requires configuration and EBS-backed AMIs.
  • Capacity can evaporate for specific pools during regional demand spikes.
  • Unsuitable for long, uninterruptible jobs without checkpointing.

Comparison with Alternatives

| | Spot | On-Demand | Reserved / Savings Plans | Fargate Spot |
| --- | --- | --- | --- | --- |
| Discount | Up to 90% off | 0% | Up to 72% off | Up to 70% off |
| Commitment | None | None | 1 or 3 years | None |
| Interruptions | Yes, 2-min notice | No | No | Yes, 2-min SIGTERM |
| Best for | Fault-tolerant batch / stateless | Short-term, unpredictable | Steady baseline | Containerized fault-tolerant tasks |

Rule of thumb: cover steady baseline with Savings Plans, critical peaks with On-Demand Capacity Reservations, and everything flexible with Spot.

Exam Relevance

  • Solutions Architect Associate (SAA-C03) — recognize Spot as the cheapest EC2 pricing option with a 2-minute interruption notice; design fault-tolerant workloads that absorb interruptions.
  • Developer Associate (DVA-C02) — handle the Spot interruption notice via IMDS / EventBridge and implement graceful shutdown.
  • SysOps Administrator (SOA-C02) — Auto Scaling Group mixed-instances policies, capacity-optimized allocation strategies, Spot Fleet vs EC2 Fleet differences.
  • DevOps Professional (DOP-C02) — blending Spot with Savings Plans, designing resilient EKS/ECS node groups, integrating Spot into CI/CD and data pipelines.

Common exam trap: answers involving "Spot Blocks for a guaranteed uninterrupted duration" are wrong — Spot Blocks were deprecated in 2021. If a question requires guaranteed duration, pick On-Demand or Capacity Reservations.

Frequently Asked Questions

Q: What happens when a Spot Instance is interrupted?

A: AWS posts a Spot interruption notice to the instance metadata (/latest/meta-data/spot/instance-action) and EventBridge approximately two minutes before reclamation. Depending on the interruption behavior you configured, the instance is then terminated (default), stopped (EBS-backed, with state preserved for a persistent request), or hibernated (RAM serialized to EBS). During the two-minute window, a handler should drain load-balancer connections, checkpoint progress, flush buffers, and exit.

Q: How do I reduce Spot interruption frequency?

A: Diversify across many capacity pools — multiple instance types (via attribute-based instance-type selection) and multiple Availability Zones — and use the price-capacity-optimized allocation strategy in EC2 Fleet or mixed-instances ASGs. Consult the Spot Advisor for pools with historically low interruption rates, and subscribe to EC2 Instance rebalance recommendation events so you can proactively drain instances before the 2-minute notice.
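For the EventBridge side, a small router can separate the two signals. This sketch assumes the events arrive with source "aws.ec2", the detail-type strings shown below, and an "instance-id" field inside detail. The action labels and the sample instance ID are hypothetical, and the exact event shape should be confirmed against the EC2 documentation.

```python
INTERRUPTION = "EC2 Spot Instance Interruption Warning"
REBALANCE = "EC2 Instance Rebalance Recommendation"

# Pattern for an EventBridge rule that matches both signals.
EVENT_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": [INTERRUPTION, REBALANCE],
}


def classify_event(event):
    """Map an event to (action, instance_id), or None if it is not a Spot signal."""
    if event.get("source") != "aws.ec2":
        return None
    instance_id = event.get("detail", {}).get("instance-id")
    detail_type = event.get("detail-type")
    if detail_type == INTERRUPTION:
        return ("drain-now", instance_id)            # about two minutes left
    if detail_type == REBALANCE:
        return ("replace-proactively", instance_id)  # elevated risk, earlier signal
    return None


sample = {
    "source": "aws.ec2",
    "detail-type": REBALANCE,
    "detail": {"instance-id": "i-0123456789abcdef0"},  # hypothetical ID
}
print(classify_event(sample))  # ('replace-proactively', 'i-0123456789abcdef0')
```

A Lambda function or queue consumer behind the rule could call classify_event and then trigger a node drain or launch replacement capacity, depending on the label.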

Q: Can I run production services on Spot?

A: Yes, for stateless or checkpoint-able components. Typical patterns include: a Spot-heavy ASG behind an Application Load Balancer with a small On-Demand baseline, ECS or EKS node groups that blend Spot and On-Demand capacity providers, and AWS Batch or SageMaker training jobs that checkpoint to S3 and resume. Stateful or latency-sensitive workloads (primary databases, services that keep session state in instance memory rather than in an external store) should remain on On-Demand or Reserved capacity.
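The first pattern, a Spot-heavy Auto Scaling Group with a small On-Demand baseline, could be expressed as a mixed-instances policy along these lines. The sketch only builds the dictionary. The key names follow my understanding of the MixedInstancesPolicy parameter of the Auto Scaling API, and the template name and instance types are hypothetical.

```python
def build_mixed_instances_policy(template_name, instance_types,
                                 on_demand_base, on_demand_pct_above_base):
    """Build a mixed-instances policy with an On-Demand floor and Spot above it."""
    if on_demand_base < 0:
        raise ValueError("on_demand_base must not be negative")
    if not 0 <= on_demand_pct_above_base <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": template_name,  # hypothetical name
                "Version": "$Latest",
            },
            # Several types so the group can draw on more than one pool.
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": on_demand_base,
            "OnDemandPercentageAboveBaseCapacity": on_demand_pct_above_base,
            "SpotAllocationStrategy": "price-capacity-optimized",
        },
    }


policy = build_mixed_instances_policy(
    "web-tier-template",                        # hypothetical
    ["m5.large", "m5a.large", "m6i.large"],
    on_demand_base=2,
    on_demand_pct_above_base=0,
)
```

With a base of 2 and 0% On-Demand above it, the first two instances are On-Demand and all further scale-out is requested as Spot.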


This article reflects AWS features and pricing as of 2026. AWS services evolve rapidly — always verify against the official Amazon EC2 Spot Instances documentation before making production decisions.

Published: 4/17/2026 / Updated: 4/17/2026