EC2 Auto Scaling Group: What It Is and How It Works

Definition

An EC2 Auto Scaling Group (ASG) is the AWS construct that keeps a fleet of Amazon EC2 instances at the size you want, across the Availability Zones you want, launched from the template you want. You tell an ASG the minimum, desired, and maximum number of instances; you attach a Launch Template; and you configure scaling policies that react to load, schedules, or predictive forecasts. EC2 Auto Scaling handles the rest — launching replacements for unhealthy instances, rebalancing across AZs, and scaling in and out in response to CloudWatch metrics. ASGs are the foundation of horizontally-scaled, self-healing EC2 workloads on AWS.

How It Works

An ASG evaluates its state continuously against three capacity settings:

  • Minimum capacity — floor the fleet never drops below.
  • Maximum capacity — ceiling the fleet cannot exceed.
  • Desired capacity — the current target, which scaling policies adjust.

When the desired capacity diverges from the actual running count, the ASG launches or terminates instances to converge. Launches are driven by a Launch Template (Launch Configurations are legacy and do not support newer features), which specifies the AMI, instance type(s), IAM instance profile, user data, security groups, and block-device mapping.
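The convergence step described above can be sketched in a few lines. This is a simplified model, not the service's actual implementation: the `Capacity` class and `reconcile` function are hypothetical names, and clamping the desired value into the minimum/maximum band is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capacity:
    """The three capacity settings of a group (hypothetical model)."""
    minimum: int
    maximum: int
    desired: int


def reconcile(capacity: Capacity, running: int) -> int:
    """Return instances to launch (positive) or terminate (negative)."""
    if capacity.minimum > capacity.maximum:
        raise ValueError("minimum must not exceed maximum")
    # Assumption: the effective target is held inside [minimum, maximum].
    target = max(capacity.minimum, min(capacity.desired, capacity.maximum))
    return target - running


if __name__ == "__main__":
    fleet = Capacity(minimum=2, maximum=10, desired=6)
    print(reconcile(fleet, running=4))  # 2: launch two instances
    print(reconcile(fleet, running=7))  # -1: terminate one instance
```

A positive result means the group is under target and launches from the Launch Template; a negative result means it terminates according to its termination policy.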

ASGs rebalance across the Availability Zones you select and can be attached to one or more Elastic Load Balancing target groups so new instances are registered automatically and unhealthy ones are drained. Health signals come from EC2 status checks, ELB health checks, VPC Lattice health checks, or custom health checks that your own tooling reports through the SetInstanceHealth API.

Scaling policies

  • Target tracking — pick a metric (average CPU, ALB RequestCountPerTarget, custom CloudWatch metric) and a target value; ASG scales to keep the metric near the target. Simplest and recommended for most workloads.
  • Step scaling — define step adjustments based on alarm breach magnitude.
  • Simple scaling — single adjustment per alarm; mostly superseded by step.
  • Scheduled scaling — change capacity on a cron schedule (for example, scale up at 8am weekdays).
  • Predictive scaling — ML-based forecasts schedule proactive capacity changes before load arrives.
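To build intuition for target tracking, the sketch below uses a proportional estimate: if a per-instance average metric is above target, capacity grows by the same ratio. This is an assumption for illustration only; the service's real algorithm is not documented in this form, and `target_tracking_capacity` is a hypothetical helper.

```python
import math


def target_tracking_capacity(current: int, metric: float, target: float,
                             minimum: int, maximum: int) -> int:
    """Estimate the capacity that would bring `metric` back near `target`.

    Assumes the metric is a per-instance average that scales inversely
    with instance count (for example average CPU under steady load).
    """
    if current <= 0 or target <= 0:
        raise ValueError("current and target must be positive")
    proposed = math.ceil(current * metric / target)
    # The result is always held inside the group's min/max band.
    return max(minimum, min(proposed, maximum))


if __name__ == "__main__":
    # 4 instances at 75% average CPU with a 50% target
    print(target_tracking_capacity(4, 75.0, 50.0, minimum=2, maximum=10))  # 6
```

Rounding up biases the estimate toward availability: the fleet may run slightly under the target metric rather than slightly over it.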

Lifecycle and operations

  • Lifecycle hooks — pause instances in Pending:Wait (bootstrap before the instance enters service) or Terminating:Wait (drain, backup) so external automation can do work before the instance joins or leaves the fleet.
  • Instance Refresh — rolling replacement of all instances in an ASG to pick up a new Launch Template version or AMI; supports skip-matching, healthy percentage targets, and rollback.
  • Warm Pools — pre-initialized stopped instances that accelerate scale-out for slow-booting workloads.
  • Health checks — EC2 + ELB + custom; unhealthy instances are replaced.
  • Termination policies — Default, OldestInstance, NewestInstance, OldestLaunchTemplate, AllocationStrategy (for Spot), or a custom list.
  • Cooldown periods — default 300s pause after a simple scaling action to avoid flapping (target tracking and step scaling rely on instance warmup rather than the static cooldown).
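As an illustration of how external automation releases an instance from a lifecycle hook wait state, the sketch below turns a lifecycle event into the arguments for the CompleteLifecycleAction API. The event layout (a `detail` object carrying `AutoScalingGroupName`, `LifecycleHookName`, `LifecycleActionToken`, and `EC2InstanceId`) is how EventBridge is understood to deliver these events, but treat it as an assumption and verify against a real event; `completion_request` is a hypothetical helper.

```python
def completion_request(event: dict, succeeded: bool) -> dict:
    """Build CompleteLifecycleAction arguments from a lifecycle event."""
    detail = event["detail"]
    return {
        "AutoScalingGroupName": detail["AutoScalingGroupName"],
        "LifecycleHookName": detail["LifecycleHookName"],
        "LifecycleActionToken": detail["LifecycleActionToken"],
        "InstanceId": detail["EC2InstanceId"],
        # CONTINUE lets the instance proceed; ABANDON ends the launch
        # (on termination hooks the instance is terminated either way).
        "LifecycleActionResult": "CONTINUE" if succeeded else "ABANDON",
    }


if __name__ == "__main__":
    sample_event = {  # made-up values for illustration
        "detail": {
            "AutoScalingGroupName": "web-asg",
            "LifecycleHookName": "bootstrap-hook",
            "LifecycleActionToken": "example-token",
            "EC2InstanceId": "i-0123456789abcdef0",
        }
    }
    print(completion_request(sample_event, succeeded=True))
```

With boto3 installed and credentials configured, the returned dictionary could be passed as `boto3.client("autoscaling").complete_lifecycle_action(**request)` once the bootstrap or drain work has finished.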

Mixed-instances policy

A single ASG can launch across multiple instance types and purchase options (On-Demand + Spot) using a mixed-instances policy and attribute-based instance-type selection. You set an On-Demand base capacity (for example, 2 instances always On-Demand) and an On-Demand percentage above base (say, 20% On-Demand, 80% Spot), with price-capacity-optimized as the Spot allocation strategy. This is the canonical way to blend Spot savings with guaranteed capacity in a self-healing fleet.
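The base-plus-percentage arithmetic can be worked through with a short sketch. The function name `purchase_split` is hypothetical, and rounding the On-Demand share up is an assumption made here for illustration; check the service's actual rounding before relying on exact counts.

```python
import math


def purchase_split(desired: int, on_demand_base: int,
                   on_demand_pct_above_base: int) -> tuple[int, int]:
    """Return (on_demand, spot) instance counts for a desired capacity."""
    if not 0 <= on_demand_pct_above_base <= 100:
        raise ValueError("percentage must be between 0 and 100")
    # The base is filled with On-Demand first, up to the desired capacity.
    base = min(desired, on_demand_base)
    above_base = desired - base
    # Assumption: fractional On-Demand instances are rounded up.
    on_demand_above = math.ceil(above_base * on_demand_pct_above_base / 100)
    on_demand = base + on_demand_above
    return on_demand, desired - on_demand


if __name__ == "__main__":
    # Base of 2 On-Demand, then 20% On-Demand / 80% Spot above the base.
    print(purchase_split(12, on_demand_base=2, on_demand_pct_above_base=20))  # (4, 8)
```

With a desired capacity of 12, two instances form the On-Demand base, and the remaining ten split into two On-Demand and eight Spot.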

Key Features and Limits

  • Up to 20 AZs per ASG (one region).
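  • Lifecycle hook heartbeat timeout defaults to 3600s (settable from 30s to 7200s); recording heartbeats can extend the wait to at most 48 hours or 100 times the heartbeat timeout, whichever is smaller.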
  • Instance Refresh supports minimum-healthy-percentage, skip-matching, and automatic rollback to the prior template version on failure.
  • Warm Pools can hold instances in the Stopped, Hibernated, or Running state.
  • Standby state — move an instance out of service for patching/debug without terminating it.
  • Default 500 ASGs and 200 launch configurations per region (raise via Service Quotas).
  • Quota by vCPU — underlying EC2 quotas still apply.
  • Integration — ALB/NLB target groups, Route 53 health checks, CloudWatch alarms, EventBridge, Systems Manager, Auto Scaling groups as CloudFormation/Terraform resources.

Common Use Cases

  1. Self-healing web and API tiers — front-end ASG behind an ALB, scale on RequestCountPerTarget or CPU.
  2. Batch worker fleets — Spot-heavy ASG scaled on SQS queue depth via target tracking.
  3. Scheduled business hours — scale up weekdays 8–18 local, down outside.
  4. Resilient Kubernetes / ECS node groups — ASG underneath EKS managed node groups or ECS capacity providers.
  5. Rolling AMI deployments — Instance Refresh rolls a new AMI through the ASG with rollback safety.
  6. Disaster recovery — scheduled or event-driven scale-up in a secondary region.
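For the batch worker case (use case 2), raw queue depth is usually converted into a backlog-per-instance figure before it is used as a target tracking metric. The sketch below shows that arithmetic; the function names and the sample numbers are illustrative assumptions, not values from any real workload.

```python
import math


def acceptable_backlog_per_instance(max_latency_s: float,
                                    avg_processing_s: float) -> float:
    """Messages one instance may have queued and still meet the latency goal."""
    if max_latency_s <= 0 or avg_processing_s <= 0:
        raise ValueError("latency and processing time must be positive")
    return max_latency_s / avg_processing_s


def instances_needed(queue_depth: int, max_latency_s: float,
                     avg_processing_s: float) -> int:
    """Fleet size that keeps backlog per instance at or below the target."""
    target = acceptable_backlog_per_instance(max_latency_s, avg_processing_s)
    return math.ceil(queue_depth / target)


if __name__ == "__main__":
    # 1500 queued messages, 2 s per message, 60 s acceptable latency
    print(acceptable_backlog_per_instance(60, 2))  # 30.0
    print(instances_needed(1500, 60, 2))           # 50
```

In practice the backlog-per-instance value would be published as a custom CloudWatch metric and the acceptable backlog used as the target value of the policy.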

Pricing Model

EC2 Auto Scaling itself is free — you pay only for the underlying resources: EC2 instance-seconds, EBS volumes, data transfer, and attached load balancers. Predictive scaling and lifecycle hooks cost nothing extra. Warm Pools incur normal EBS volume storage charges for the stopped instances and brief instance-seconds during pre-initialization.

Because ASGs can blend Spot and On-Demand, they are also one of the most effective cost-optimization tools on AWS when combined with Compute Savings Plans for the baseline.

Pros and Cons

Pros

  • Automatic self-healing — unhealthy instances are replaced without human intervention.
  • Elastic capacity aligned with real load (target tracking) or forecasts (predictive).
  • Native blending of On-Demand and Spot via mixed-instances policies.
  • Instance Refresh enables safe, rollback-capable AMI rollouts.
  • Tight integration with ELB, ECS, EKS, CloudWatch, EventBridge.

Cons

  • Poorly-tuned policies can oscillate (flap) — cooldowns and instance warmup matter.
  • Stateful workloads need careful termination-protection and lifecycle-hook design.
  • Scaling-out latency is bounded by AMI/user-data boot time (use Warm Pools to shortcut).
  • Launch Configurations are legacy — teams still on them lose access to newer features.

Comparison with Alternatives

| | EC2 ASG | Application Auto Scaling | Karpenter / Cluster Autoscaler | Fargate auto scale |
| --- | --- | --- | --- | --- |
| Target | EC2 fleets | ECS services, DynamoDB, etc. | EKS node provisioning | ECS/EKS tasks on Fargate |
| Policies | Target/step/scheduled/predictive | Target/step/scheduled | Pod-driven, bin-packed | Target tracking on task count |
| Spot mix | Built-in mixed policy | N/A | Built-in | Capacity providers with Fargate Spot |
| Self-healing | Yes | N/A (task level via ECS) | Node-level | Task-level |

Application Auto Scaling handles non-EC2 targets (DynamoDB, Aurora read replicas, ECS services). Karpenter replaces ASG-based cluster autoscaling inside EKS with faster, bin-packed node provisioning. For plain EC2 fleets, ASG is still the canonical answer.

Exam Relevance

  • Solutions Architect Associate (SAA-C03) — pick the right scaling policy; recognize when to use mixed-instances with Spot; know that ASG replaces unhealthy instances automatically.
  • Developer Associate (DVA-C02) — lifecycle hooks for bootstrapping, user-data interactions, instance warmup time.
  • SysOps Administrator (SOA-C02) — cooldown periods, health-check grace periods, termination policies, Instance Refresh configuration.
  • DevOps Professional (DOP-C02) — Instance Refresh for rolling AMI rollouts, Warm Pools for fast scale-out, predictive scaling.

Common exam trap: the health check grace period defaults to 0 seconds when a group is created through the API, CLI, or SDK, and to 300 seconds when it is created in the console. If the grace period is too short, an instance can be marked unhealthy and terminated while it is still booting.
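The effect of the grace period can be modeled as a simple time comparison. This is a simplified sketch: it assumes the grace period is measured from the moment the instance enters service, and `failure_counts` is a hypothetical helper rather than anything the service exposes.

```python
from datetime import datetime, timedelta, timezone


def failure_counts(in_service_at: datetime, checked_at: datetime,
                   grace_period_s: int) -> bool:
    """Return True if a failed health check at `checked_at` is acted on."""
    # Failures inside the grace period are ignored; after it they count.
    return checked_at - in_service_at >= timedelta(seconds=grace_period_s)


if __name__ == "__main__":
    start = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
    two_minutes_in = start + timedelta(seconds=120)
    # An app that needs three minutes to boot fails its check at 120 s.
    print(failure_counts(start, two_minutes_in, grace_period_s=300))  # False
    print(failure_counts(start, two_minutes_in, grace_period_s=0))    # True
```

The second call is the trap in miniature: with no grace period, the still-booting instance is replaced, and its replacement can meet the same fate.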

Frequently Asked Questions

Q: What is the difference between a Launch Template and a Launch Configuration?

A: A Launch Template is the current, actively developed resource for describing how an ASG launches instances. It supports newer features (multiple instance types, On-Demand + Spot mixes, T2/T3 Unlimited, licensing configurations, placement group targeting, IMDSv2 defaults, network card assignments) and is versioned so you can roll forward and back. A Launch Configuration is the legacy resource; it is immutable, does not support mixed instances or newer features, and is deprecated for new use. Always use Launch Templates.

Q: How do I safely roll out a new AMI through an ASG?

A: Use Instance Refresh. You bump the AMI ID in a new Launch Template version, start an Instance Refresh with a minimum healthy percentage (for example 90%), and optionally enable auto-rollback. The ASG replaces instances in batches, respects lifecycle hooks and ELB draining, and — if health checks fail beyond the configured tolerance — rolls back to the prior Launch Template version automatically.
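A minimal sketch of the request described above, expressed as the arguments for the StartInstanceRefresh API. The helper `instance_refresh_request` and the group name are hypothetical; the preference keys (`MinHealthyPercentage`, `SkipMatching`, `AutoRollback`) are the API's own, though the service may reject some combinations depending on how the group's Launch Template version is referenced.

```python
def instance_refresh_request(group_name: str,
                             min_healthy_pct: int = 90,
                             auto_rollback: bool = True,
                             skip_matching: bool = True) -> dict:
    """Build StartInstanceRefresh arguments for a rolling replacement."""
    if not 0 <= min_healthy_pct <= 100:
        raise ValueError("min_healthy_pct must be between 0 and 100")
    return {
        "AutoScalingGroupName": group_name,
        "Strategy": "Rolling",
        "Preferences": {
            # Share of the group that must stay healthy during the refresh.
            "MinHealthyPercentage": min_healthy_pct,
            # Skip instances that already match the desired configuration.
            "SkipMatching": skip_matching,
            # Revert to the previous configuration if the refresh fails.
            "AutoRollback": auto_rollback,
        },
    }


if __name__ == "__main__":
    print(instance_refresh_request("web-asg"))  # "web-asg" is a made-up name
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("autoscaling").start_instance_refresh(**request)` after the new Launch Template version has been created.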

Q: When should I use target tracking versus step scaling?

A: Target tracking is the default recommendation — you pick a metric like average CPU or ALBRequestCountPerTarget, set a target value, and the ASG maintains it without you writing alarms. Use step scaling when you need multi-step reactions to alarm magnitude (for example, +1 instance on 70% CPU but +4 on 90% CPU) or when target tracking cannot express your metric (some custom or composite signals). Avoid simple scaling — it is superseded by step scaling.
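The multi-step example in this answer (+1 instance at 70% CPU, +4 at 90%) can be sketched as a lookup over step boundaries. This uses absolute CPU thresholds for readability, which is a simplification: real step adjustments are expressed as intervals relative to the alarm threshold. `STEPS` and `step_adjustment` are hypothetical names.

```python
from bisect import bisect_right

# (lower bound in % CPU, instances to add), sorted by lower bound
STEPS = [(70.0, 1), (90.0, 4)]


def step_adjustment(cpu_percent: float, steps=STEPS) -> int:
    """Return the capacity change for the step that `cpu_percent` falls in."""
    bounds = [lower for lower, _ in steps]
    # Index of the last step whose lower bound is <= cpu_percent.
    index = bisect_right(bounds, cpu_percent)
    return steps[index - 1][1] if index else 0


if __name__ == "__main__":
    for cpu in (65.0, 75.0, 95.0):
        print(cpu, step_adjustment(cpu))  # 0, then 1, then 4
```

Target tracking has no equivalent of this table, which is why step scaling remains useful when the size of the reaction should depend on how far the alarm is breached.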


This article reflects AWS features and pricing as of 2026. AWS services evolve rapidly — always verify against the official Amazon EC2 Auto Scaling documentation before making production decisions.

Published: 4/17/2026 / Updated: 4/19/2026

