AWS Lambda: What It Is and When to Use It

Definition

AWS Lambda is a serverless, event-driven compute service that runs your code in response to events and automatically manages the underlying compute resources. You upload a function (as a ZIP file or container image), configure a trigger (an API Gateway request, an S3 upload, a DynamoDB stream, a scheduled EventBridge rule, or a direct SDK invocation), and AWS runs it on demand — scaling from zero to thousands of concurrent executions in seconds and charging only for the time your code is actually running.

How It Works

Lambda follows an event-driven model. An event source invokes a function, Lambda spins up an isolated execution environment (a lightweight microVM built on the Firecracker hypervisor), loads your code, and runs the handler:

  1. Event sources — synchronous (API Gateway, Application Load Balancer, Function URLs, direct Invoke), asynchronous (S3, SNS, EventBridge), or poll-based (SQS, Kinesis, DynamoDB Streams, Amazon MSK, self-managed Apache Kafka).
  2. Execution environment — a sandboxed microVM with your chosen runtime (Node.js, Python, Java, .NET, Go, Ruby, or a custom runtime via the Runtime API or a container image up to 10 GB).
  3. Handler function — the entry point AWS calls with an event payload and a context object describing the invocation. The handler returns a result or raises an error, and Lambda ships its log output to CloudWatch Logs.
  4. Scaling — for each concurrent request, Lambda reuses a warm environment if one is available (the "warm path"); otherwise it provisions a fresh one (the cold start). Environments are kept warm for several minutes between invocations, though the exact duration is not guaranteed.
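The handler model in step 3 can be sketched as a minimal Python function. The function name, payload field, and response shape below are illustrative — any handler name works as long as it matches the function's configured handler setting:

```python
import json

def lambda_handler(event, context):
    """Entry point Lambda calls with the event payload and a context object.

    `event` is a dict whose shape depends on the trigger; `context` exposes
    metadata such as the request ID and remaining execution time.
    """
    name = event.get("name", "world")  # illustrative payload field
    body = {"message": f"Hello, {name}!"}
    # A synchronous caller (e.g. API Gateway proxy integration) expects
    # a statusCode and a JSON-serialized body in the return value.
    return {"statusCode": 200, "body": json.dumps(body)}
```

Invoked with `{"name": "Lambda"}`, this returns a 200 response whose body is `{"message": "Hello, Lambda!"}`.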

Functions can join a VPC to reach private resources (RDS, ElastiCache, or on-premises through Direct Connect). Versions and Aliases allow immutable deployments and weighted-traffic rollouts.

Key Features and Limits

  • Memory: 128 MB to 10,240 MB in 1 MB increments. vCPU scales proportionally with memory — more memory = more CPU.
  • Timeout: 1 second to 15 minutes per invocation.
  • Ephemeral storage (/tmp): 512 MB up to 10,240 MB, configurable per function.
  • Deployment package: 50 MB zipped / 250 MB unzipped via direct upload; up to 10 GB via container image.
  • Default concurrency: 1,000 concurrent executions per account per Region (soft quota, raisable).
  • Environment variables: up to 4 KB per function.
  • Payload size: 6 MB for synchronous invocation, 256 KB for asynchronous.
  • Layers: up to 5 Layers per function; the unzipped size of the function plus all its Layers must stay within the 250 MB limit. Great for sharing dependencies across functions.
  • Cold starts: 100 ms – several seconds depending on runtime, package size, and VPC configuration. Mitigated with Provisioned Concurrency or Lambda SnapStart (Java, Python 3.12+, .NET 8+).
  • Response streaming: stream large responses back to HTTP callers in chunks (up to 20 MB).
  • Function URLs: built-in HTTPS endpoints without API Gateway.
  • Lambda extensions: sidecar-like processes that share the execution environment — used by observability vendors (Datadog, New Relic) to flush telemetry.
  • Durable, long-running workflows via Step Functions — orchestrate Lambda functions in state machines that can run for up to 1 year.
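Because vCPU scales linearly with memory, you can estimate the CPU share a given memory setting buys. AWS documents that a function receives the equivalent of one full vCPU at 1,769 MB; the helper below is a rough sketch based on that ratio:

```python
FULL_VCPU_MEMORY_MB = 1769  # memory at which Lambda allocates ~1 full vCPU

def estimated_vcpus(memory_mb: int) -> float:
    """Approximate vCPU share for a Lambda memory setting (128-10,240 MB)."""
    if not 128 <= memory_mb <= 10240:
        raise ValueError("Lambda memory must be between 128 and 10,240 MB")
    return memory_mb / FULL_VCPU_MEMORY_MB
```

At 128 MB you get roughly 0.07 of a vCPU; at the 10,240 MB ceiling, roughly 5.8 vCPUs — which is why CPU-bound functions often get faster (and sometimes cheaper per invocation) simply by raising the memory setting.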

Common Use Cases

  1. REST and GraphQL API backends — Lambda behind API Gateway (REST or HTTP API) is the canonical serverless web backend.
  2. S3 event processing — resize images on upload, scan PDFs with Textract, index documents into OpenSearch.
  3. Data pipeline steps — Kinesis Data Streams or MSK trigger Lambda for ETL and enrichment; the processed records flow into S3, DynamoDB, or Redshift.
  4. Scheduled jobs and automation — EventBridge Rules invoke Lambda on a cron schedule for maintenance tasks (expire RDS snapshots, rotate secrets, cleanup CloudWatch Log Groups).
  5. Glue code between services — SNS topic fan-out, EventBridge filtering, cross-account event forwarding, CloudTrail response automation.
  6. IoT and mobile backends — AWS IoT Rules or AWS AppSync resolvers delegate business logic to Lambda.
  7. Event-driven integration with generative AI — orchestrate Bedrock Agents, call Amazon Q, or run RAG retrieval on S3 + OpenSearch.
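As a concrete illustration of use case 2, an S3-triggered function receives a `Records` list describing the uploaded objects. The sketch below only parses the event payload; a real handler would go on to fetch and process each object with boto3:

```python
from urllib.parse import unquote_plus

def lambda_handler(event, context):
    """Extract (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # S3 delivers object keys URL-encoded (spaces arrive as '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
        # A real handler would now download and process the object,
        # e.g. s3.get_object(Bucket=bucket, Key=key) via a boto3 client.
    return objects
```

Decoding the key with `unquote_plus` matters in practice: a file named `cat pic.jpg` arrives in the event as `cat+pic.jpg`, and passing the raw key back to S3 would fail with a NoSuchKey error.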

Pricing Model

Lambda pricing has three components:

  1. Request pricing — per 1 million requests. Free Tier includes 1 million requests/month forever.
  2. Duration pricing — priced per GB-second (memory × duration). Free Tier includes 400,000 GB-seconds/month. The x86 rate is higher than Graviton (Arm); switching a compatible function to arm64 cuts duration cost by ~20%.
  3. Provisioned Concurrency — you pre-warm N concurrent environments; you pay a GB-second charge for the standby time, and invocations served by provisioned capacity are billed at a lower duration rate.
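The GB-second math can be worked through in a few lines. The rates below are illustrative (roughly the published us-east-1 x86 rates at the time of writing — verify current pricing before relying on them):

```python
PRICE_PER_MILLION_REQUESTS = 0.20   # USD, illustrative x86 rate
PRICE_PER_GB_SECOND = 0.0000166667  # USD, illustrative x86 rate

def monthly_cost(requests: int, memory_mb: int, avg_duration_ms: float) -> float:
    """Estimate monthly Lambda cost before Free Tier deductions."""
    # GB-seconds = invocations x duration (s) x memory (GB)
    gb_seconds = requests * (avg_duration_ms / 1000) * (memory_mb / 1024)
    duration_cost = gb_seconds * PRICE_PER_GB_SECOND
    request_cost = (requests / 1_000_000) * PRICE_PER_MILLION_REQUESTS
    return duration_cost + request_cost
```

One million 100 ms invocations at 512 MB works out to 50,000 GB-seconds, roughly $1.03/month at these rates — and the perpetual Free Tier would cover it entirely.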

Extras: data transfer out, ephemeral storage beyond 512 MB, and function URLs / API Gateway each have their own bill lines. Use the AWS Pricing Calculator to model spiky workloads — Lambda's "pay for what you use" model is usually cheaper than EC2 below a certain sustained utilization, and more expensive above it.

Pros and Cons

Pros

  • No servers to manage — patching, scaling, and high availability are AWS's job.
  • Scale to zero — you pay nothing when there is no traffic.
  • Fine-grained scaling — each request gets an execution environment, up to the concurrency limit.
  • Native integrations with 200+ AWS services through EventBridge and direct triggers.
  • Per-function IAM roles enforce least privilege.

Cons

  • 15-minute hard limit — long-running workloads need Step Functions, Fargate, or EC2.
  • Cold starts — still painful for latency-sensitive APIs in Java/.NET without SnapStart or Provisioned Concurrency.
  • Deployment package size limits make large ML models cumbersome without container images.
  • Statelessness — all state must live in external services (DynamoDB, ElastiCache, S3).
  • Cost surprises at very high, steady request volumes — Fargate or EC2 may be cheaper past a certain break-even point.

Comparison with Alternatives

| | Lambda | Fargate | EC2 |
| --- | --- | --- | --- |
| Unit of work | Function (one invocation) | Container task (long-running) | Virtual machine |
| Max runtime | 15 min | Unlimited | Unlimited |
| Scale-to-zero | Yes | Partial (task count = 0) | No (without tricks) |
| Cold start | Sub-second to several seconds | 30–90 seconds | Minutes |
| Pricing | Per request + GB-second | Per vCPU-second + GB-second | Per instance-second |
| Best for | Event-driven glue, short APIs | Long-running stateless containers | Legacy apps, HPC, long-running services |

Compared with Google Cloud Functions and Azure Functions, Lambda has the deepest ecosystem of triggers and integrations, but all three offer similar core mechanics.

Exam Relevance

Lambda is core material across several AWS certifications:

  • Cloud Practitioner (CLF-C02) — what serverless means, why it matters.
  • Solutions Architect Associate (SAA-C03) — picking Lambda vs ECS vs Fargate, API Gateway + Lambda patterns, scaling limits, VPC impact on cold starts.
  • Developer Associate (DVA-C02) — heavy coverage: Lambda Layers, SnapStart, concurrency (reserved vs provisioned), versioning, aliases, traffic shifting, environment variables, SAM templates, error handling with Dead Letter Queues / on-failure destinations.
  • DevOps Professional (DOP-C02) — CodeDeploy Canary/Linear deploys of Lambda, Step Functions orchestration, observability via X-Ray and Powertools.

Key exam gotchas: synchronous retry semantics differ from asynchronous (async retries twice, then goes to DLQ); reserved concurrency also acts as a throttle (capping maximum concurrency), while provisioned concurrency pre-initializes environments but does not itself limit scale.

Frequently Asked Questions

Q: What is a Lambda cold start and how can I reduce it?

A: A cold start is the latency incurred when Lambda has to initialize a fresh execution environment before running your handler — typically 100 ms to several seconds, worst for Java and .NET. You can reduce cold starts by using Lambda SnapStart (for Java, Python 3.12+, .NET 8+), enabling Provisioned Concurrency, trimming large dependencies, and moving expensive setup to module scope so warm invocations skip it. VPC attachment adds far less cold-start overhead than it once did, now that Lambda uses shared Hyperplane ENIs by default.
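One mitigation worth showing in code: do expensive initialization once at module load, so only the cold start pays for it and every warm invocation reuses the result. The sketch below simulates the pattern — `heavy_init` is a stand-in for creating SDK clients, loading config, or opening connection pools:

```python
_INIT_COUNT = 0  # counts how many times cold-start init actually ran

def heavy_init():
    """Stand-in for expensive one-time setup (SDK clients, config, pools)."""
    global _INIT_COUNT
    _INIT_COUNT += 1
    return {"client": "ready"}

# Module scope runs once per execution environment, during the cold start.
# Every warm invocation of the handler reuses RESOURCES without re-paying.
RESOURCES = heavy_init()

def lambda_handler(event, context):
    return {"init_runs": _INIT_COUNT, "client": RESOURCES["client"]}
```

Two back-to-back invocations in the same environment both report `init_runs == 1`; only a new (cold) environment runs `heavy_init` again.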

Q: How is Lambda billed?

A: Lambda bills you per request and per GB-second of execution duration (memory allocated × time, rounded to the millisecond). The Free Tier includes 1 million requests and 400,000 GB-seconds per month indefinitely. Provisioned Concurrency and the Arm (Graviton) architecture are separate rates; Arm is roughly 20% cheaper for supported runtimes.

Q: When should I choose Lambda instead of ECS or EC2?

A: Choose Lambda when your workload is event-driven, bursty, runs for less than 15 minutes per invocation, and benefits from scale-to-zero. Choose ECS/Fargate for long-running stateless containers or workloads that exceed Lambda's limits. Choose EC2 for workloads that need full OS control, specific hardware, or are cheaper to run on reserved capacity at high sustained utilization.


This article reflects AWS features and pricing as of 2026. AWS services evolve rapidly — always verify against the official AWS Lambda documentation before making production decisions.

Published: 4/16/2026

