Amazon S3: What It Is and When to Use It
Definition
Amazon Simple Storage Service (Amazon S3) is an object storage service from Amazon Web Services that lets you store and retrieve any amount of data — from a single text file to exabytes — through a simple HTTPS API. It is designed for 99.999999999% (11 nines) durability and is the foundation for most data-intensive workloads on AWS, including data lakes, backups, static websites, and machine-learning training datasets.
How It Works
S3 is organized around three simple concepts:
- Buckets — named containers for your data. Bucket names are globally unique across all of AWS.
- Objects — the files you store, ranging from 0 bytes to 5 TB each. Every object has a key (its path within the bucket), metadata, and content.
- Keys — the full path that uniquely identifies an object within a bucket (e.g., reports/2026/q1.csv).
When you upload an object, S3 automatically replicates it across multiple devices in multiple facilities within an AWS Region, which is how it achieves its 11-nines durability design target (a design goal, not a contractual guarantee). You interact with S3 through the AWS SDKs, the AWS CLI, the Management Console, the REST API, or tools like Terraform and CloudFormation.
S3 also offers strong read-after-write consistency for PUT and DELETE requests in every AWS Region — once a write succeeds, any subsequent read of that object returns the latest data (Cross-Region Replication remains asynchronous and is not covered by this guarantee). Strong consistency was introduced in December 2020 and removed a long-standing class of bugs that previously required application-level workarounds.
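Although keys like reports/2026/q1.csv look like filesystem paths, the namespace inside a bucket is flat: "folders" are just key prefixes that list operations filter on. The sketch below simulates that idea in plain Python — the bucket is a dict and no AWS calls are made; it only illustrates how prefix plus delimiter listing makes flat keys look hierarchical:

```python
# Simulate an S3 bucket's flat key namespace as a dict: key -> object bytes.
# "Folders" do not exist as real entities; listing with a prefix and a
# delimiter is what makes keys look hierarchical in the console.
bucket = {
    "reports/2026/q1.csv": b"...",
    "reports/2026/q2.csv": b"...",
    "reports/2025/q4.csv": b"...",
    "logo.png": b"...",
}

def list_objects(bucket, prefix="", delimiter=None):
    """Mimic the shape of ListObjectsV2: return (keys, common_prefixes)."""
    keys, common = [], set()
    for key in sorted(bucket):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Everything up to the first delimiter collapses into a "folder".
            common.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            keys.append(key)
    return keys, sorted(common)

keys, prefixes = list_objects(bucket, prefix="reports/", delimiter="/")
print(prefixes)  # ['reports/2025/', 'reports/2026/']
print(list_objects(bucket, prefix="reports/2026/")[0])
```

This is also why "renaming a folder" in S3 is really a copy-and-delete of every object sharing that prefix.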
Key Features and Limits
- Durability: 99.999999999% (11 nines) — on average, you would lose a single object every 10,000 years if you stored 10 million objects.
- Availability: 99.99% for S3 Standard (Service Level Agreement: 99.9%).
- Object size: 0 bytes to 5 TB per object. Objects larger than 100 MB should use multipart upload; multipart is required for objects over 5 GB.
- Bucket limits: 10,000 general purpose buckets per account by default (soft limit, raisable to 1 million). Bucket names must be globally unique and 3–63 characters.
- Storage classes (8 options as of 2026):
- S3 Standard — frequent access; highest availability.
- S3 Intelligent-Tiering — automatically moves objects across Frequent, Infrequent, Archive Instant, Archive, and Deep Archive tiers based on access patterns.
- S3 Standard-Infrequent Access (Standard-IA) — lower storage cost, retrieval fee, minimum 30-day storage.
- S3 One Zone-IA — single Availability Zone, ~20% cheaper than Standard-IA.
- S3 Express One Zone — single-digit-millisecond latency, 10× faster than Standard, for high-performance request-heavy workloads.
- S3 Glacier Instant Retrieval — millisecond retrieval for archive data accessed quarterly.
- S3 Glacier Flexible Retrieval — minutes-to-hours retrieval; expedited, standard, and bulk options.
- S3 Glacier Deep Archive — cheapest; 12–48-hour retrieval; minimum 180-day storage.
- Security: Block Public Access is on by default for new buckets. Supports bucket policies, IAM policies, Access Points, ACLs, Object Ownership, server-side encryption (SSE-S3, SSE-KMS, SSE-C, DSSE-KMS), and encryption in transit via TLS.
- Data management: Lifecycle policies to transition and expire objects, Versioning, Object Lock (WORM/compliance), Cross-Region and Same-Region Replication, Batch Operations, and S3 Event Notifications.
- Analytics: S3 Storage Lens, Storage Class Analysis, and S3 Inventory.
- Data processing: S3 Object Lambda (transform objects on GET), S3 Select.
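The multipart-upload limits implied above are concrete: S3 documents a 5 MiB minimum part size (except the last part), a 5 GiB maximum part size, and at most 10,000 parts per upload. Those constraints determine how a large object must be split. The sizing strategy below (start at the minimum, grow only as needed) is an illustrative choice, not a prescribed algorithm:

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3

MIN_PART = 5 * MIB      # minimum part size (all parts except the last)
MAX_PART = 5 * GIB      # maximum part size
MAX_PARTS = 10_000      # maximum number of parts per upload

def choose_part_size(object_size: int) -> tuple[int, int]:
    """Return (part_size, part_count) satisfying S3's multipart limits.

    Strategy (illustrative): start at the 5 MiB minimum and grow the
    part size just enough to keep the part count at or under 10,000.
    """
    part_size = max(MIN_PART, math.ceil(object_size / MAX_PARTS))
    if part_size > MAX_PART:
        raise ValueError("object exceeds the multipart maximum object size")
    part_count = max(1, math.ceil(object_size / part_size))
    return part_size, part_count

# A 1 TiB object cannot fit in 10,000 parts of 5 MiB, so the part size grows.
size, count = choose_part_size(1024 * GIB)
print(size // MIB, count)
```

The 10,000-part cap times the 5 GiB maximum part size is also where the 5 TB (really 5 TiB) object-size ceiling comes from.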
Common Use Cases
- Data lakes — storing raw data at scale for query engines like Amazon Athena, Redshift Spectrum, or EMR.
- Static website hosting — serving HTML, CSS, JS, and media assets directly from S3, often behind CloudFront for global distribution.
- Backups and disaster recovery — EBS snapshots, RDS backups, and third-party backup targets land in S3 by default. Cross-Region Replication + Object Lock give regulator-friendly WORM backups.
- Application assets and uploads — user avatars, document uploads, transcoded video — typically written with pre-signed URLs so the client uploads directly to S3 without streaming through your app servers.
- Machine learning training data — S3 is the primary data source for Amazon SageMaker, Bedrock Knowledge Bases, and third-party ML pipelines.
- Log archival — CloudTrail, VPC Flow Logs, ALB access logs, and CloudFront logs all write to S3; lifecycle rules then transition them to Glacier classes.
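Pre-signed URLs, mentioned above for direct client uploads and downloads, are ordinary Signature Version 4 query signatures and can be computed entirely offline. In practice you would call `generate_presigned_url` in boto3; the stdlib-only sketch below exists purely to show what such a call produces. The bucket name, region, and credentials are placeholders, so the resulting URL is well-formed but will not authorize against AWS:

```python
import hashlib
import hmac
from datetime import datetime, timezone
from urllib.parse import quote

def presign_get(bucket, key, region, access_key, secret_key, expires=3600):
    """Build a SigV4 query-signed GET URL for an S3 object (sketch)."""
    host = f"{bucket}.s3.{region}.amazonaws.com"
    now = datetime.now(timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    datestamp = now.strftime("%Y%m%d")
    scope = f"{datestamp}/{region}/s3/aws4_request"

    # Query parameters, already in the sorted order SigV4 requires.
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    qs = "&".join(f"{k}={quote(v, safe='')}" for k, v in params.items())

    canonical_request = "\n".join([
        "GET", "/" + quote(key), qs,
        f"host:{host}\n", "host", "UNSIGNED-PAYLOAD",
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode()).hexdigest(),
    ])

    # Derive the signing key by chaining HMAC-SHA256 over the scope parts.
    def _hmac(k, msg):
        return hmac.new(k, msg.encode(), hashlib.sha256).digest()
    k = _hmac(("AWS4" + secret_key).encode(), datestamp)
    for part in (region, "s3", "aws4_request"):
        k = _hmac(k, part)
    sig = hmac.new(k, string_to_sign.encode(), hashlib.sha256).hexdigest()

    return f"https://{host}/{quote(key)}?{qs}&X-Amz-Signature={sig}"

# Placeholder credentials: the URL is structurally valid but inert.
url = presign_get("example-bucket", "reports/2026/q1.csv", "us-east-1",
                  "AKIAEXAMPLE", "secretexample")
print(url[:80])
```

The key property: the signature is computed from the secret key without any network round trip, which is why your app servers can hand out upload/download URLs without proxying the bytes themselves.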
Pricing Model
S3 bills you for four main dimensions, and costs vary by region and storage class:
- Storage — per GB-month, varies dramatically by class. Glacier Deep Archive is roughly 1/25th the cost of Standard.
- Requests — billed per 1,000 requests, with PUT/COPY/POST/LIST priced roughly ten times higher than GET/SELECT. Archive tiers have higher request costs.
- Data transfer out — transfers within the same Region to most AWS services are free; transfers out to the internet are paid; transfers to CloudFront are free (origin fetches).
- Management features — Replication, S3 Inventory, Storage Lens advanced metrics, Lifecycle rules, and Object Tagging each have their own charges.
The AWS Free Tier includes 5 GB of Standard storage, 20,000 GET and 2,000 PUT requests per month for 12 months. Use the AWS Pricing Calculator to model specific workloads.
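The four dimensions compose into a simple additive cost model. The rates below are illustrative placeholders in the ballpark of us-east-1 S3 Standard pricing, not quoted prices; check the pricing page before relying on any of them:

```python
# Illustrative, approximate rates (USD, us-east-1 ballpark), NOT official prices.
STORAGE_PER_GB_MONTH = 0.023   # S3 Standard storage
PUT_PER_1000 = 0.005           # PUT/COPY/POST/LIST requests
GET_PER_1000 = 0.0004          # GET/SELECT requests
EGRESS_PER_GB = 0.09           # data transfer out to the internet

def monthly_cost(storage_gb, puts, gets, egress_gb):
    """Sum the four main S3 billing dimensions for one month."""
    return (storage_gb * STORAGE_PER_GB_MONTH
            + puts / 1000 * PUT_PER_1000
            + gets / 1000 * GET_PER_1000
            + egress_gb * EGRESS_PER_GB)

# 500 GB stored, 100k uploads, 2M downloads, 50 GB served to the internet.
print(round(monthly_cost(500, 100_000, 2_000_000, 50), 2))  # → 17.3
```

Note how storage dominates here while two million GETs cost well under a dollar — and how a modest 50 GB of internet egress is already the second-largest line item, which previews the "surprising bill" point below.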
Pros and Cons
Pros
- Essentially unlimited scale with no capacity planning.
- 11 nines of durability and strong consistency — you rarely need to think about losing data.
- Deep integration with almost every AWS service.
- Rich ecosystem of tools (SDKs, Terraform, rclone, s3cmd, S3-compatible clients).
- Intelligent-Tiering turns cost optimization into a one-checkbox operation for most workloads.
Cons
- Not a filesystem — you cannot mount a bucket as a disk for POSIX workloads (use EFS or FSx for that, or Mountpoint for S3 for read-heavy workloads).
- Per-request charges add up for chatty workloads doing millions of small GETs.
- Data transfer out to the internet is often the most surprising line item on an AWS bill.
- Misconfigured bucket policies or public access settings have historically caused data exposure incidents — auditing tools like IAM Access Analyzer and Macie exist for this reason.
Comparison with Alternatives
| Feature | S3 | EBS | EFS |
| --- | --- | --- | --- |
| Storage type | Object | Block | File (NFS) |
| Access | HTTPS API, global | Attached to one EC2 instance (same AZ) | Mounted by many clients |
| Scale | Virtually unlimited | Up to 64 TiB per volume | Petabytes; elastic |
| Use case | Unstructured data, backups, data lakes | Boot volumes, databases | Shared file storage |
| Pricing | Storage + requests + transfer | Provisioned GB + IOPS + throughput | Pay-per-use per GB |
Compared with Google Cloud Storage and Azure Blob Storage, S3 offers a similar model but with the richest integration inside the AWS ecosystem and the most mature tooling.
Exam Relevance
S3 is tested on almost every AWS certification exam:
- Cloud Practitioner (CLF-C02) — definition, durability, bucket basics.
- Solutions Architect Associate (SAA-C03) — choosing the right storage class, lifecycle policies, encryption (SSE-S3 vs SSE-KMS), Cross-Region Replication for DR, static website hosting with CloudFront OAC, VPC Gateway Endpoints.
- Developer Associate (DVA-C02) — pre-signed URLs, multipart upload, S3 event notifications triggering Lambda.
- SysOps Administrator (SOA-C02) — replication rules, Storage Lens, lifecycle-policy troubleshooting.
A high-frequency exam pattern: given a workload pattern (access frequency, retrieval tolerance, retention), pick the cheapest storage class that satisfies the SLA. Memorize minimum storage durations (30 days for Standard-IA/One Zone-IA, 90 days for Glacier Instant/Flexible Retrieval, 180 days for Deep Archive) — getting these wrong is a classic exam trap.
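That exam heuristic can be encoded directly, which makes the minimum-duration table easier to memorize. The selection function below is a study aid that mirrors the pattern (access frequency + retrieval tolerance + retention); its thresholds are my own illustrative choices, not an official AWS decision procedure:

```python
# Minimum storage durations (days) per class. Delete or transition earlier
# and you are still billed for the minimum: a classic exam trap.
MIN_DAYS = {
    "STANDARD": 0,
    "STANDARD_IA": 30,
    "ONEZONE_IA": 30,
    "GLACIER_IR": 90,
    "GLACIER_FLEXIBLE": 90,
    "DEEP_ARCHIVE": 180,
}

def pick_class(accesses_per_month, max_retrieval_hours, retention_days):
    """Study-aid heuristic: cheapest class meeting access, retrieval,
    and retention requirements. Thresholds are illustrative."""
    if accesses_per_month >= 1:
        return "STANDARD" if accesses_per_month > 2 else "STANDARD_IA"
    # Rarely accessed: try archive tiers, cheapest first.
    for cls, needed_hours in [("DEEP_ARCHIVE", 12), ("GLACIER_FLEXIBLE", 1),
                              ("GLACIER_IR", 0)]:
        if max_retrieval_hours >= needed_hours and retention_days >= MIN_DAYS[cls]:
            return cls
    return "STANDARD_IA"

print(pick_class(0, 48, 365))   # cold data, tolerant retrieval, long retention
print(pick_class(10, 0, 30))    # hot data
```

Working a few scenarios through a table like this (what if retention is only 60 days? what if retrieval must be instant?) is good practice for the class-selection questions.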
Frequently Asked Questions
Q: How durable is Amazon S3?
A: S3 Standard, Standard-IA, Intelligent-Tiering, and the Glacier classes all offer 99.999999999% (11 nines) of durability across multiple Availability Zones. One Zone-IA and Express One Zone offer the same 11 nines but only within a single AZ, so their effective durability is lower if you lose an entire AZ.
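The 11-nines figure translates into expected annual loss with simple arithmetic, and the article's earlier claim (one lost object every 10,000 years if you store 10 million objects) checks out:

```python
durability = 0.99999999999          # 11 nines: annual per-object durability
annual_loss_prob = 1 - durability   # ~1e-11 chance of losing a given object
objects = 10_000_000

# Expected losses per year, and its reciprocal: years per single lost object.
expected_losses_per_year = objects * annual_loss_prob   # ~1e-4
years_per_lost_object = 1 / expected_losses_per_year
print(round(years_per_lost_object))  # → 10000
```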
Q: When should I use S3 instead of EBS?
A: Use S3 when your data is unstructured (files, images, backups, logs) and needs to be accessed through an API or by many clients. Use EBS when you need a filesystem mounted to a single EC2 instance — for example, a boot volume, a MySQL data directory, or anything that requires POSIX semantics.
Q: What is the difference between S3 Standard-IA and S3 One Zone-IA?
A: Both have the same retrieval fees and minimum 30-day storage, but Standard-IA replicates across at least three Availability Zones, while One Zone-IA stores data in a single AZ. One Zone-IA is about 20% cheaper but unsuitable for data you cannot easily recreate — if the AZ has a prolonged outage, that data is unavailable (and if the AZ is physically destroyed, the data is lost).
This article reflects AWS features and pricing as of 2026. AWS services evolve rapidly — always verify against the official Amazon S3 documentation before making production decisions.