Amazon CloudWatch: What It Is and When to Use It

Definition

Amazon CloudWatch is AWS's unified monitoring and observability service. It collects metrics, logs, traces (via AWS X-Ray), and events from AWS services, your applications, and on-premises workloads, and exposes them through alarms, dashboards, and query tools. Metrics are on by default for most AWS services: EC2, RDS, Lambda, ECS, EKS, S3, and many more publish them automatically. Logs are automatic for some services (Lambda, for example) but need an agent or explicit configuration for others (EC2 instances, for example).

How It Works

CloudWatch is split into several related products that share a common data plane:

CloudWatch Metrics — numeric time-series data with dimensions (e.g., CPUUtilization by InstanceId). Standard resolution is 1-minute; high-resolution metrics go down to 1-second.
CloudWatch Alarms — fire actions when a metric breaches a threshold for M out of the last N datapoints. Actions include SNS notifications, EC2 actions (stop, terminate, reboot, recover), EC2 Auto Scaling, Systems Manager OpsCenter OpsItems, and Lambda.
CloudWatch Logs — log groups (one per source) contain log streams (one per log producer). Logs are retained for a configurable period, encrypted, and searchable with Logs Insights.
CloudWatch Logs Insights — a purpose-built query language for fast log analysis across log groups.
CloudWatch Dashboards — customizable widgets that visualize metrics and log queries.
CloudWatch Events / Amazon EventBridge — originally "CloudWatch Events," this evolved into the standalone EventBridge service, which remains deeply linked with CloudWatch.
CloudWatch Synthetics — scripted "canaries" that check endpoints and UI flows from AWS-hosted browsers.
CloudWatch RUM (Real User Monitoring) — browser-side performance and error telemetry.
Container Insights — detailed metrics and logs for ECS, EKS, and self-managed Kubernetes.
Lambda Insights — memory, CPU, and network metrics per Lambda function.
Application Signals — APM-style service maps and SLO tracking (newer feature).
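The alarm behaviour described in the list above (a metric breaching a threshold for some number of datapoints) can be sketched in a few lines of Python. This is only an illustration of the "M out of N" idea for a greater-than-threshold comparison. The function name and parameters are hypothetical, and real CloudWatch alarms also handle missing data, other comparison operators, and statistics that this sketch ignores.

```python
from typing import Sequence


def alarm_state(
    datapoints: Sequence[float],
    threshold: float,
    datapoints_to_alarm: int,
    evaluation_periods: int,
) -> str:
    """Simplified 'M out of N' evaluation for a greater-than-threshold alarm.

    datapoints          -- metric values, oldest first
    datapoints_to_alarm -- M: how many breaching datapoints trigger the alarm
    evaluation_periods  -- N: how many of the most recent datapoints to inspect
    """
    if len(datapoints) < evaluation_periods:
        # Assumption of this sketch: too little data means no decision yet.
        return "INSUFFICIENT_DATA"
    window = datapoints[-evaluation_periods:]
    breaching = sum(1 for value in window if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


# Made-up CPUUtilization samples, one per period.
cpu = [42.0, 91.5, 88.0, 95.2, 60.1]
print(alarm_state(cpu, threshold=80.0, datapoints_to_alarm=3, evaluation_periods=5))
```

Requiring several breaching datapoints rather than one is a common way to keep a single spike from paging anyone.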

Key Features and Limits

Default metrics — most AWS services publish baseline metrics for free, many of them at 1-minute resolution. EC2 instances publish at 5-minute resolution by default (basic monitoring), or at 1-minute resolution if "detailed monitoring" is enabled (paid).
Custom metrics — put your application metrics into CloudWatch with the PutMetricData API.
High-resolution metrics — down to 1-second granularity; useful for spiky workloads and sub-minute alarms.
Log retention — configurable from 1 day to 10 years, or never expire (the default, and a common cost leak).
Log subscriptions — stream logs to Kinesis Data Streams / Firehose / Lambda for real-time processing.
Metric filters — extract numeric metrics from log text.
Composite alarms — combine multiple alarms with AND/OR/NOT to reduce noise.
Anomaly detection alarms — ML-based dynamic thresholds instead of fixed values.
CloudWatch Agent — unified agent for EC2 and on-premises servers to publish system-level metrics and logs.
Embedded Metric Format (EMF) — structured logs that CloudWatch automatically turns into metrics, avoiding extra PutMetricData calls.
Integrations — native integration with AWS Config, AWS Organizations, Security Hub, GuardDuty, AWS Backup, AWS Chatbot (for Slack/Teams), and X-Ray.
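As a rough illustration of the custom and high-resolution metrics in the list above, the sketch below builds the request parameters for a PutMetricData call without sending anything. The namespace, metric name, and dimension values are invented for the example, and the helper function is hypothetical. The field names follow the PutMetricData request shape, where a StorageResolution of 1 marks a high-resolution metric and 60 a standard one.

```python
from datetime import datetime, timezone
from typing import Dict


def build_put_metric_data(
    namespace: str,
    name: str,
    value: float,
    unit: str,
    dimensions: Dict[str, str],
    high_resolution: bool = False,
) -> dict:
    """Build keyword arguments for a single-datapoint PutMetricData request."""
    return {
        "Namespace": namespace,
        "MetricData": [
            {
                "MetricName": name,
                # Each distinct dimension combination is a separate metric.
                "Dimensions": [
                    {"Name": key, "Value": val}
                    for key, val in sorted(dimensions.items())
                ],
                "Timestamp": datetime.now(timezone.utc),
                "Value": float(value),
                "Unit": unit,
                "StorageResolution": 1 if high_resolution else 60,
            }
        ],
    }


# Hypothetical application metric.
params = build_put_metric_data(
    namespace="MyApp/Checkout",
    name="OrderLatency",
    value=182.4,
    unit="Milliseconds",
    dimensions={"Service": "checkout", "Stage": "prod"},
    high_resolution=True,
)
print(params["MetricData"][0]["Dimensions"])

# With boto3 installed and credentials configured, the request could be sent as:
#   boto3.client("cloudwatch").put_metric_data(**params)
```

Keeping dimension values low-cardinality (service and stage rather than user or request IDs) matters because every unique combination is a separately billed metric.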

Common Use Cases

Infrastructure monitoring — CPU, memory, disk, network across the fleet (on EC2, memory and disk-usage metrics require the CloudWatch Agent).
Application performance monitoring — custom metrics + logs + traces tied to requests.
Alerting — CloudWatch Alarms → SNS → email / SMS / PagerDuty.
Auto scaling — CloudWatch metrics drive EC2 Auto Scaling Groups, ECS Service Auto Scaling, DynamoDB auto-scaling, Aurora Auto Scaling.
Log analytics — Logs Insights queries across VPC Flow Logs, CloudFront, Lambda, and application logs.
Synthetic uptime and journey testing — Synthetics canaries verify endpoints and login flows.
Container observability — Container Insights for ECS / EKS metrics and logs per container.
Compliance — CloudWatch Logs is the typical destination for CloudTrail, VPC Flow Logs, and audit trails.
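To make the log analytics use case above more concrete, here is a small sketch of a Logs Insights query that counts error lines in 5-minute buckets, wrapped in the parameters a StartQuery request expects. The log group name and the helper function are hypothetical, and the `/ERROR/` pattern assumes the application writes that word into its log lines.

```python
import time
from typing import Optional

# Logs Insights query: count lines containing ERROR per 5-minute bucket.
QUERY = """
fields @timestamp, @message
| filter @message like /ERROR/
| stats count(*) as errors by bin(5m)
| sort errors desc
""".strip()


def build_start_query(log_group: str, minutes_back: int, now: Optional[float] = None) -> dict:
    """Build keyword arguments for a Logs Insights StartQuery request."""
    end = int(now if now is not None else time.time())
    return {
        "logGroupName": log_group,
        "startTime": end - minutes_back * 60,  # epoch seconds
        "endTime": end,
        "queryString": QUERY,
        "limit": 100,
    }


params = build_start_query("/hypothetical/app/production", minutes_back=60)
print(params["queryString"])

# With boto3 installed and credentials configured, the query could be started as:
#   boto3.client("logs").start_query(**params)
```

Because queries are billed per GB scanned, a narrow time range and a single log group keep both latency and cost down.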

Pricing Model

CloudWatch bills each type of data separately. The most common bill lines:

Metrics — per custom metric-month. First 10 metrics are free. Detailed monitoring on EC2 is paid.
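Alarms — per alarm-month, with different rates for standard, high-resolution, and composite alarms. A composite alarm is billed as one alarm, on top of the alarms it combines.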
Logs ingestion — per GB of logs ingested, with a separate (cheaper) rate for Logs Infrequent Access tier.
Logs storage — per GB-month of stored log data.
Logs Insights queries — per GB of data scanned.
Dashboards — first 3 dashboards free; after that, a monthly fee per dashboard.
Synthetics — per canary run.
RUM — per event.
Container Insights / Application Signals / Contributor Insights — separate per-resource or per-event charges.

The AWS Free Tier includes 10 metrics, 10 alarms, 5 GB of log ingestion, 3 dashboards, and 1,000,000 API requests per month.

Common cost leaks: never-expiring log retention, high-cardinality custom metrics, and dashboards that are no longer used.
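The three log-related bill lines above (ingestion, storage, and queries) add up in a simple way, sketched below. The rates are placeholders chosen for easy arithmetic, not actual AWS prices, and the free ingestion allowance is taken from the Free Tier figure quoted above. Check the current pricing page for your Region before using numbers like these.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRates:
    """Per-unit prices in USD. Values used below are placeholders."""
    ingest_per_gb: float
    storage_per_gb_month: float
    query_per_gb_scanned: float


def monthly_log_cost(
    rates: LogRates,
    ingested_gb: float,
    stored_gb: float,
    scanned_gb: float,
    free_ingest_gb: float = 5.0,
) -> float:
    """Estimate a month of CloudWatch Logs charges from three usage figures."""
    billable_ingest = max(0.0, ingested_gb - free_ingest_gb)
    total = (
        billable_ingest * rates.ingest_per_gb
        + stored_gb * rates.storage_per_gb_month
        + scanned_gb * rates.query_per_gb_scanned
    )
    return round(total, 2)


# Placeholder rates, NOT real AWS prices.
PLACEHOLDER_RATES = LogRates(
    ingest_per_gb=0.50,
    storage_per_gb_month=0.03,
    query_per_gb_scanned=0.005,
)

estimate = monthly_log_cost(PLACEHOLDER_RATES, ingested_gb=205, stored_gb=1000, scanned_gb=400)
print(estimate)  # 132.0 with the placeholder rates above
```

With these made-up numbers, ingestion accounts for 100 of the 132 dollars, which is why turning down verbose logging often saves more than trimming retention.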

Pros and Cons

Pros

Zero setup for the baseline: every AWS service publishes metrics automatically.
Unified across compute, networking, storage, databases, containers.
Rich alarm and composite-alarm capabilities.
Logs Insights is fast and uses a simple purpose-built DSL.
Direct integration with EventBridge, Lambda, SNS, and Auto Scaling.

Cons

Log ingest and long retention can become the dominant AWS bill line if unconfigured.
Per-metric cost scales with cardinality — expensive if you create metrics per user or per request.
Dashboards are functional but less polished than Grafana / Datadog.
Cross-Region dashboards and alarms require extra configuration.
No full APM distributed tracing — that's X-Ray's job.

Comparison with Alternatives

| | CloudWatch | Managed Grafana / Prometheus | Datadog / New Relic |
| --- | --- | --- | --- |
| Source | Native on AWS services | Grafana + Prometheus + AMP | SaaS agents |
| Logs | Yes (Logs) | Loki / CloudWatch | Yes |
| Traces | X-Ray integration | Tempo | Yes (APM) |
| Dashboards | Built-in | Grafana (very polished) | Native (very polished) |
| Cost at scale | AWS bill line | Lower for high-cardinality, but operational overhead | Higher; most polished UX |
| Best for | Default observability on AWS | Teams wanting open-source stack | Polished SaaS APM across clouds |

Exam Relevance

Cloud Practitioner (CLF-C02) — know CloudWatch is the monitoring service and that it provides metrics, logs, and alarms.
Solutions Architect Associate (SAA-C03) — metric-driven Auto Scaling, CloudWatch Alarms → SNS → email/SMS, Logs as audit trail, integration with Lambda.
Developer Associate (DVA-C02) — custom metrics via PutMetricData, CloudWatch Agent on EC2, Embedded Metric Format, Lambda Insights, structured JSON logging.
SysOps Administrator (SOA-C02) — heavy coverage: agent configuration, log retention, composite alarms, Synthetics canaries, Contributor Insights, CloudWatch cost optimization.
DevOps Professional (DOP-C02) — SLO monitoring with Application Signals, automated remediation via EventBridge + Lambda, deployment canaries.

Frequently Asked Questions

Q: What is the difference between CloudWatch and CloudTrail?

A: CloudWatch monitors operational data ("what is happening / how is it performing?"): metrics, logs, alarms, traces. CloudTrail is an audit log ("who did what, when, from where?"): AWS API calls are recorded as CloudTrail events, with management events captured by default and data events captured only when you enable them. The two are complementary and often used together: a trail can be configured to deliver its events to CloudWatch Logs, where metric filters and alarms then generate alerts on suspicious patterns such as a spike in failed ConsoleLogin events.
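The alerting pattern in this answer, watching CloudTrail events for failed console sign-ins, can be illustrated with plain Python. In CloudWatch Logs the same check would be expressed as a metric filter plus an alarm; the code below only mirrors that logic locally. The sample events are invented and heavily trimmed, and the sketch assumes a failed sign-in appears as a `ConsoleLogin` event whose `responseElements.ConsoleLogin` value is `"Failure"`.

```python
from collections import Counter
from typing import Iterable


def is_failed_console_login(event: dict) -> bool:
    """True for a ConsoleLogin event whose response reports a failure."""
    response = event.get("responseElements") or {}
    return (
        event.get("eventName") == "ConsoleLogin"
        and response.get("ConsoleLogin") == "Failure"
    )


def failed_logins_by_ip(events: Iterable[dict]) -> Counter:
    """Count failed console sign-ins per source IP address."""
    return Counter(
        event.get("sourceIPAddress", "unknown")
        for event in events
        if is_failed_console_login(event)
    )


# Invented, trimmed-down events using documentation IP ranges.
SAMPLE_EVENTS = [
    {"eventName": "ConsoleLogin", "sourceIPAddress": "203.0.113.7",
     "responseElements": {"ConsoleLogin": "Failure"}},
    {"eventName": "ConsoleLogin", "sourceIPAddress": "203.0.113.7",
     "responseElements": {"ConsoleLogin": "Failure"}},
    {"eventName": "ConsoleLogin", "sourceIPAddress": "198.51.100.4",
     "responseElements": {"ConsoleLogin": "Success"}},
    {"eventName": "DescribeInstances", "sourceIPAddress": "198.51.100.4",
     "responseElements": None},
]

counts = failed_logins_by_ip(SAMPLE_EVENTS)
print(counts.most_common())
```

An alarm on the resulting count, for example three or more failures in five minutes, is the kind of threshold a security team might start from and then tune.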

Q: How much does CloudWatch Logs cost, and how do I keep it in check?

A: CloudWatch Logs bills you three ways: ingestion (per GB ingested), storage (per GB-month), and Logs Insights queries (per GB scanned). Common cost-optimization tactics: set explicit retention periods on every log group (30 / 90 / 365 days depending on compliance needs), turn down verbose log levels in production, use the Logs Infrequent Access tier for logs that rarely need to be queried, sample high-volume logs (e.g., keep 1% of successful requests), and export older logs to S3, where storage is considerably cheaper, especially in archival storage classes.
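The sampling tactic mentioned in this answer (keep every error, but only a small share of successful requests) could look roughly like the sketch below. The function name, the request-ID format, and the choice of CRC32 as the bucketing hash are all assumptions made for illustration. Any stable hash would do.

```python
import zlib


def keep_log(request_id: str, status_code: int, success_sample_percent: int = 1) -> bool:
    """Decide whether a request's log line should be emitted.

    Errors (status >= 400) are always kept. Successful requests are kept
    only if their ID hashes into the first `success_sample_percent` of
    100 buckets, so the decision is stable for a given request ID.
    """
    if status_code >= 400:
        return True
    bucket = zlib.crc32(request_id.encode("utf-8")) % 100
    return bucket < success_sample_percent


# Hypothetical request IDs.
requests = [(f"req-{i}", 200) for i in range(10_000)] + [("req-error", 500)]
kept = [rid for rid, status in requests if keep_log(rid, status)]
print(f"kept {len(kept)} of {len(requests)} log lines")
```

Hashing the request ID, rather than drawing a random number, means every log line belonging to the same request gets the same keep-or-drop decision.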

Q: When should I use Metric Filters vs Embedded Metric Format (EMF)?

A: Metric Filters match patterns in log events that are already in CloudWatch Logs and turn them into CloudWatch metrics, which is useful when you don't control the log format (e.g., logs written by third-party software). Embedded Metric Format (EMF) is a structured JSON format you emit from your application that CloudWatch automatically parses into metrics without separate PutMetricData calls. It is the modern, usually cheaper path because it avoids PutMetricData request charges (the resulting custom metrics are still billed) and keeps metrics and logs aligned. For new applications, prefer EMF.
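For a sense of what an EMF record looks like, the sketch below assembles one by hand with the standard library. The namespace, metric, and dimension names are made up, and the helper function is hypothetical. In practice an EMF client library would normally build this for you. The `_aws` metadata block declares which top-level keys are metrics and which are dimensions.

```python
import json
import time
from typing import Dict, Optional


def emf_log_line(
    namespace: str,
    metric_name: str,
    value: float,
    unit: str,
    dimensions: Dict[str, str],
    timestamp_ms: Optional[int] = None,
) -> str:
    """Return one JSON log line in Embedded Metric Format."""
    record = {
        "_aws": {
            "Timestamp": timestamp_ms if timestamp_ms is not None else int(time.time() * 1000),
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    # One dimension set containing all supplied dimension keys.
                    "Dimensions": [sorted(dimensions)],
                    "Metrics": [{"Name": metric_name, "Unit": unit}],
                }
            ],
        },
        # Metric value and dimension values live at the top level.
        metric_name: value,
        **dimensions,
    }
    return json.dumps(record)


# Hypothetical metric emitted from an application.
line = emf_log_line(
    namespace="MyApp/Checkout",
    metric_name="OrderLatency",
    value=182.4,
    unit="Milliseconds",
    dimensions={"Service": "checkout"},
)
print(line)
```

In an environment where standard output already flows into CloudWatch Logs, such as a Lambda function, printing this line is what produces the metric. Other fields can be added at the top level as searchable log context.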

This article reflects AWS features and pricing as of 2026. AWS services evolve rapidly — always verify against the official Amazon CloudWatch documentation before making production decisions.