Amazon Forecast: What It Is and When to Use It
Note: As of July 29, 2024, Amazon Forecast is no longer available to new customers. Existing customers can continue to use the service, but AWS is encouraging users to transition to Amazon SageMaker Canvas for time-series forecasting, which offers faster model building and more cost-effective predictions.
Definition
Amazon Forecast is a fully managed time-series forecasting service that uses machine learning (ML) to deliver highly accurate predictions. It is designed for developers and data scientists who need to predict business metrics like product demand, inventory levels, web traffic, or resource requirements without requiring deep ML expertise.
How It Works
Amazon Forecast automates the complex workflow of creating time-series forecasts. The process involves three main stages:
- Import Datasets: You begin by providing your historical data, which is uploaded to Amazon S3. Forecast can work with three types of datasets, which are organized into a Dataset Group:
  - Target Time Series (TTS): This is the only mandatory dataset. It contains the historical values of the metric you want to predict (e.g., daily sales), along with a timestamp and an item identifier.
  - Related Time Series (RTS): This optional dataset includes time-varying data that you believe influences the target metric, such as pricing, promotions, or weather data.
  - Item Metadata: This optional, static dataset contains features related to your items, like color, category, or brand. This data is crucial for "cold start" scenarios: predicting demand for new items with no historical data.
- Train a Predictor: A predictor is the custom forecasting model that Amazon Forecast trains on your data. During training, Forecast also backtests the model and reports accuracy metrics (such as WAPE and RMSE) that help you evaluate its performance. You have two main options for training:
  - AutoML: This is the recommended approach for most users. Forecast automatically inspects your data, performs feature engineering, and tests multiple algorithms (from traditional statistical models like ARIMA and ETS to deep learning models like DeepAR+ and CNN-QR) to select the one that produces the most accurate forecasts for your data.
  - Manual Algorithm Selection: If you have specific requirements, you can manually choose an algorithm like DeepAR+, Prophet, or ARIMA.
- Generate Forecasts: Once the predictor is trained and has an Active status, you can use it to generate predictions for a specified future time horizon. You can generate forecasts for all items in your dataset or a specific subset. The output can be a point forecast or probabilistic forecasts at different quantiles (e.g., p10, p50, p90), which provide a range of likely outcomes. Forecasts can be queried via an API or exported to an S3 bucket in CSV or Parquet format.
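The backtesting metrics named in the training stage can be made concrete with a small calculation. The sketch below computes WAPE and RMSE in plain Python using their standard definitions; the function names and the sample numbers are hypothetical, and it is meant to show what the metrics measure, not to reproduce how Forecast computes them internally.

```python
import math


def wape(actuals, forecasts):
    """Weighted absolute percentage error: sum(|a - f|) / sum(|a|)."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    total_actual = sum(abs(a) for a in actuals)
    if total_actual == 0:
        raise ValueError("WAPE is undefined when every actual value is zero")
    total_error = sum(abs(a - f) for a, f in zip(actuals, forecasts))
    return total_error / total_actual


def rmse(actuals, forecasts):
    """Root mean squared error: sqrt(mean((a - f)^2))."""
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(a - f) ** 2 for a, f in zip(actuals, forecasts)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical held-out window: four days of actual sales vs. a p50 forecast.
actual_sales = [100, 120, 80, 100]
p50_forecast = [90, 130, 80, 110]

print(f"WAPE: {wape(actual_sales, p50_forecast):.3f}")
print(f"RMSE: {rmse(actual_sales, p50_forecast):.3f}")
```

WAPE normalizes the total error by total demand, so it is easier to compare across items with different sales volumes, while RMSE penalizes large individual misses more heavily.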
Key Features and Limits
- Automated Machine Learning (AutoML): Automatically selects the best algorithm and performs feature engineering, simplifying the model creation process.
- State-of-the-Art Algorithms: Offers established statistical algorithms (ARIMA, ETS, Prophet) alongside deep learning algorithms developed at Amazon (DeepAR+, CNN-QR).
- Forecast Explainability: Provides insights into how factors like price, holidays, or weather influence your forecast values through impact scores.
- Weather Index & Holidays: Can automatically incorporate historical weather data and built-in holiday calendars for many countries to improve accuracy.
- Missing Value Support: Offers various automated filling methods (e.g., zero, mean, median) to handle missing data points in your time series.
- Cold-Start Forecasting: Can generate forecasts for new items with no historical data by leveraging item metadata to find similar products.
- Service Quotas (Limits): AWS accounts have default quotas that can often be increased upon request. Key quotas include the maximum number of datasets per dataset group and the maximum number of predictors and forecasts. Always check the official documentation for the most current limits.
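To make the missing-value filling methods mentioned above concrete, here is a minimal sketch of zero, mean, and median filling on a single series. The `fill_missing` helper and the sample data are hypothetical; Forecast applies its own filling logic when it ingests your data, and this only illustrates what each method does to the values.

```python
from statistics import mean, median


def fill_missing(values, method="zero"):
    """Replace None entries using a simple fill strategy.

    Illustrative only: this is not Forecast's implementation.
    """
    observed = [v for v in values if v is not None]
    if method == "zero":
        fill = 0.0
    elif method in ("mean", "median"):
        if not observed:
            raise ValueError("cannot compute a fill value from an empty series")
        fill = mean(observed) if method == "mean" else median(observed)
    else:
        raise ValueError(f"unknown fill method: {method!r}")
    return [fill if v is None else v for v in values]


# Hypothetical daily sales with two gaps.
daily_sales = [10, None, 14, None, 30]

for method in ("zero", "mean", "median"):
    print(method, fill_missing(daily_sales, method))
```

The choice matters: zero filling treats a gap as "no sales", which suits items that were available but did not sell, while mean or median filling is a closer fit when the gap reflects missing records.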
Common Use Cases
- Retail and Inventory Planning: Predict product demand to optimize stock levels, reduce waste, and improve in-stock availability.
- Supply Chain and Logistics: Forecast the need for raw materials and finished goods to streamline manufacturing and logistics operations.
- Resource and Capacity Planning: Predict requirements for staffing, energy consumption, or cloud server capacity to optimize operational efficiency.
- Financial Planning: Forecast key business metrics like revenue, cash flow, and expenses for better financial management.
- Web Traffic Forecasting: Predict website traffic to ensure infrastructure can handle demand and to plan for marketing campaigns.
Pricing Model
Amazon Forecast has a pay-as-you-go pricing model with no upfront commitments. Costs are incurred across four dimensions:
- Data Ingestion: You are charged per gigabyte (GB) of data imported into Forecast.
- Model Training: Billing is based on the number of hours required to train a predictor.
- Forecast Generation: You pay per 1,000 forecasted data points generated.
- Forecast Explainability: There is a separate charge for generating explainability reports, based on the number of explained data points.
A limited free tier is available for the first two months for new AWS customers, which includes a monthly allowance for data storage, training hours, and forecast generation. For detailed and current pricing, always refer to the official Amazon Forecast pricing page and use the AWS Pricing Calculator.
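The four billing dimensions above combine into a simple sum. The sketch below shows the arithmetic only: every rate in `PLACEHOLDER_RATES` is a made-up number, not an AWS price, and billing explainability per 1,000 explained data points is an assumption chosen for symmetry with forecast generation. Substitute the values from the official pricing page before relying on any result.

```python
# Placeholder rates in USD. These are NOT real AWS prices.
PLACEHOLDER_RATES = {
    "per_gb_ingested": 0.10,
    "per_training_hour": 0.25,
    "per_1000_forecast_points": 0.50,
    "per_1000_explained_points": 0.40,  # assumed unit, see note above
}


def estimate_cost(data_gb, training_hours, forecast_points,
                  explained_points, rates):
    """Return a per-dimension cost breakdown plus the total."""
    breakdown = {
        "ingestion": data_gb * rates["per_gb_ingested"],
        "training": training_hours * rates["per_training_hour"],
        "forecasts": forecast_points / 1000 * rates["per_1000_forecast_points"],
        "explainability": (explained_points / 1000
                           * rates["per_1000_explained_points"]),
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


# Hypothetical monthly workload.
estimate = estimate_cost(
    data_gb=5,
    training_hours=10,
    forecast_points=20_000,
    explained_points=1_000,
    rates=PLACEHOLDER_RATES,
)
for dimension, cost in estimate.items():
    print(f"{dimension:>15}: ${cost:.2f}")
```

Keeping the breakdown per dimension makes it easier to see which part of a workload, such as frequent retraining or a large number of forecasted points, drives the bill.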
Pros and Cons
Pros:
- Fully Managed Service: Eliminates the need to manage underlying infrastructure, data pipelines, or ML model deployment.
- High Accuracy: Leverages advanced ML algorithms used by Amazon.com, often resulting in more accurate forecasts than traditional methods.
- Ease of Use: Does not require deep machine learning expertise; the AutoML feature automates much of the complex work.
- Scalability: Capable of generating forecasts for millions of items.
- Integration: Integrates well with other AWS services like S3, IAM, and CloudWatch.
Cons:
- No Longer Available for New Customers: As of July 2024, new users are directed to Amazon SageMaker Canvas.
- Cost: For very large datasets and frequent retraining, costs can accumulate across data ingestion, training, and forecast generation.
- Data Preparation: While Forecast handles missing values, the quality of the forecast is highly dependent on the quality and formatting of the input data, which can require significant upfront effort.
- Training Time: Training a predictor, especially with AutoML on large datasets, can take several hours.
- Limited Customization: As a managed service, it offers less control over model architecture and hyperparameters compared to building a custom model in Amazon SageMaker.
Comparison with Alternatives
Amazon Forecast vs. Amazon SageMaker:
- Target Audience: Amazon Forecast is designed for users who want a fully automated forecasting solution without managing the ML lifecycle. Amazon SageMaker is a comprehensive platform for data scientists and ML engineers who need full control to build, train, and deploy custom models.
- Control vs. Automation: Forecast prioritizes automation and ease of use. SageMaker provides granular control over every step, from data preprocessing and algorithm selection (including its built-in DeepAR algorithm) to hyperparameter tuning and endpoint configuration.
- Effort: Getting started with Forecast is significantly faster. Building a comparable solution in SageMaker requires more coding, infrastructure knowledge, and ML expertise.
- Recommended Path: AWS now recommends Amazon SageMaker Canvas (a visual, no-code/low-code part of SageMaker) as the primary tool for time-series forecasting, citing faster model building and more cost-effective predictions.
Exam Relevance
Amazon Forecast is a relevant topic for the AWS Certified Machine Learning - Specialty (MLS-C01) exam. Candidates should understand:
- When to use Forecast: Know its primary use cases (e.g., demand, resource, inventory forecasting) and when it is a better choice than building a custom model in SageMaker.
- Core Concepts: Be familiar with the key components like Target Time Series, Related Time Series, Item Metadata, Predictors, and Forecasts.
- Data Requirements: Understand the importance of data formatting and how features like related data and metadata can improve model accuracy.
- Key Features: Know the purpose of AutoML, cold-start forecasting, and forecast explainability.
Note: The MLS-C01 certification is scheduled to be retired after March 31, 2026.
Frequently Asked Questions
Q: What is the 'cold start' problem and how does Amazon Forecast handle it?
A: The 'cold start' problem refers to the challenge of forecasting demand for new items that have no historical sales data. Amazon Forecast addresses this by using an optional 'Item Metadata' dataset. This dataset contains attributes about the items (e.g., category, color, brand). The ML model learns the relationships between these attributes and the demand for existing products, and then uses that understanding to generate a forecast for a new, similar item.
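The intuition behind this answer can be shown with a toy example. The sketch below estimates demand for a new item by averaging the historical demand of the existing items that share the most metadata attributes with it. This is a deliberately simplified stand-in, not the method Forecast uses: the service's models learn these relationships during training. All names and data here are hypothetical.

```python
from statistics import mean


def similarity(new_meta, existing_meta):
    """Count the metadata attributes on which two items agree."""
    return sum(
        1 for key, value in new_meta.items()
        if existing_meta.get(key) == value
    )


def cold_start_estimate(new_meta, catalog, history):
    """Average the mean demand of the most similar existing items."""
    scores = {item: similarity(new_meta, meta) for item, meta in catalog.items()}
    best = max(scores.values())
    if best == 0:
        raise ValueError("no existing item shares any metadata attribute")
    neighbours = [item for item, score in scores.items() if score == best]
    return mean(mean(history[item]) for item in neighbours)


# Hypothetical item metadata and demand history.
catalog = {
    "shirt_red": {"category": "shirt", "color": "red", "brand": "acme"},
    "shirt_blue": {"category": "shirt", "color": "blue", "brand": "acme"},
    "mug_red": {"category": "mug", "color": "red", "brand": "other"},
}
history = {
    "shirt_red": [10, 12, 14],
    "shirt_blue": [20, 22, 24],
    "mug_red": [100, 100, 100],
}

new_item = {"category": "shirt", "color": "green", "brand": "acme"}
print(cold_start_estimate(new_item, catalog, history))
```

The new green shirt matches both existing shirts on category and brand, so its estimate is drawn from them and not from the unrelated mug, which is the same idea the item metadata dataset enables at scale.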
Q: What is the difference between Target Time Series and Related Time Series data?
A: The Target Time Series is the primary, mandatory dataset containing the historical values of the metric you want to predict (e.g., sales_count). The Related Time Series is an optional dataset that contains other time-varying data that might influence your target metric, such as price, promotions, or weather_temperature. Providing relevant related time series data can significantly improve the accuracy of your forecasts.
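As a rough illustration of how the two datasets relate, the sketch below writes a tiny target series and a tiny related series as CSV text and checks that every target row has a matching related row. The column order, the `yyyy-MM-dd` timestamps, and the helper names are assumptions for this example; in practice the columns must match the schema you declare for each dataset.

```python
import csv
import io

# Target Time Series: item, timestamp, and the metric to predict (sales_count).
tts_rows = [
    ("sku_1", "2024-01-01", 12),
    ("sku_1", "2024-01-02", 15),
]

# Related Time Series: same keys, plus a time-varying feature (price).
rts_rows = [
    ("sku_1", "2024-01-01", 9.99),
]


def to_csv(rows):
    """Serialize rows as CSV text, one record per line."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def missing_related_keys(target_rows, related_rows):
    """Return (item_id, timestamp) pairs in the target series with no related row."""
    related_keys = {(item, ts) for item, ts, _ in related_rows}
    return [(item, ts) for item, ts, _ in target_rows
            if (item, ts) not in related_keys]


tts_csv = to_csv(tts_rows)
rts_csv = to_csv(rts_rows)
print(tts_csv)
print(rts_csv)
print("Target rows without related data:", missing_related_keys(tts_rows, rts_rows))
```

Both datasets are keyed by item identifier and timestamp, so a check like `missing_related_keys` is a cheap way to spot gaps in the related data before importing it.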
Q: Can I choose which algorithm Amazon Forecast uses?
A: Yes. While the default and recommended option is AutoML, where Forecast automatically selects the best algorithm for your data, you can also manually select a specific algorithm. Amazon Forecast provides a range of options, including traditional statistical models like ARIMA and Prophet, as well as advanced deep learning algorithms like DeepAR+ and CNN-QR.
This article reflects AWS features and pricing as of 2026. AWS services evolve rapidly — always verify against the official AWS documentation before making production decisions.