aws-samples / amazon-forecast-samples

Notebooks and examples on how to onboard and use various features of Amazon Forecast.

License: MIT No Attribution

Jupyter Notebook 99.26% Python 0.74%

amazon-forecast-samples's Introduction

Amazon Forecast Samples

Workshops, Notebooks and examples on how to learn and use various features of Amazon Forecast

Announcements and New Service Features

Introduction and Best Practices

Please visit our growing library which serves as a guide for onboarding data and learning how to use Amazon Forecast.

MLOps: Run a proof of concept (PoC) and learn how to automate production workloads

MLOps Automation

The purpose of this guidance is to provide customers with a complete end-to-end workflow that serves as an example -- a model to follow. As delivered, the guidance creates forecasted data points from an open-source input dataset. The template can be used to create Amazon Forecast Dataset Groups, import data, train machine learning models, and produce forecasted data points on future, unseen time horizons from raw data. All of this is possible without having to write or compile code. Get Started Here

Notebooks

Here you will find examples of how to use the Amazon Forecast Python SDK to make API calls, with manual waits between API calls. The primary audience is developers, MLOps engineers, and integration partners who need to see how to put forecasts into production.
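The manual waits mentioned above can be sketched as a small polling helper. This is an illustrative sketch, not code from the notebooks: `describe` is a placeholder for any boto3 Forecast `describe_*` call that returns a `Status` field (datasets, import jobs, predictors, and forecasts all do).

```python
import time

def wait_for(describe, arn, delay=10, max_attempts=60):
    """Poll a Forecast describe_* call until the resource finishes creating.

    `describe` is any callable taking an ARN and returning a dict with a
    'Status' key, e.g.:
        lambda arn: forecast.describe_predictor(PredictorArn=arn)
    """
    status = None
    for _ in range(max_attempts):
        status = describe(arn)["Status"]
        if status in ("ACTIVE", "CREATE_FAILED"):
            return status
        time.sleep(delay)
    raise TimeoutError(f"{arn} still {status!r} after {max_attempts} polls")
```

The same loop works for each resource type because the corresponding `describe_*` responses all report a `Status` field.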

License Summary

This sample code is made available under a modified MIT license. See the LICENSE file.

amazon-forecast-samples's People

Contributors

akhilrazdan, annainspace, athewsey, aws-singhada, chrisking, christy, czlaugh, dehrlich, fayyadd, gunjaj, hyandell, jtbangani, ksaroja, ktonthat, lovvge, lv6520, marciomocellin, patpizio, patrick-239, peskypotato, pwrmiller, ricardosllm, rohitmenon83, rvippagunta, shimomut, tagekezo, trnwwz, wontonst, yatrie, yuzhoujianxia


amazon-forecast-samples's Issues

Adding holiday and manual campaigns to Forecast

Amazon Forecast provides the ability to use built-in holidays from the Jollyday library as a supplementary feature. Is there a way to incorporate holidays other than national holidays into our time-series model? There are also cases where campaigns are run for certain products; is there a way to add campaigns to the Forecast?
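There is no built-in mechanism I know of for registering custom holidays, but a common workaround is to encode campaigns or extra holidays as a 0/1 flag in the related time series you import. A pandas sketch, assuming made-up campaign dates, item ID, and column name:

```python
import pandas as pd

# Hypothetical campaign calendar: flag campaign days with 0/1 and merge the
# flag into the related time series that Forecast imports.
campaigns = pd.DataFrame({
    "timestamp": pd.to_datetime(["2020-11-27", "2020-11-28"]),
    "in_campaign": 1,
})
horizon = pd.DataFrame(
    {"timestamp": pd.date_range("2020-11-25", periods=7, freq="D")}
)
related = horizon.merge(campaigns, on="timestamp", how="left")
related = related.fillna({"in_campaign": 0})
related["item_id"] = "SKU123"
# related.to_csv("related.csv", index=False)  # import as RELATED_TIME_SERIES
```

The flag column must be numeric (here 0/1), matching Forecast's requirement that related time series attributes be integer or float.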

cannot override parameters for other algorithms

When using Prophet, I wanted to pass the parameter n_changepoints=25.
Following the page https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/forecast.html,
I tried

forecast.create_predictor(
    ...
    TrainingParameters={"n_changepoints": "25"}
    ...
)

I also tried {"n.changepoints": "25"}.
Neither worked.

Error message is:

InvalidInputException: An error occurred (InvalidInputException) when calling the CreatePredictor operation: Invalid Training parameter: n_changepoints specified for algorithmArn: arn:aws:forecast:::algorithm/Prophet 

Is overriding parameters a supported feature?
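For reference, TrainingParameters appears to be honored only for algorithms that document tunable hyperparameters; DeepAR+ does, while Prophet does not seem to, which would match the InvalidInputException above. A hedged sketch with DeepAR+ instead (parameter names from the DeepAR+ hyperparameter documentation; the values are only illustrative):

```python
# Hyperparameter overrides are passed as string-valued pairs. These names come
# from the DeepAR+ docs; the values here are examples, not recommendations.
training_parameters = {
    "epochs": "100",
    "context_length": "48",
    "learning_rate": "0.001",
}
# forecast.create_predictor(
#     ...,
#     AlgorithmArn="arn:aws:forecast:::algorithm/Deep_AR_Plus",
#     TrainingParameters=training_parameters,
#     ...,
# )
```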

AccessDeniedException when calling the CreateDataset operation

I have set up the AWS CLI, configured my AWS account, and added the AWS service models for forecast and forecastquery. I am getting the exception below while running:
Traceback (most recent call last):
  File "bike_example/create_dataset_and_forecast.py", line 208, in <module>
    execute()
  File "bike_example/create_dataset_and_forecast.py", line 59, in execute
    Schema=schema
  File "/home/poojsh/workspace/testpythonmodule/env/TestPoojshPythonModule-1.0/runtime/lib/python3.6/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/poojsh/workspace/testpythonmodule/env/TestPoojshPythonModule-1.0/runtime/lib/python3.6/site-packages/botocore/client.py", line 661, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (AccessDeniedException) when calling the CreateDataset operation:

Error on Create Data Import Job

When running the code to create a dataset import job, I got this error:

An error occurred (InvalidInputException) when calling the CreateDatasetImportJob operation: No objects found in s3 path s3://bucket-name/AKIAVG6HB65434355, please ensure the path is specified correctly.

Can anyone help me with this? Thank you.
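The S3 path in the error ends in what looks like an access key ID rather than an object key, which suggests the wrong variable was interpolated into the path. The DataSource should point at the actual CSV object; a sketch (bucket, key, and role ARN below are placeholders):

```python
# Placeholder bucket/key/role -- substitute your own. The Path must name the
# CSV object (or a prefix containing it), not a credential.
data_source = {
    "S3Config": {
        "Path": "s3://my-bucket/forecast/target_time_series.csv",
        "RoleArn": "arn:aws:iam::123456789012:role/ForecastS3AccessRole",
    }
}
# forecast.create_dataset_import_job(
#     DatasetImportJobName="my_import",
#     DatasetArn=dataset_arn,
#     DataSource=data_source,
#     TimestampFormat="yyyy-MM-dd HH:mm:ss",
# )
```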

Possible typo in 2_NMT.ipynb

Thanks for the great blog! It seems the following line should use topics[n], where n = {0, 1, 2, 3}, instead:
...

subdf = df[(df['Topic']==topic)&(df['PublishDate']>START_DATE)]
subdf = df[(df['Topic']==topics[1])&(df['PublishDate']>START_DATE)]

Complex Open Dataset

Identify a time-series oriented dataset with the following types of information:

  1. Target Time Series
  2. Related Time Series
  3. Item Metadata

These will be used in other notebooks to illustrate their impact and usage with various algorithms.

SAM Build is failing

(base) C:\Users\dsingha1\git_views\amazon-forecast-samples\ml_ops\visualization_blog>sam build && sam deploy --guided
Building function 'S3Lambda'
Running PythonPipBuilder:ResolveDependencies
Running PythonPipBuilder:CopySource
Building function 'CreateDataset'
Running PythonPipBuilder:ResolveDependencies
Running PythonPipBuilder:CopySource
Building function 'CreateDatasetGroup'
Running PythonPipBuilder:ResolveDependencies
Running PythonPipBuilder:CopySource
Building function 'CreateDatasetImportJob'
Running PythonPipBuilder:ResolveDependencies
Running PythonPipBuilder:CopySource
Building function 'CreatePredictor'
Running PythonPipBuilder:ResolveDependencies
Running PythonPipBuilder:CopySource
Building function 'CreateForecast'
Running PythonPipBuilder:ResolveDependencies
Running PythonPipBuilder:CopySource
Building function 'UpdateResources'
Running PythonPipBuilder:ResolveDependencies

Build Failed
Error: PythonPipBuilder:ResolveDependencies - list index out of range

(base) C:\Users\dsingha1\git_views\amazon-forecast-samples\ml_ops>python
Python 3.7.6 (default, Jan 8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32

(base) C:\Users\dsingha1\git_views\amazon-forecast-samples\ml_ops>sam --version
SAM CLI, version 1.1.0

text data for training

Hello, I wasn't sure where to ask this since it's more of a question than an issue. Would text-based data work with AWS Forecast?

For example, we have orders that have names and the size of the order can likely be predicted based on the name of the order. If we're trying to forecast the size of the orders, would Forecast be able to use the name field out-of-the-box as part of the forecasting algorithm?

I hope that makes sense, and if I'm asking in the wrong place please let me know where a more appropriate forum is.

Thanks in advance,
-bp
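Free text is not used as-is, but Forecast's item metadata dataset does accept string attributes that are treated as categories, which can capture some of this signal. A sketch of deriving a coarse category column from the order name (the data and the split-on-first-word heuristic are invented for illustration):

```python
import pandas as pd

# Item metadata attributes are categorical, not free text, so a name field
# works best after being reduced to a small set of category values.
orders = pd.DataFrame({
    "item_id": ["order_1", "order_2", "order_3"],
    "order_name": ["Bulk steel rods", "Retail sample kit", "Bulk copper wire"],
})
# Made-up heuristic: use the lowercased first word of the name as a category.
orders["order_category"] = orders["order_name"].str.split().str[0].str.lower()
metadata = orders[["item_id", "order_category"]]
# metadata.to_csv("item_metadata.csv", index=False)  # import as ITEM_METADATA
```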

Multiple Models Notebook

Create a notebook that builds multiple models given the same dataset:

  1. Prophet
  2. ETS
  3. DeepAR+

Showcase the impact of each model in terms of accuracy with the data provided.

export models / multivariate timeseries

Hi, thanks for this repository, has been quite helpful! I wonder if you could add examples / point me to good references for:

  1. export the trained models
  2. multivariate time series

basic/Tutorial dependencies

The pip command used in the first basic/Tutorial notebook is malformed. See image below.
image

I can submit a PR fixing this, however, I can take the opportunity to improve it a little.

  • Using --upgrade when installing multiple packages together can lead to long install times because of pip backtracking. I stopped it after ~30 minutes.
    • I would suggest taking out --upgrade and using version numbers instead. These are the latest version numbers, and I've tested them in my local environment and in SageMaker Notebooks too. (The specifiers are quoted so the shell doesn't treat >= as a redirect.)
    pip3 install "boto3>=1.18.42" "pandas>=1.3.3" "s3fs>=0.4.2" "matplotlib>=3.4.3" "ipywidgets>=7.6.5"
    
  • Also, it's not advisable to use !pip install <package> in Jupyter notebooks, as it might not install packages into the same execution environment used by the notebook kernel. For example, I had to use pip3 instead for this to work.
    • The code snippet below works around the different execution environments (along with version numbers):
    import sys
    !{sys.executable} -m pip install "boto3>=1.18.42" "pandas>=1.3.3" "s3fs>=0.4.2" "matplotlib>=3.4.3" "ipywidgets>=7.6.5"

I've tested it on my local environment, I'll run it on a SageMaker Notebook too to verify the changes are good. If you're happy with the changes, I can submit a PR. 🙂

WhatIf Analysis Notebook

Create a notebook that illustrates how to include metadata for related time series information to perform a what if analysis.

Validation error when calling forecast.create_predictor with CNN-QR algorithm_arn

When using the Compare_Multiple_Models notebook, I run into the below error when calling "forecast.create_predictor" from SageMaker, with the additional algorithm ("CNN-QR") in the "algo" array.

An error occurred (ValidationException) when calling the CreatePredictor operation: 1 validation error detected: Value ‘compare_multiple_models_CNN-QR_algo’ at ‘predictorName’ failed to satisfy constraint: Member must satisfy regular expression pattern: ^[a-zA-Z][a-zA-Z0-9_]*

The ARN looks correct based on the documentation, and trying different algorithm text produces a different error message. The regular expression pattern seems not to support the "-" character. The validation may need to change to ^[a-zA-Z][a-zA-Z0-9_-]* to support the CNN-QR algorithm ARN.

InvalidInputException: An error occurred (InvalidInputException) when calling the CreatePredictor operation: Invalid Algorithm ARN specified: arn:aws:forecast:::algorithm/CNN. Options: arn:aws:forecast:::algorithm/CNN-QR,arn:aws:forecast:::algorithm/Deep_AR_Plus,arn:aws:forecast:::algorithm/NPTS,arn:aws:forecast:::algorithm/ETS,arn:aws:forecast:::algorithm/ARIMA,arn:aws:forecast:::algorithm/Prophet
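Until the service-side validation changes, one workaround is to sanitize the algorithm label before embedding it in the predictor name. A small sketch (the helper name is invented):

```python
import re

def safe_predictor_name(name: str) -> str:
    """Map a label to a name matching ^[a-zA-Z][a-zA-Z0-9_]* (the constraint
    in the validation error) by replacing every invalid character with '_'."""
    name = re.sub(r"[^a-zA-Z0-9_]", "_", name)
    if not name[:1].isalpha():
        name = "p_" + name  # names must start with a letter
    return name

print(safe_predictor_name("compare_multiple_models_CNN-QR_algo"))
# -> compare_multiple_models_CNN_QR_algo
```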

UnknownOperationException on 'Create the Dataset Group'

I am having this issue while running the sample code.

create_dataset_group_response = forecast.create_dataset_group(DatasetGroupName=datasetGroupName, Domain="CUSTOM")

returns the following issue

ClientError Traceback (most recent call last)
<ipython-input-87-6043160b9c44> in <module>
      3 datasetGroupName= project +'_dsg'
      4 s3DataPath = "s3://"+bucket_name+"/"+key
----> 5 create_dataset_group_response = forecast.create_dataset_group(DatasetGroupName=datasetGroupName, Domain="CUSTOM")
      6 datasetGroupArn = create_dataset_group_response['DatasetGroupArn']

C:\ProgramData\Anaconda3\lib\site-packages\botocore\client.py in _api_call(self, *args, **kwargs)
    355                     "%s() only accepts keyword arguments." % py_operation_name)
    356             # The "self" in this scope is referring to the BaseClient.
--> 357             return self._make_api_call(operation_name, kwargs)
    358 
    359         _api_call.__name__ = str(py_operation_name)

C:\ProgramData\Anaconda3\lib\site-packages\botocore\client.py in _make_api_call(self, operation_name, api_params)
    659             error_code = parsed_response.get("Error", {}).get("Code")
    660             error_class = self.exceptions.from_code(error_code)
--> 661             raise error_class(parsed_response, operation_name)
    662         else:
    663             return parsed_response

ClientError: An error occurred (UnknownOperationException) when calling the CreateDatasetGroup operation: 

Any ideas of what could be the problem?

Thanks,

lambda handler: KeyError ['params'] in Create dataset step of AWS SNS

While trying to deploy my forecast project in the retail domain I got the following error in the 'create dataset' step function when I start executing it.

[ERROR] KeyError: 'params'
Traceback (most recent call last):
  File "/var/task/dataset.py", line 12, in lambda_handler
    datasets = event['params']['Datasets']

I have uploaded the training dataset into S3 and it looks like below:
image
image

As the training file is available in the specified S3 bucket mentioned in the stack, the event must have been created, yet I am not able to find the root cause of the error.
Also attaching my params.json as a txt file.
params.txt

Please help me out....

Error when trying to create object with same name as previously deleted object

I am getting CreateDatasetImportJob errors when trying to create ImportJobs (and in some cases Datasets) with the same name as previously deleted ImportJobs (and sometimes Datasets). The delete is done using the routine described in the notebook 4.Cleanup.

Steps:

  1. create a Dataset with name my_target_ds using create_dataset()
  2. create a ImportJob with name MY_TARGET_DS_IMPORT_JOB using create_dataset_import_job()
  3. delete the ImportJob using delete_dataset_import_job()
  4. delete the Dataset using delete_dataset()
  5. Step 1 again
  6. Step 2 again
Error: An error occurred (ResourceAlreadyExistsException) when calling the CreateDatasetImportJob operation: A dataset import job already exists with the arn: .......:dataset-import-job/MY_TARGET_DS_IMPORT_JOB

This error does not happen if I delete the Dataset and ImportJob manually in the console.
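A likely explanation: the delete_* API calls return while deletion is still in progress (the console waits for you), so recreating the same name too soon collides with the half-deleted resource. A sketch of waiting for the delete to finish before step 5 (the `describe` callable is a placeholder for the matching boto3 describe_* call):

```python
import time

def wait_until_deleted(describe, arn, delay=5, max_attempts=60):
    """Poll until `describe` raises a ResourceNotFoundException.

    `describe` is e.g.:
        lambda: forecast.describe_dataset_import_job(DatasetImportJobArn=arn)
    """
    for _ in range(max_attempts):
        try:
            describe()
        except Exception as exc:
            # botocore surfaces this as <client>.exceptions.ResourceNotFoundException
            if "ResourceNotFound" in type(exc).__name__:
                return  # fully deleted; safe to recreate the name
            raise
        time.sleep(delay)
    raise TimeoutError(f"{arn} not deleted after {max_attempts} polls")
```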

Future values for related time series dataset

While forecasting item sales using retail domain, the documentation link (https://docs.aws.amazon.com/forecast/latest/dg/retail-domain.html) mentions that we use –

webpage_hits (float) – The number of web page hits received by the item at the timestamp. Applies only to ecommerce websites.
stockout_days (float) – The number of days left before the item goes out of stock. This is an optional field. Provide it only if the data is available.
inventory_onhand (float) – The number of items in inventory.
revenue (float) – The total revenue generated by that item’s sales.

Using DeepAR+, it is recommended that we provide related time series data for the whole period (training horizon + forecasting horizon). I can provide features like pricing, promotions, etc. However, for the forecasting horizon, I don't have access to actual, correct figures for webpage_hits, stockout_days, and inventory_onhand. Moreover, revenue depends on the number of items sold.

1) So, how do I impute those values for the forecasting horizon period?
2) Let’s say I use some imputation method; will it not skew my actual forecast results?
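On (1): only features that are genuinely knowable in advance (planned prices, scheduled promotions) are good horizon candidates; optional operational fields like webpage_hits can simply be omitted. Where a slow-moving field must be extended, a forward-fill from the last observed value is a common, if crude, choice, sketched below with made-up data. On (2): yes, imputed values feed the model exactly like real ones, so a poor imputation can skew the forecast.

```python
import pandas as pd

# Illustrative history plus an empty 3-day horizon for one item.
hist = pd.DataFrame({
    "timestamp": pd.date_range("2021-01-01", periods=5, freq="D"),
    "price": [9.99, 9.99, 8.49, 8.49, 8.49],
})
future = pd.DataFrame(
    {"timestamp": pd.date_range("2021-01-06", periods=3, freq="D")}
)
related = pd.concat([hist, future], ignore_index=True)
# Forward-fill carries the last observed price (8.49) into the horizon.
related["price"] = related["price"].ffill()
```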

HPO Example

Create a notebook that showcases the impact of building with HPO enabled.

Where is the updated python SDK for the July 22nd 2019 API update?

There is currently a message about a recent API update on the AWS Forecast documentation:

The Amazon Forecast API will undergo significant changes during scheduled maintenance occurring from 10 AM on 7/22/19 until 10 AM on 7/23/19. During maintenance, access to the Forecast APIs and console might be interrupted.

After 7/22/19, your Forecast resources (datasets, predictors, and forecasts) will no longer be available. However, you can save your forecasts for future use. We recommend using the CreateForecastExportJob API to save them to your S3 bucket before 7/22/19.

After maintenance concludes, before using the APIs, you must download a new SDK and modify your existing code to reflect the syntax changes. If you use only the console, you won’t need to make any changes.

We will provide new API documentation before scheduled maintenance begins. If you have questions, contact [email protected]

It says "you must download a new SDK and modify your existing code to reflect the syntax changes".

Where is the latest SDK for Python?

AutoML Notebook

Create a notebook that deploys a predictor and generates a forecast using AutoML and compare its impact to previously generated models.

How can I answer 'what if' questions?

Some algorithms, e.g. Prophet or DeepAR, are able to incorporate exogenous variables (e.g. special events).
This enables us, in principle, to make 'what if' simulations.

From the API documentation it's not clear to me how those variables can be incorporated into the future dataframe; I can't see any way to provide exogenous variables.
As I understand it, 'deploy_predictor' starts a forecast directly after the last training point for the specified horizon -- is that true? How can I change the start point for the forecast (without expensive retraining)?

It would be great if you can help me with this question. An example notebook would be even better ;-)
Thank you for your great work!

UnknownServiceError: Unknown service: 'forecast'.

When I was running the notebook "Getting Data Ready"

at step

session = boto3.Session(region_name='us-west-2') 
forecast = session.client(service_name='forecast') 
forecastquery = session.client(service_name='forecastquery')

I got error

UnknownServiceError                       Traceback (most recent call last)
<ipython-input-4-2d44c42b1f84> in <module>()
      1 session = boto3.Session(region_name='us-west-2')
----> 2 forecast = session.client(service_name='forecast')
      3 forecastquery = session.client(service_name='forecastquery')

~/anaconda3/envs/python3/lib/python3.6/site-packages/boto3/session.py in client(self, service_name, region_name, api_version, use_ssl, verify, endpoint_url, aws_access_key_id, aws_secret_access_key, aws_session_token, config)
    261             aws_access_key_id=aws_access_key_id,
    262             aws_secret_access_key=aws_secret_access_key,
--> 263             aws_session_token=aws_session_token, config=config)
    264 
    265     def resource(self, service_name, region_name=None, api_version=None,

~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/session.py in create_client(self, service_name, region_name, api_version, use_ssl, verify, endpoint_url, aws_access_key_id, aws_secret_access_key, aws_session_token, config)
    837             is_secure=use_ssl, endpoint_url=endpoint_url, verify=verify,
    838             credentials=credentials, scoped_config=self.get_scoped_config(),
--> 839             client_config=config, api_version=api_version)
    840         monitor = self._get_internal_component('monitor')
    841         if monitor is not None:

~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in create_client(self, service_name, region_name, is_secure, endpoint_url, verify, credentials, scoped_config, api_version, client_config)
     77             'choose-service-name', service_name=service_name)
     78         service_name = first_non_none_response(responses, default=service_name)
---> 79         service_model = self._load_service_model(service_name, api_version)
     80         cls = self._create_client_class(service_name, service_model)
     81         endpoint_bridge = ClientEndpointBridge(

~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _load_service_model(self, service_name, api_version)
    115     def _load_service_model(self, service_name, api_version=None):
    116         json_model = self._loader.load_service_model(service_name, 'service-2',
--> 117                                                      api_version=api_version)
    118         service_model = ServiceModel(json_model, service_name=service_name)
    119         return service_model

~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/loaders.py in _wrapper(self, *args, **kwargs)
    130         if key in self._cache:
    131             return self._cache[key]
--> 132         data = func(self, *args, **kwargs)
    133         self._cache[key] = data
    134         return data

~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/loaders.py in load_service_model(self, service_name, type_name, api_version)
    376             raise UnknownServiceError(
    377                 service_name=service_name,
--> 378                 known_service_names=', '.join(sorted(known_services)))
    379         if api_version is None:
    380             api_version = self.determine_latest_version(

UnknownServiceError: Unknown service: 'forecast'. Valid service names are: acm, acm-pca, alexaforbusiness, amplify, apigateway, apigatewaymanagementapi, apigatewayv2, application-autoscaling, application-insights, appmesh, appstream, appsync, athena, autoscaling, autoscaling-plans, backup, batch, budgets, ce, chime, cloud9, clouddirectory, cloudformation, cloudfront, cloudhsm, cloudhsmv2, cloudsearch, cloudsearchdomain, cloudtrail, cloudwatch, codebuild, codecommit, codedeploy, codepipeline, codestar, cognito-identity, cognito-idp, cognito-sync, comprehend, comprehendmedical, config, connect, cur, datapipeline, datasync, dax, devicefarm, directconnect, discovery, dlm, dms, docdb, ds, dynamodb, dynamodbstreams, ec2, ec2-instance-connect, ecr, ecs, efs, eks, elasticache, elasticbeanstalk, elastictranscoder, elb, elbv2, emr, es, events, firehose, fms, fsx, gamelift, glacier, globalaccelerator, glue, greengrass, groundstation, guardduty, health, iam, importexport, inspector, iot, iot-data, iot-jobs-data, iot1click-devices, iot1click-projects, iotanalytics, iotevents, iotevents-data, iotthingsgraph, kafka, kinesis, kinesis-video-archived-media, kinesis-video-media, kinesisanalytics, kinesisanalyticsv2, kinesisvideo, kms, lakeformation, lambda, lex-models, lex-runtime, license-manager, lightsail, logs, machinelearning, macie, managedblockchain, marketplace-entitlement, marketplacecommerceanalytics, mediaconnect, mediaconvert, medialive, mediapackage, mediapackage-vod, mediastore, mediastore-data, mediatailor, meteringmarketplace, mgh, mobile, mq, mturk, neptune, opsworks, opsworkscm, organizations, personalize, personalize-events, personalize-runtime, pi, pinpoint, pinpoint-email, pinpoint-sms-voice, polly, pricing, quicksight, ram, rds, rds-data, redshift, rekognition, resource-groups, resourcegroupstaggingapi, robomaker, route53, route53domains, route53resolver, s3, s3control, sagemaker, sagemaker-runtime, sdb, secretsmanager, securityhub, serverlessrepo, 
service-quotas, servicecatalog, servicediscovery, ses, shield, signer, sms, sms-voice, snowball, sns, sqs, ssm, stepfunctions, storagegateway, sts, support, swf, textract, transcribe, transfer, translate, waf, waf-regional, workdocs, worklink, workmail, workspaces, xray

Export forecasted values for backtest windows

As per the documentation here, Amazon Forecast provides backtesting to produce evaluation metrics. However, these metrics are at an aggregated level. When forecasting for a large number of items and backtesting (let's say with NumberOfBacktestWindows = 4), sometimes I am interested in looking at the forecasted values of individual items during the backtests.

In the following example, is it possible to export forecasted values for each 'Testing' window for all the items? It would be nice to see which items' error metric remains fairly constant over the 4 test windows, which is not possible with only an aggregated metric.

image

Thanks in advance!

Flexibility to choose the custom day as start of week for forecasting with weekly granularity

Working with a weekly granularity, is it possible to choose a certain day to start the forecast from, instead of the default, which is "most recent Monday"? This seems too rigid a requirement, especially if your business follows a different calendar.
I can pre-process my data to start from Monday, but some data (consider an inventory count) gets updated once a week, say on Friday, so starting the forecast from Monday with the count from Friday can be misleading.

Issue:
While forecasting I want to use a custom calendar week, i.e. a forecast that starts on Saturday and ends on Friday, instead of Monday to Sunday.
My training data ended on Friday the 23rd; upon training, Amazon Forecast started my prediction from Monday the 26th. Understanding what happened over the weekend (24th-25th) is important, especially if some important event happens then.

Forecast both units and price of products

Hi,
When forecasting units across many forecast dimensions, is it possible to also forecast price (rather than treating price as a related series)?

Is it possible to illustrate it in the python notebook?

thank you and kind regards
Aga

Issue when deploying Improving Forecast Accuracy with Machine Learning example

I am having the below issue when trying to deploy the example given.

There was an error running the forecast for nyctaxi_weather_auto

Message: An error occurred (InvalidInputException) when calling the CreatePredictor operation: The attribute(s) [day_hour_name] present in the RELATED_TIME_SERIES schema should be of numeric type such as `integer` or `float`, or be added as a forecast dimension

Details: (caught InvalidInputException)

  File "/var/task/shared/helpers.py", line 66, in wrapper
    (status, output) = f(event, context)

  File "/var/task/create_predictor.py", line 40, in handler
    predictor.create()

  File "/var/task/shared/Predictor/predictor.py", line 228, in create
    self.cli.create_predictor(**self._create_params())

  File "/opt/python/botocore/client.py", line 386, in _api_call
    return self._make_api_call(operation_name, kwargs)

  File "/opt/python/botocore/client.py", line 705, in _make_api_call
    raise error_class(parsed_response, operation_name)

I think this is because the related dataset only accepts additional columns of int/float type. Are there any hints on troubleshooting this in the .py file in the Lambda function? Hope to get some help soon!
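That reading matches the error: RELATED_TIME_SERIES attributes must be numeric unless declared as forecast dimensions. Two ways around it, sketched with pandas (the frame and values are illustrative; only the column name follows the error message):

```python
import pandas as pd

# Illustrative related time series row for the nyctaxi_weather example.
related = pd.DataFrame({
    "timestamp": ["2021-01-01 00:00:00"],
    "item_id": ["taxi_zone_1"],
    "day_hour_name": ["Friday_00"],
    "temperature": [4.2],
})

# Option 1: drop the non-numeric attribute before import.
numeric_only = related.drop(columns=["day_hour_name"])

# Option 2 (per the error message): keep the column, but declare it under
# ForecastDimensions in the predictor's FeaturizationConfig rather than as a
# schema attribute, so it is treated as a dimension instead of a feature.
```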

Explainability report relies on specific ARN

The getting started notebook - https://github.com/aws-samples/amazon-forecast-samples/blob/main/notebooks/basic/Getting_Started/Amazon_Forecast_Quick_Start_Guide.ipynb includes a section on explainability. However, the explainability export depends on an explainability_arn that was generated in another account. See 'arn:aws:forecast:ap-southeast-1:730750055343:explainability/MY_TAXI_PREDICTOR_HOLIDAY'

This notebook should include the code to produce the explainability report and the code to export it.
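A sketch of the missing calls, using the boto3 forecast client's create_explainability / create_explainability_export operations (all ARNs, the account ID, and the bucket are placeholders for your own account):

```python
# Request bodies for the two missing calls; substitute your own resources.
explainability_kwargs = dict(
    ExplainabilityName="taxi_predictor_explainability",
    ResourceArn="arn:aws:forecast:us-east-1:111122223333:predictor/MY_TAXI_PREDICTOR",
    ExplainabilityConfig={
        "TimeSeriesGranularity": "ALL",
        "TimePointGranularity": "ALL",
    },
)
export_kwargs = dict(
    ExplainabilityExportName="taxi_predictor_explainability_export",
    Destination={"S3Config": {
        "Path": "s3://my-bucket/explainability/",
        "RoleArn": "arn:aws:iam::111122223333:role/ForecastS3AccessRole",
    }},
)
# forecast = boto3.client("forecast")
# resp = forecast.create_explainability(**explainability_kwargs)
# forecast.create_explainability_export(
#     ExplainabilityArn=resp["ExplainabilityArn"], **export_kwargs)
```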

EndpointConnectionError: Could not connect to the endpoint URL

Hi,

When I try to access the forecast endpoint, I get Endpoint Connection Error.

Code:

import os
import time
import pandas as pd
import util
%reload_ext autoreload
import boto3
import s3fs


s3 = boto3.Session(
    region_name='us-west-1',
    aws_access_key_id=key_id,
    aws_secret_access_key=access_key
)
forecast = s3.client(service_name='forecast') 
forecastquery = s3.client(service_name='forecastquery')
forecast.list_predictors()

error:

gaierror                                  Traceback (most recent call last)
c:\users\prasanthmunusamyraje\appdata\local\programs\python\python37\lib\site-packages\urllib3\connection.py in _new_conn(self)
    169             conn = connection.create_connection(
--> 170                 (self._dns_host, self.port), self.timeout, **extra_kw
    171             )

c:\users\prasanthmunusamyraje\appdata\local\programs\python\python37\lib\site-packages\urllib3\util\connection.py in create_connection(address, timeout, source_address, socket_options)
     72 
---> 73     for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
     74         af, socktype, proto, canonname, sa = res

c:\users\xxx\appdata\local\programs\python\python37\lib\socket.py in getaddrinfo(host, port, family, type, proto, flags)
    751     addrlist = []
--> 752     for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    753         af, socktype, proto, canonname, sa = res

gaierror: [Errno 11002] getaddrinfo failed

During handling of the above exception, another exception occurred:

NewConnectionError                        Traceback (most recent call last)
c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\httpsession.py in send(self, request)
    331                 decode_content=False,
--> 332                 chunked=self._chunked(request.headers),
    333             )

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\urllib3\connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    755             retries = retries.increment(
--> 756                 method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
    757             )

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\urllib3\util\retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
    506             # Disabled, indicate to re-raise the error.
--> 507             raise six.reraise(type(error), error, _stacktrace)
    508 

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\urllib3\packages\six.py in reraise(tp, value, tb)
    769                 raise value.with_traceback(tb)
--> 770             raise value
    771         finally:

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\urllib3\connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    705                 headers=headers,
--> 706                 chunked=chunked,
    707             )

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\urllib3\connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    381         try:
--> 382             self._validate_conn(conn)
    383         except (SocketTimeout, BaseSSLError) as e:

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\urllib3\connectionpool.py in _validate_conn(self, conn)
   1009         if not getattr(conn, "sock", None):  # AppEngine might not have  `.sock`
-> 1010             conn.connect()
   1011 

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\urllib3\connection.py in connect(self)
    352         # Add certificate verification
--> 353         conn = self._new_conn()
    354         hostname = self.host

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\urllib3\connection.py in _new_conn(self)
    181             raise NewConnectionError(
--> 182                 self, "Failed to establish a new connection: %s" % e
    183             )

NewConnectionError: <botocore.awsrequest.AWSHTTPSConnection object at 0x000001886806FA08>: Failed to establish a new connection: [Errno 11002] getaddrinfo failed

During handling of the above exception, another exception occurred:

EndpointConnectionError                   Traceback (most recent call last)
<ipython-input-44-70ba7dab9685> in <module>()
----> 1 forecast.list_predictors()

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\client.py in _api_call(self, *args, **kwargs)
    384                     "%s() only accepts keyword arguments." % py_operation_name)
    385             # The "self" in this scope is referring to the BaseClient.
--> 386             return self._make_api_call(operation_name, kwargs)
    387 
    388         _api_call.__name__ = str(py_operation_name)

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\client.py in _make_api_call(self, operation_name, api_params)
    690         else:
    691             http, parsed_response = self._make_request(
--> 692                 operation_model, request_dict, request_context)
    693 
    694         self.meta.events.emit(

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\client.py in _make_request(self, operation_model, request_dict, request_context)
    709     def _make_request(self, operation_model, request_dict, request_context):
    710         try:
--> 711             return self._endpoint.make_request(operation_model, request_dict)
    712         except Exception as e:
    713             self.meta.events.emit(

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\endpoint.py in make_request(self, operation_model, request_dict)
    100         logger.debug("Making request for %s with params: %s",
    101                      operation_model, request_dict)
--> 102         return self._send_request(request_dict, operation_model)
    103 
    104     def create_request(self, params, operation_model=None):

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\endpoint.py in _send_request(self, request_dict, operation_model)
    135             request, operation_model, context)
    136         while self._needs_retry(attempts, operation_model, request_dict,
--> 137                                 success_response, exception):
    138             attempts += 1
    139             # If there is a stream associated with the request, we need

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\endpoint.py in _needs_retry(self, attempts, operation_model, request_dict, response, caught_exception)
    254             event_name, response=response, endpoint=self,
    255             operation=operation_model, attempts=attempts,
--> 256             caught_exception=caught_exception, request_dict=request_dict)
    257         handler_response = first_non_none_response(responses)
    258         if handler_response is None:

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\hooks.py in emit(self, event_name, **kwargs)
    354     def emit(self, event_name, **kwargs):
    355         aliased_event_name = self._alias_event_name(event_name)
--> 356         return self._emitter.emit(aliased_event_name, **kwargs)
    357 
    358     def emit_until_response(self, event_name, **kwargs):

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\hooks.py in emit(self, event_name, **kwargs)
    226                  handlers.
    227         """
--> 228         return self._emit(event_name, kwargs)
    229 
    230     def emit_until_response(self, event_name, **kwargs):

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\hooks.py in _emit(self, event_name, kwargs, stop_on_response)
    209         for handler in handlers_to_call:
    210             logger.debug('Event %s: calling handler %s', event_name, handler)
--> 211             response = handler(**kwargs)
    212             responses.append((handler, response))
    213             if stop_on_response and response is not None:

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\retryhandler.py in __call__(self, attempts, response, caught_exception, **kwargs)
    181 
    182         """
--> 183         if self._checker(attempts, response, caught_exception):
    184             result = self._action(attempts=attempts)
    185             logger.debug("Retry needed, action of: %s", result)

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\retryhandler.py in __call__(self, attempt_number, response, caught_exception)
    249     def __call__(self, attempt_number, response, caught_exception):
    250         should_retry = self._should_retry(attempt_number, response,
--> 251                                           caught_exception)
    252         if should_retry:
    253             if attempt_number >= self._max_attempts:

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\retryhandler.py in _should_retry(self, attempt_number, response, caught_exception)
    275             # If we've exceeded the max attempts we just let the exception
    276             # propogate if one has occurred.
--> 277             return self._checker(attempt_number, response, caught_exception)
    278 
    279 

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\retryhandler.py in __call__(self, attempt_number, response, caught_exception)
    315         for checker in self._checkers:
    316             checker_response = checker(attempt_number, response,
--> 317                                        caught_exception)
    318             if checker_response:
    319                 return checker_response

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\retryhandler.py in __call__(self, attempt_number, response, caught_exception)
    221         elif caught_exception is not None:
    222             return self._check_caught_exception(
--> 223                 attempt_number, caught_exception)
    224         else:
    225             raise ValueError("Both response and caught_exception are None.")

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\retryhandler.py in _check_caught_exception(self, attempt_number, caught_exception)
    357         # the MaxAttemptsDecorator is not interested in retrying the exception
    358         # then this exception just propogates out past the retry code.
--> 359         raise caught_exception

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\endpoint.py in _do_get_response(self, request, operation_model)
    198             http_response = first_non_none_response(responses)
    199             if http_response is None:
--> 200                 http_response = self._send(request)
    201         except HTTPClientError as e:
    202             return (None, e)

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\endpoint.py in _send(self, request)
    267 
    268     def _send(self, request):
--> 269         return self.http_session.send(request)
    270 
    271 

c:\users\xxx\appdata\local\programs\python\python37\lib\site-packages\botocore\httpsession.py in send(self, request)
    350             raise SSLError(endpoint_url=request.url, error=e)
    351         except (NewConnectionError, socket.gaierror) as e:
--> 352             raise EndpointConnectionError(endpoint_url=request.url, error=e)
    353         except ProxyError as e:
    354             raise ProxyConnectionError(proxy_url=proxy_url, error=e)

EndpointConnectionError: Could not connect to the endpoint URL: "https://forecast.us-west-1.amazonaws.com/"```
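The trace ends with an EndpointConnectionError against forecast.us-west-1.amazonaws.com. Amazon Forecast is not offered in every region, and a "getaddrinfo failed" error usually means the endpoint hostname does not resolve — either an unsupported region or a local DNS/proxy problem. A minimal sketch, assuming us-west-2 is a region where the service is available, is to pin the region explicitly when creating the client:

```python
def make_forecast_client(region_name="us-west-2"):
    """Create a Forecast client pinned to an explicit region.

    'getaddrinfo failed' for forecast.us-west-1.amazonaws.com suggests that
    endpoint does not resolve; pinning a region where the service exists
    (us-west-2 assumed here) avoids relying on a default region setting.
    """
    import boto3  # imported lazily so the helper can be defined without AWS access
    return boto3.client("forecast", region_name=region_name)
```

If the error persists in a supported region, checking the machine's DNS/proxy configuration is the next step.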

Cut SDK from repo

Is there any reason to keep the legacy SDK content in the repo now?

UnknownServiceError

forecast = session.client(service_name='forecast')
UnknownServiceError: Unknown service: 'forecast'. Valid service names are: acm, acm-pca, alexaforbusiness, amplify, apigateway, apigatewaymanagementapi, apigatewayv2, application-autoscaling, application-insights, appmesh, appstream, appsync, athena, autoscaling, autoscaling-plans, backup...etc.
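UnknownServiceError typically means the installed botocore predates the service: boto3 discovers service names from botocore's bundled data files, and 'forecast' only exists in newer releases, so upgrading boto3/botocore (`pip install --upgrade boto3`) is the usual fix. A hedged sketch that checks availability before constructing the client:

```python
def forecast_client_or_none(region_name="us-west-2"):
    """Return a 'forecast' client, or None when botocore is too old.

    An old boto3/botocore install simply does not know the name 'forecast',
    which raises UnknownServiceError on session.client('forecast').
    """
    import boto3  # lazy import: only needed when the helper is actually called
    session = boto3.Session(region_name=region_name)
    if "forecast" not in session.get_available_services():
        return None  # upgrade boto3/botocore to get the Forecast service model
    return session.client("forecast")
```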

forecastId

forecastId variable hasn't been defined in the notebook.

Pedram

2 different data frequencies

Hi,

Can Amazon Forecast use two different frequencies of input target data,
for example weekly and monthly? (Weekly figures for the current month arrive
faster, whereas for the monthly figure you have to wait until the end of the month.)
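As far as I know, a Forecast dataset takes a single DATA_FREQUENCY, so mixed weekly/monthly input would first be rolled up to one common frequency before import. A minimal stdlib sketch (the sample numbers are made up) that aggregates weekly points into monthly totals:

```python
from collections import defaultdict
from datetime import date

# Hypothetical weekly observations: (week_start, demand)
weekly = [
    (date(2023, 1, 2), 10),
    (date(2023, 1, 9), 12),
    (date(2023, 1, 16), 11),
    (date(2023, 2, 6), 20),
]

# Roll the weekly points up so the whole target file uses one frequency ("M").
monthly = defaultdict(int)
for week_start, demand in weekly:
    monthly[(week_start.year, week_start.month)] += demand

print(sorted(monthly.items()))  # [((2023, 1), 33), ((2023, 2), 20)]
```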

Lost all commands except get-forecast and get-accuracy-metrics

Hi,

I just downloaded forecast-2019-05-15.normal.json and forecastquery-2019-05-15.normal.json and configured the models with

aws configure add-model --service-name forecast --service-model file://forecast-2019-05-15.normal.json
aws configure add-model --service-name forecast --service-model file://forecastquery-2019-05-15.normal.json

Since doing this, the only commands available are get-forecast and get-accuracy-metrics. If I try to run any of the previously available commands (e.g. aws forecast list-recipes) I get the following error:

aws: error: argument operation: Invalid choice, valid choices are:

get-accuracy-metrics                     | get-forecast                            
help

Is this an intended change to the API? And if so, is there updated documentation on using get-forecast?

basic/Tutorial Evaluating the Predictor updates

Hello, I'm using the 4 basic/Tutorial notebooks to run a demo and encountered a few errors along the way.

  • In 3.Evaluating_Your_Predictor, forecast_arn doesn't exist. It can be replaced with the forecast_arn_deep_ar variable from the previous step.
  • In 3.Evaluating_Your_Predictor, "data/item-demand-time-validation.csv" doesn't exist, so the notebook fails at that step; substituting another existing CSV does not give the desired results (see screenshot below).
    • (screenshot)
    • A new validation file can be created in 1.Importing_Your_Data with the correct time range for validation.
    • The sliced time range in 3.Evaluating_Your_Predictor for the df used to plot the graph can then be updated to reflect the validation data file. This gives the correct graph (see screenshot below).
    • (screenshot)

Hopefully I was able to explain myself well. I can make the required changes and submit a PR.

🙂

UnknownServiceError

I am getting this 'UnknownServiceError' when creating a session client with service_name 'forecast'.
What is the correct service name for Forecast?

UnknownServiceError: Unknown service: 'forecast'. Valid service names are: acm, acm-pca, alexaforbusiness, amplify, apigateway, apigatewaymanagementapi, apigatewayv2, application-autoscaling, appmesh, appstream, appsync, athena, autoscaling, autoscaling-plans, backup, batch, budgets, ce, chime, cloud9, clouddirectory, cloudformation, cloudfront, cloudhsm, cloudhsmv2, cloudsearch, cloudsearchdomain, cloudtrail, cloudwatch, codebuild, codecommit, codedeploy, codepipeline, codestar, cognito-identity, cognito-idp, cognito-sync, comprehend, comprehendmedical, config, connect, cur, datapipeline, datasync, dax, devicefarm, directconnect, discovery, dlm, dms, docdb, ds, dynamodb, dynamodbstreams, ec2, ecr, ecs, efs, eks, elasticache, elasticbeanstalk, elastictranscoder, elb, elbv2, emr, es, events, firehose, fms, fsx, gamelift, glacier, globalaccelerator, glue, greengrass, guardduty, health, iam, importexport, inspector, iot, iot-data, iot-jobs-data, iot1click-devices, iot1click-projects, iotanalytics, kafka, kinesis, kinesis-video-archived-media, kinesis-video-media, kinesisanalytics, kinesisanalyticsv2, kinesisvideo, kms, lambda, lex-models, lex-runtime, license-manager, lightsail, logs, machinelearning, macie, marketplace-entitlement, marketplacecommerceanalytics, mediaconnect, mediaconvert, medialive, mediapackage, mediastore, mediastore-data, mediatailor, meteringmarketplace, mgh, mobile, mq, mturk, neptune, opsworks, opsworkscm, organizations, pi, pinpoint, pinpoint-email, pinpoint-sms-voice, polly, pricing, quicksight, ram, rds, rds-data, redshift, rekognition, resource-groups, resourcegroupstaggingapi, robomaker, route53, route53domains, route53resolver, s3, s3control, sagemaker, sagemaker-runtime, sdb, secretsmanager, securityhub, serverlessrepo, servicecatalog, servicediscovery, ses, shield, signer, sms, sms-voice, snowball, sns, sqs, ssm, stepfunctions, storagegateway, sts, support, swf, transcribe, transfer, translate, waf, waf-regional, workdocs, worklink, 
workmail, workspaces, xray

Predictor ARN Issue

Hello,

I am getting this issue when I run line 15 in notebook 2.

predictor_arn = predictorArn
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-15-70e3f3e2c98d> in <module>()
----> 1 predictor_arn = predictorArn

NameError: name 'predictorArn' is not defined

I am running the notebook in JupyterLab on Amazon SageMaker. Please let me know if I am missing something; I didn't see the above variable defined anywhere in your code.

I skipped over this line and was able to complete the lab. I don't see it being used anywhere, so it might not be needed.

Thanks for the help!

Dan
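For reference, the NameError just means predictorArn was never assigned earlier in the notebook. Assuming the usual shape of a CreatePredictor response, the ARN can be captured explicitly from the create call instead (the dict and ARN below are placeholders standing in for the real API response):

```python
# Placeholder illustrating the assumed CreatePredictor response shape;
# in the notebook this dict would be the value returned by
# forecast.create_predictor(...).
create_predictor_response = {
    "PredictorArn": "arn:aws:forecast:us-west-2:123456789012:predictor/demo"
}

# Capture the ARN from the response instead of relying on an undefined name.
predictor_arn = create_predictor_response["PredictorArn"]
print(predictor_arn)
```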

Question: How can I read existing dataset_group

Context: I stopped the AWS Python tutorial after Getting Started, and on resuming I couldn't proceed without "DatasetGroupArn": datasetGroupArn being defined again. So after running:

create_dataset_group_response = forecast.create_dataset_group(...)
datasetGroupArn = create_dataset_group_response['DatasetGroupArn']
create_dataset_response = forecast.create_dataset(...)
datasetArn = create_dataset_response['DatasetArn']

the following is raised:

ResourceAlreadyExistsException: An error occurred (ResourceAlreadyExistsException) when calling the CreateDatasetGroup operation: A dataset group already exists with the arn: arn:aws:forecast:eu-west-1:235350444413:dataset-group/util_power_forecastdemo_dsg

I would just like to read the existing information. How can I do that? Is there a tutorial on the methods I would need? I couldn't find one.
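The boto3 Forecast client exposes read-only calls for exactly this. A sketch (region assumed, and the listing loop is just illustrative) that enumerates existing dataset groups and then describes one by ARN:

```python
def read_dataset_group(dataset_group_arn, region_name="us-west-2"):
    """Fetch an existing dataset group instead of re-creating it."""
    import boto3  # lazy import so the helper is importable without AWS access
    forecast = boto3.client("forecast", region_name=region_name)
    # ListDatasetGroups shows what already exists in this account/region;
    # DescribeDatasetGroup returns the details for one specific ARN.
    for group in forecast.list_dataset_groups()["DatasetGroups"]:
        print(group["DatasetGroupArn"])
    return forecast.describe_dataset_group(DatasetGroupArn=dataset_group_arn)
```

With this, the existing ARN from the ResourceAlreadyExistsException message can simply be reused rather than calling create_dataset_group again.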

Coldstart Notebook

Using a dataset with item metadata, illustrate in a notebook how to predict cold-start items.

MAPE for DeepAR+ & ETS in Amazon forecast

Hi All,

I just created two models in Amazon Forecast, one trained with ETS and the other with DeepAR+. Although Amazon Forecast shows a better MAPE for DeepAR+ than for ETS, the actual forecast for the next month (forecast horizon is 31 days) looks better with ETS. Sharing screenshots of both. Can anyone explain why this is happening?
DeepAR+
ETS
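One likely explanation: the reported accuracy metrics are averaged across all items and the backtest windows, so a model can have the lower overall MAPE yet still look worse on the particular series and horizon being plotted. A small sketch of MAPE itself (the numbers are made up):

```python
def mape(actual, predicted):
    """Mean absolute percentage error over points with non-zero actuals."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)

print(mape([100, 200], [110, 190]))  # 7.5
```

Because each point contributes its percentage error equally, a handful of well-predicted high-volume items can dominate the average and hide poor fits elsewhere.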

error on running notebook


I get this error after running the Python code at the forecast.create_dataset call:

IAM role for Amazon Forecast: arn:aws:iam::83424617XXXX:role/amazonforecast
Traceback (most recent call last):
File "H:\Run_07_19_2017\AWS_forecast\amazon-forecast-samples-master\run_me2.py", line 200, in
Schema = schema
File "C:\Anaconda3\lib\site-packages\botocore\client.py", line 320, in _api_call
return self._make_api_call(operation_name, kwargs)
File "C:\Anaconda3\lib\site-packages\botocore\client.py", line 624, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.errorfactory.ResourceAlreadyExistsException: An error occurred (ResourceAlreadyExistsException) when calling the CreateDataset operation: Failed to create Dataset


I attach my Python file here (changed the extension to '.png' to attach):
run_me2
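For what it's worth, ResourceAlreadyExistsException here simply means a dataset with that name already exists from an earlier run. A hedged get-or-create sketch (the client's `exceptions` attribute is assumed to carry the modeled error class, and the name-matching fallback is simplistic):

```python
def create_dataset_idempotent(forecast, **create_kwargs):
    """Create a dataset, or look up its ARN if it already exists.

    `forecast` is a boto3 'forecast' client; create_kwargs are the usual
    CreateDataset parameters (DatasetName, Domain, DatasetType, Schema, ...).
    """
    try:
        return forecast.create_dataset(**create_kwargs)["DatasetArn"]
    except forecast.exceptions.ResourceAlreadyExistsException:
        # Fall back to listing datasets and matching on name (sketch only).
        for dataset in forecast.list_datasets()["Datasets"]:
            if dataset["DatasetName"] == create_kwargs["DatasetName"]:
                return dataset["DatasetArn"]
        raise
```

Alternatively, deleting the stale dataset in the console (or via delete_dataset) before re-running the script also clears the error.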
