aws-solutions / content-analysis-on-aws Goto Github PK

As of August 30, 2023, this AWS Solution is no longer available. Existing deployments will continue to run. The functionality provided by Content Analysis on AWS will be superseded with functionality in Media2Cloud on AWS and Content Localization on AWS. We encourage you to explore these solutions.

Home Page: https://aws.amazon.com/solutions/implementations/aws-content-analysis/

License: Apache License 2.0

Shell 5.63% Dockerfile 0.11% Python 19.44% JavaScript 6.64% HTML 0.19% Vue 67.98%

video analytics video-processing machine-learning aws-rekognition aws-transcribe aws-translate aws-mediaconvert aws-comprehend aws-elasticsearch

content-analysis-on-aws's People

Contributors

Stargazers

Watchers

content-analysis-on-aws's Issues

Bounding Boxes do not line up on images correctly

Describe the bug
I have ingested an image (not a video) into my running solution. Analysis is run through Rekognition. But when I try to find the objects that Rekognition has bounding boxes for (e.g. labels with '*'), the bounding boxes do not line up with the picture.

I've tried this on multiple browsers and multiple OS's with the same result. If I zoom my browser (200%+), then the boxes align with the image.

To Reproduce

Ingest an image
After ingestion, click on 'Analyze'
Under 'ML Vision' and 'Objects', select a label that has an asterisk with it
The bounding box should show up, but it will not be correctly aligned with the image

Expected behavior
Bounding boxes appear on top of the detected object.

Please complete the following information about the solution:

[ v2.0.0 ] Version: [e.g. v1.0.0]

To get the version of the solution, you can look at the description of the created CloudFormation stack. For example, "(SO0021) - Video On Demand workflow with AWS Step Functions, MediaConvert, MediaPackage, S3, CloudFront and DynamoDB. Version v5.0.0". If the description does not contain the version information, you can look at the mappings section of the template:

Mappings:
  SourceCode:
    General:
      S3Bucket: "solutions"
      KeyPrefix: "video-on-demand-on-aws/v5.0.0"

[ us-east-1 ] Region: [e.g. us-east-1]
[ N ] Was the solution modified from the version published on this repository?
If the answer to the previous question was yes, are the changes available on GitHub?
Have you checked your service quotas for the sevices this solution uses?
[ N ] Were there any errors in the CloudWatch Logs?

Screenshots
If applicable, add screenshots to help explain your problem (please DO NOT include sensitive information).

I used this image and saw the behavior: https://d1.awsstatic.com/aws-key-pages/homepage-key-pages/Coca-Cola.df943d34f24c3b17d8299855add943413d426de3.jpg

Additional context
Add any other context about the problem here.

Nested Stack Python Version

Describe the bug
During automated deployment, in the nested stack for the WebStack (aws-content-analysis-web.template) I am getting a rollback error for a Python version. The code is referencing Python3.6 which is no longer supported in Lambda. As the template is referenced in another concatenated bucket URL, it's a little difficult to download and change the version to keep the full stack flow.

Requesting that the primary template is updated to reference a current version of Python.

To Reproduce
Deploy the automated stack and monitor the WebStack deployment.

Expected behavior
Lambda version error referencing Python3.6 where no longer supported.

Please complete the following information about the solution:
Current template on Cloudformation, 2.0.0

Region: [e.g. us-east-1]
Was the solution modified from the version published on this repository?
If the answer to the previous question was yes, are the changes available on GitHub?
Have you checked your service quotas for the sevices this solution uses?
Were there any errors in the CloudWatch Logs?

Stack creation fails with "API: s3:SetBucketEncryption Access Denied" error

Describe the bug
While launching this template, the nested stack "MieStack" fails with following error.

"API: s3:SetBucketEncryption Access Denied" for the resource Logical ID "DataplaneLogsBucket"

To Reproduce
Just launch the stack in either regions and it throws the error. As stack gets rolled back and resources deleted, I cannot proceed further.

Expected behavior
Stack should launch without any error.

Please complete the following information about the solution:
(SO0163) - aws-media-insights-engine v3.0.2 for the nested stack
(SO0042) - aws-content-analysis v2.0.0 for the root stack

Region: [us-east-1 and us-west-2]
[No] Was the solution modified from the version published on this repository?
[n/a] If the answer to the previous question was yes, are the changes available on GitHub?
[n/a] Have you checked your service quotas for the sevices this solution uses?
[No] Were there any errors in the CloudWatch Logs?

Look into pagination options for ES calls from webapp

This isn't part of your change. I noticed we are limiting the number of hits to return from Elasticsearch to 10000. If query results are truncated, we should indicate that to the viewer in the UI. I don't see that Elasticsearch _search API has a clear indication that results are truncated in the response. I think we should consider using the pagination ("scroll") in ES for our queries.

Originally posted by @aburkleaux-amazon in aws-solutions/media-insights-on-aws#76

prevent tabnabbing

Protect link opening calls from reverse tabnabbing attack by adding "noopener,noreferer" properties to the window opener.

Add front-end tests for image workflow

Add front-end validation for image assets. Currently, we only validate video assets.

Extend dataplane to support sorting by "media type"

Make "media type" a sortable key in the dataplane, so we can query the master asset list with a filter that returns only those assets which match said media type (such as, all assets which are videos, or all assets which are face images, etc).

Index filenames so they can be searched

Users need to be able to find assets by searching for their filenames. Currently they can't do that.

textDetectionImage fails to process screenshot images on Mac

Steps to reproduce:

take a screenshot on a Mac
upload said png to front-end with textDetectionImage enabled
observe failure in textDetectionImage operator

Workflows fail when video filenames contain certain nonalphanumeric characters

Operators in defaultVideoStage fail to get files from s3. They will throw an InvalidS3ObjectException. Here are some offending filenames:

[amazon]–{3 0}–(a+d)proxy.mp4
[a]{m}(a)z,o-n_3+0 ad_proxy.mp4
[AWS][00.00.30] Summit San Francisco 2018 Keynote with Dr. Werner Vogels + Dr. Matt Wood_proxy.mp4

Elasticsearch consumer failing is with incompatibility error

I just started noticing that the data pipeline is no longer sending data to Elasticsearch. Digging into the logs, the bulk load function is failing with this error:

Unable to load data into es: The client noticed that the server is not a supported distribution of Elasticsearch

I'm seeing this happening with AWS Elasticsearch Service version 7.10

Elasticsearch write rejection

I'm getting the following error in the Elasticsearch consumer lambda (line 811) for long videos that generated paged results from Rekognition.

Unable to load data into es: TransportError(429, '429 Too Many Requests /mielabels/_bulk')

The results in the UI are clearly missing paged results:

Anonymous data is not anonymous

The anonymous data logger fails to redact the account id. Here's a sample of the data it reports:

{
    "Solution": "SO0163",
    "UUID": "78305d21-fa03-440b-bb15-11416f4213d3",
    "TimeStamp": "2021-02-24T23:23:15.476092",
    "Data": {
        "ServiceToken": "arn:aws:lambda:us-west-2:[account_id]:function:mie03a-anonymous-data",
        "SolutionId": "SO0163",
        "Version": "1.0",
        "Resource": "AnonymousMetric",
        "UUID": "78305d21-fa03-440b-bb15-11416f4213d3"
    }
}

aws-content-analysis-use-existing-mie-stack.yaml does not deploy

aws-content-analysis-use-existing-mie-stack.yaml does not deploy.

Exception during processing: 'An error occurred (AccessDeniedException) when calling the CreateStateMachine operation: User: arn:aws:sts::<accountid>:assumed-role/WorkflowCustomResourceRole/M3-MediaInsightsWorkflowApi-WorkflowCustomResource-JQ8PMRqATfB0 is not authorized to perform: iam:PassRole on resource: arn:aws:iam::<accountid>:role/CustomerManaged/M3-StepFunctionRole-QC7QPYKFEZLE

Allow source language selection for transcribe

The upload form needs to allow users to select a source language for the transcribe operator. It currently assumes that is english.

Stagger workflow status update requests

Every workflow listed in the the Upload view will send a request to get its status at the same time. This won't scale very well. Something should be done to either stagger all the workflow status update requests, or somehow not run them all at the same time.

To reproduce this issue, drag and drop about a 100 videos into the Upload view, the click the Upload and Run workflow, then open firefox/chrome dev tools --> Network tab, and note how many HTTP requests continuously run.

add support for new languages and variants

Some of the languages supported by transcribe and translate are not listed as options in the workflow configuration form shown on the Upload view. Please check with the service docs and update those language lists accordingly.

Transcribe fails for full length videos

The transcribe stage of the video workflow fails when processing long videos (> 1 hour) with the following error:

...r', 'MetaData': {'transcribe_error': 'An error occurred (BadRequestException) when calling the StartTranscriptionJob operation: The input file that you specified exceeds the maximum size of 2048.00 Mb. Try again with a smaller file.'}, 'Media': {}}\", \"errorType\": \"MasExecutionError\", \"stackTrace\": [\" File \\\"/var/task/start_transcribe.py\\\", line 91, in lambda_handler\\n raise MasExecutionError(operator_object.return_output_object())\\n\"]}"

Update per Amazon Elasticsearch to Amazon OpenSearch Service

Amazon Elasticsearch Service will be renamed to Amazon OpenSearch Service [reference], so we should update our docs accordingly. Also need to replace es:DescribeElasticsearchDomain with es:DescribeDomain here:
https://github.com/awslabs/aws-content-analysis/blob/593c9647ffea7957b9f141fb4f0a8f532f4418e8/deployment/aws-content-analysis-elasticsearch.yaml#L173
and here:
https://github.com/awslabs/aws-content-analysis/blob/593c9647ffea7957b9f141fb4f0a8f532f4418e8/deployment/aws-content-analysis-elasticsearch.yaml#L175

show workflow configuration history on analysis view

Show the history of enabled/disabled operators for an asset in the analytics view, so users know what workflow configurations were run.

Front end replacements cannot load data for old assets.

If you process assets then replace the front end, the new front end deployment will not be able to load data for the old assets.

Steps to reproduce:

Deploy back-end
Deploy front-end
Upload assets
Delete front-end deployment
Deploy front-end again
And observe that you can't see data for previously uploaded assets.

UnreservedConcurrentExecution

When I tried to create the stack, I got the following message on Event Pane at AWS Cloudformation:

"Resource handler returned message: "Specified ReservedConcurrentExecutions for function decreases account's UnreservedConcurrentExecution below its minimum value of [10]. (Service: Lambda, Status Code: 400, Request ID: 4976c838-b25a-4cbd-b669-3419f135b49d)" (RequestToken: 1a392659-ca8d-b5dc-ec6e-461411f81d06, HandlerErrorCode: InvalidRequest)"

and the stack is rollbacked, then.

To Reproduce
Launch the stack, as indicated on Read.me

Region: [N.Virginia. us-east-1]
Was the solution modified from the version published on this repository? No changes were made to the stack.

Conflict with

This resource WorkflowExecutionLambdaDeadLetterQueue has a fixed name and it's in conflict with Media Insight Engine

WorkflowExecutionLambdaDLQ already exists in stack arn:aws:cloudformation:us-east-1:xxxx:stack/mie202003-3/cd9rrrr

Bulk delete from MIE Webapp

It would be helpful to be able to bulk delete assets from the collection view

OpenSearch Consumer Lambda [ERROR] Runtime.ImportModuleError: cannot import name 'RequestsHttpConnection' from 'elasticsearch'

Describe the bug
The OpenSearch Consumer Lambda is throwing an Import Error when invoked when using elasticsearch==8.6.0 in requirements.txt

To Reproduce
Have deployed from repo and rebuilt from source with the same result.

Expected behavior
Consumer Lambda should execute successfully.

Please complete the following information about the solution:

[aws-media-insights-engine version v5.0.0]
[All] Region: [e.g. us-east-1]
[ No ] Was the solution modified from the version published on this repository?
[ NA ] If the answer to the previous question was yes, are the changes available on GitHub?
[ Yes ] Have you checked your service quotas for the sevices this solution uses?
[ Yes ] Were there any errors in the CloudWatch Logs?

[ERROR] Runtime.ImportModuleError: Unable to import module 'lambda_handler': cannot import name 'RequestsHttpConnection' from 'elasticsearch' (/var/task/elasticsearch/init.py) Traceback (most recent call last):
Screenshots
If applicable, add screenshots to help explain your problem (please DO NOT include sensitive information).

Additional context
Add any other context about the problem here.

RequestsHttpConnection looks to be deprecated in Elasticsearch 8.0+ in favour of using RequestsHttpNode from elastic_transport.

Provide sample curl command for running the Image workflow

Currently the Upload view only shows a sample curl command for the video workflow. Allow users to also see the curl command for the Image workflow.

Separate subtitle edit "save" and "trigger workflow" buttons -> prevent editing freeze for 60+ minutes

Saving edits to subtitles automatically triggers a new workflow, freezing the editing mode for e.g. 60+ minutes for a typical 1-hour video.

This can be inconvenient because:

Editors often notice final fixes just after saving.
Editors want to save their changes frequently in order to prevent any loss of their progress

It would be helpful to separate the "save" button, for saving the edits, from the "trigger workflow" button.

Remove profanity checker

Describe the bug
The bad-words package used to detect profanity in the transcript only supports english, and consequently causes confusion when it falsely labels non-english transcripts as profane. Therefore, the profanity checking feature should be removed.

WebsiteDeployHelper - Runtime Python3.6

Describe the bug
WebsiteDeployHelper is failing to create. Embedded stack arn:aws:cloudformation:us-east-1:xxxxxxxx:stack/ContAnalyz-WebStack-NAV0C7BT3YE7/8ea78950-18f2-11ed-a065-0e987b47ea9f was not successfully created: The following resource(s) failed to create: [WebsiteDeployHelper].

Resource handler returned message: "The runtime parameter of python3.6 is no longer supported for creating or updating AWS Lambda functions. We recommend you use the new runtime (python3.9) while creating or updating functions. (Service: Lambda, Status Code: 400, Request ID: 9c7293a7-99d7-4e54-992c-0a00d3fc612b)" (RequestToken: 88e80d66-96e9-b03f-ce72-c97a67334ed9, HandlerErrorCode: InvalidRequest)

To Reproduce
Create stack with aws-content-analysis template.

Expected behavior
Creation of WebStack resource.

Please complete the following information about the solution:

[v2.0.0] Version: [e.g. v1.0.0]
[us-east-1] Region: [e.g. us-east-1]
[No] Was the solution modified from the version published on this repository?
If the answer to the previous question was yes, are the changes available on GitHub?
[Yes] Have you checked your service quotas for the sevices this solution uses?
[See bug description] Were there any errors in the CloudWatch Logs?

issue with adding new subtitles into livestream

I am trying to add new subtitles like Italian and Arabic subtitles to live output
Hello

I added the new subtitles on Output Groups by cloning Spanish output and changed some fields according to the below image

after I play the live its show the caption and when I click on the caption it does not show the translated subtitle on the live stream
please help i am new to this

Document max stack name length in readme

The Cloud Formation stack name must be 12 or fewer characters long. (Check this).

Stream media from cloudfront rather than dataplane s3 bucket directly

.//deployment/media-insights-stack.yaml:605: # TODO: Update cors config per: https://docs.amplify.aws/lib/storage/getting-started/q/platform/js#using-amazon-s3

Create Github Action to validate the published CAS solution

Create "NightsWatch"-esque Github Actions test job for CAS

index filename to elasticsearch

Add filename and created timestamp to elasticsearch so users can find assets by searching for the filename or created timestamp, like this:

aws-solutions/content-localization-on-aws#249

Cognito - Google signon does not work

Describe the bug

Successfully created the stack and I can access the cloudfront Url without any issues.

Followed the instructions (listed below) to integrate the cloudfront url with google authentication via AWS cognito. After creating the user pool within cognito and setting up the google project within the developer account. The hosted UI within AWS cognito redirected me to the google page and the URL redirects with the cloudfront url + the token but prompts me for username and password.

https://aws.amazon.com/premiumsupport/knowledge-center/cognito-google-social-identity-provider/

To Reproduce

Click on create stack - https://github.com/aws-solutions/content-analysis-on-aws
Once the stack gets created successfully then output tab within the stack provides cloudfront URL and also creates a cognito user pool.
Follow the instructions listed in https://aws.amazon.com/premiumsupport/knowledge-center/cognito-google-social-identity-provider/ URL to configure google OAuth.
After cognito google auth is complete the hosted UI will be available within cognito - app client settings.
Click on hosted UI and then you'll be redirect to google auth but after selecting your gmail the page gets redirected to cloudfront landing page instead of creating the user account and logging in directly to the website.

FYI - Tried deploying this stack in different regions numerous times and I still see the same issue. I believe the application is not storing the access and token ID or something which is not letting the website to authentication.

Expected behavior
After google authentication AWS content analysis page redirects to the login page instead of using google / gmail credentials to log directly into the AWS content analysis page

Please complete the following information about the solution:

Version 1.0.0 (deployed)
Tried Version 2.0.0 deployment and the build failed.

Mappings:
  SourceCode:
    General:
      S3Bucket: "solutions"
      KeyPrefix: "video-on-demand-on-aws/v5.0.0"

Region: us-east-1
Was the solution modified from the version published on this repository? No
If the answer to the previous question was yes, are the changes available on GitHub?
Have you checked your service quotas for the sevices this solution uses? Yes
Were there any errors in the CloudWatch Logs? No

CREATE_FAILED with "Domain cannot contain reserved word: aws"

tl;dr - don't let your stack name contain "aws"

The MieCognitoDomain step had a with "Domain cannot contain reserved word: aws (Service: AWSCognitoIdentityProviderService; Status Code: 400; Error Code: InvalidParameterException; Request ID: [redacted]; Proxy: null)"

I found the error in CloudTrail CreateUserPoolDomain event:

    "eventName": "CreateUserPoolDomain",
    "errorCode": "InvalidParameterException",
    "errorMessage": "Domain cannot contain reserved word: aws",
    "requestParameters": {
        "userPoolId": "us-east-1_HYsU8EgOr",
        "domain": "aws-content-analysis-dataplane-1hocmg2w2zumk"
    },

And I realized when naming my stack, I choose "aws-content-analysis" and AWS::StackName was implicitly used at some point.

I'm not sure if any check can be done, but writing this issue in case anyone else hits this problem and searches for the error message.

Update S3 write scripts to check account ownership before write

Update script statements like this:
aws s3 sync $global_dist_dir s3://$global_bucket/aws-media-insights-engine/$version/
aws s3 sync $regional_dist_dir s3://${regional_bucket}-${region}/aws-media-insights-engine/$version/

To include checks like this before running s3 sync or cp:
aws s3api head-bucket --bucket $global_bucket --expected-bucket-owner $bucket_account
aws s3api head-bucket --bucket $regional_bucket --expected-bucket-owner $bucket_account

The head-bucket command will return a non-zero result (API returns a 403) if the bucket ownership doesn’t match. If you have error handling set to short-circuit the script, the above statements would stop the script before uploading.

Directly uploading to regional buckets means checking each bucket before uploading.

Lambda Concurrent Executions Limitation on New Accounts

Is your feature request related to a problem? Please describe.
During deployment of the automated stack, on new or lightly used testing accounts, I was receiving an error in deploying the MieStack (WorkflowSchedulerLambda) as the minimum concurrent executions Service Quota set in my account was blocking it. It would be nice to list this as a common issue to smooth out deployments in the future.

In instructions it would be helpful to list that this is automatically increased as the AWS account is used OR a Service Quota increase needs to be requested in the account to a minimum of 1,000 Applied Quota Value in the AWS account.

Kibana instructions need to be updated for new OpenSearch UX

The instructions for enabling Kibana in the README are slightly different now with the new Opensearch service.

Email not received - credentials for accessing the web application

Hello,

I am new to AWS and tried this project, but unfortunately, I did not receive any email with the URL to the website. Please see attached screenshot of the stack status. I followed all the instructions on git for your project, but there was an issue on the stack deployed on my CloudFormation > Stacks page on the Console. Can you please help on how I can figure out why the Website URL was not emailed to me even though I created the EMAIL variable?

Thank you.

transcribe_error : The input file that you specified exceeds the maximum size of 2048.00 Mb. Try again with a smaller file

error with big files on ‘AWS Content Analysis’ is due to the transcription
"'transcribe_error': 'An error occurred (BadRequestException) when calling the StartTranscriptionJob operation: The input file that you specified exceeds the maximum size of 2048.00 Mb. Try again with a smaller file.'}"

Step output
{
"Name": "TranscribeVideo",
"AssetId": "f4377f80-9ebd-40f5-b886-1d6e12d42734",
"WorkflowExecutionId": "b4b6ea77-6523-4dac-9b30-8a9a80598a5f",
"Input": {
"Media": {
"Thumbnail": {
"S3Bucket": "",
"S3Key": ""
},
"Audio": {
"S3Bucket": "",
"S3Key": ""
},
"Video": {
"S3Bucket": "",
"S3Key": ""
},
"ProxyEncode": {
"S3Bucket": "",
"S3Key": "private/assets/"
}
},
"MetaData": {
"MediaconvertJobId": "1673454776422-2v7ebs",
"AssetId": "f4377f80-9ebd-40f5-b886-1d6e12d42734",
"Mediainfo_num_audio_tracks": "1",
"WorkflowExecutionId": "b4b6ea77-6523-4dac-9b30-8a9a80598a5f",
"MediaconvertInputFile": "",
"JobId": "23c48ab517beac038cd8a2f736298a17890dc0ee62bc4d3472d05c2875c2123a",
"PageToken": ""
}
},
"Configuration": {
"MediaType": "Video",
"Enabled": true,
"TranscribeLanguage": "en-US"
},
"Status": "Started",
"MetaData": {},
"Media": {},
"Outputs": {
"Error": "MasExecutionError",
"Cause": "{"errorMessage": "{'Name': 'TranscribeVideo', 'AssetId': 'f4377f80-9ebd-40f5-b886-1d6e12d42734', 'WorkflowExecutionId': 'b4b6ea77-6523-4dac-9b30-8a9a80598a5f', 'Input': {'Media': {'Thumbnail': {'S3Bucket': '', 'S3Key': ''}, 'Audio': {'S3Bucket': '', 'S3Key': '**'}, 'Video': {'S3Bucket': '', 'S3Key': ''}, 'ProxyEncode': {'S3Bucket': '*****', 'S3Key': ''}}, 'MetaData': {'MediaconvertJobId': '1673454776422-2v7ebs', 'AssetId': 'f4377f80-9ebd-40f5-b886-1d6e12d42734', 'Mediainfo_num_audio_tracks': '1', 'WorkflowExecutionId': 'b4b6ea77-6523-4dac-9b30-8a9a80598a5f', 'MediaconvertInputFile': 's3:// **********************', 'JobId': '23c48ab517beac038cd8a2f736298a17890dc0ee62bc4d3472d05c2875c2123a', 'PageToken': '*****vkHEav0/tUqbsuG29Rtaf7o8MveOm3HGnOyD4NLE60zxO2MG+8'}}, 'Configuration': {'MediaType': 'Video', 'Enabled': True, 'TranscribeLanguage': 'en-US'}, 'Status': 'Error', 'MetaData': {'transcribe_error': 'An error occurred (BadRequestException) when calling the StartTranscriptionJob operation: The input file that you specified exceeds the maximum size of 2048.00 Mb. Try again with a smaller file.'}, 'Media': {}}", "errorType": "MasExecutionError", "stackTrace": [" File \"/var/task/start_transcribe.py\", line 91, in lambda_handler\n raise MasExecutionError(operator_object.return_output_object())\n"]}"
}
}

Investigate required steps to get CAS working with MIE CMK

MIE has added a stack level CMK that encrypts all services. CAS needs to be able to work with this key.

Confidence value has no impact on search queries

Describe the bug
For the "Analyze" functionality, the search query in the tabs does not adjust when changing the confidence value. Additionally, the API calls in the modal shown ("Show API request to get these results").

To Reproduce

Deploy the solution
Upload an image
Click "Analyze" in the "Collections" view after the processing of the image has finished
Change the "Confidence" value through the slider and observe that the result set isn't changing. Observe through Developer Tools that the query sent always contain a "Confidence" value of 90.
Click "Show API request to get these results" and observe that the Confidence value remains 90 despite the slider value
Scroll down in the modal to the "awscurl" command and observe that the URL starts with "undefined"

Expected behavior

Expect the results to change when the confidence value is adjusted.
Expect to see an executable awscurl command in the modal but not "undefined"

Please complete the following information about the solution:

[v2.0.0] Version (CFN deployed at 2021-11-18)
[us-east-1 ] Region
[no modifications ]
[ irrelevant for the bug ]
[ no, but in the browser console ]

Screenshots

Additional context
There are two reasons for the bugs:

the Vue components in question are referring to "ELASTICSEARCH_ENDPOINT" but the mixin ingesting the config variables write the endpoint to "SEARCH_ENDPOINT"
the Vue components correctly calls the updateConfidence method but this method doesn't update this.searchQuery to which the fetchAssetData relates. The fetchAssetData also defines an unused variable called query instead.

PR on the way.

CREATE_FAILED - WorkflowSchedulerLambda failed to create

Using the Cloudformation templates provided (for either US East or US West regions) - a CREATE_FAILED error occurs with the status reason below( I replaced some text with 12345) :

"""Embedded stack arn:aws:cloudformation:us-west-2:12345:stack/cas-MieStack-12345 was not successfully created: The following resource(s) failed to create: [WorkflowSchedulerLambda]"""

Any guidance as to how to correct this error?

Analyze shows 0 results

Describe the bug
After uploading a photo, the system is not detecting objects (or anything) in the photos.

To Reproduce
Deploy the latest stack and upload a photo

Expected behavior
An uploaded photo should have something detected in it.

Please complete the following information about the solution:

Version: v2.0.2
Region: [e.g. us-east-1]
Was the solution modified from the version published on this repository?
If the answer to the previous question was yes, are the changes available on GitHub?
Have you checked your service quotas for the sevices this solution uses?
Were there any errors in the CloudWatch Logs?

Screenshots

Additional context
{ "ServiceToken": "arn:aws:lambda:us-east-1:XXXXXXXX:function:sponsorship-1-MieStack-1F9A-WorkflowCustomResource-kLFTnccUKcqZ", "ApiVersion": "3.0.0", "Configuration": { "MediainfoImage": { "MediaType": "Image", "Enabled": true } }, "Version": "v0", "Next": "RekognitionStage", "Definition": "{\"StartAt\": \"ValidationStage\", \"States\": {\"Complete Stage ValidationStage\": {\"Type\": \"Task\", \"Resource\": \"arn:aws:lambda:us-east-1:XXXXXXXX:function:sponsorship-1-MieStack-1F9AC7C-CompleteStageLambda-kvTnsCHAIKLY\", \"End\": true}, \"ValidationStage\": {\"Type\": \"Parallel\", \"Next\": \"Complete Stage ValidationStage\", \"ResultPath\": \"$.Outputs\", \"Branches\": [{\"StartAt\": \"Filter MediainfoImage Media Type? (ValidationStage)\", \"States\": {\"Filter MediainfoImage Media Type? (ValidationStage)\": {\"Type\": \"Task\", \"Parameters\": {\"StageName.$\": \"$.Name\", \"Name\": \"MediainfoImage\", \"Input.$\": \"$.Input\", \"Configuration.$\": \"$.Configuration.MediainfoImage\", \"AssetId.$\": \"$.AssetId\", \"WorkflowExecutionId.$\": \"$.WorkflowExecutionId\", \"Type\": \"Image\", \"Status\": \"$.Status\"}, \"Resource\": \"arn:aws:lambda:us-east-1:XXXXXXXX:function:sponsorship-1-MieStack-1F9AC-FilterOperationLambda-m6czjgSfzGeV\", \"ResultPath\": \"$.Outputs\", \"OutputPath\": \"$.Outputs\", \"Next\": \"Skip MediainfoImage? (ValidationStage)\", \"Retry\": [{\"ErrorEquals\": [\"Lambda.ServiceException\", \"Lambda.AWSLambdaException\", \"Lambda.SdkClientException\", \"Lambda.Unknown\", \"MasExecutionError\"], \"IntervalSeconds\": 2, \"MaxAttempts\": 2, \"BackoffRate\": 2}], \"Catch\": [{\"ErrorEquals\": [\"States.ALL\"], \"Next\": \"MediainfoImage Failed (ValidationStage)\", \"ResultPath\": \"$.Outputs\"}]}, \"Skip MediainfoImage? (ValidationStage)\": {\"Type\": \"Choice\", \"Choices\": [{\"Variable\": \"$.Status\", \"StringEquals\": \"Started\", \"Next\": \"Execute MediainfoImage (ValidationStage)\"}], \"Default\": \"MediainfoImage Not Started (ValidationStage)\"}, \"MediainfoImage Not Started (ValidationStage)\": {\"Type\": \"Succeed\"}, \"Execute MediainfoImage (ValidationStage)\": {\"Type\": \"Task\", \"Resource\": \"arn:aws:lambda:us-east-1:XXXXXXXX:function:sponsorship-1-MieStack-1F9AC7CV61HAR-Ope-Mediainfo-6RXrQy9bqx6z\", \"ResultPath\": \"$.Outputs\", \"OutputPath\": \"$.Outputs\", \"Next\": \"Did MediainfoImage Complete (ValidationStage)\", \"Retry\": [{\"ErrorEquals\": [\"Lambda.ServiceException\", \"Lambda.AWSLambdaException\", \"Lambda.SdkClientException\", \"Lambda.Unknown\", \"MasExecutionError\"], \"IntervalSeconds\": 2, \"MaxAttempts\": 2, \"BackoffRate\": 2}], \"Catch\": [{\"ErrorEquals\": [\"States.ALL\"], \"Next\": \"MediainfoImage Failed (ValidationStage)\", \"ResultPath\": \"$.Outputs\"}]}, \"Did MediainfoImage Complete (ValidationStage)\": {\"Type\": \"Choice\", \"Choices\": [{\"Variable\": \"$.Status\", \"StringEquals\": \"Complete\", \"Next\": \"MediainfoImage Succeeded (ValidationStage)\"}], \"Default\": \"MediainfoImage Failed (ValidationStage)\"}, \"MediainfoImage Failed (ValidationStage)\": {\"Type\": \"Task\", \"End\": true, \"Resource\": \"arn:aws:lambda:us-east-1:XXXXXXXX:function:sponsorship-1-MieStack-1F9AC7-OperatorFailedLambda-uh9T2Nhcsokc\", \"ResultPath\": \"$\", \"Retry\": [{\"ErrorEquals\": [\"Lambda.ServiceException\", \"Lambda.AWSLambdaException\", \"Lambda.SdkClientException\", \"Lambda.Unknown\", \"MasExecutionError\"], \"IntervalSeconds\": 2, \"MaxAttempts\": 2, \"BackoffRate\": 2}]}, \"MediainfoImage Succeeded (ValidationStage)\": {\"Type\": \"Succeed\"}}}], \"Catch\": [{\"ErrorEquals\": [\"States.ALL\"], \"Next\": \"Complete Stage ValidationStage\", \"ResultPath\": \"$.Outputs\"}]}}}", "ResourceType": "STAGE", "Id": "9ecea691-e079-4f8f-98ea-f2f034ac2860", "Operations": [ "MediainfoImage" ], "Name": "ValidationStage", "Created": "1681605982.393308", "Status": "Started", "Metrics": {}, "AssetId": "06960104-e3cb-4ab8-ab33-2144a25c9728", "WorkflowExecutionId": "130d7b21-c8f3-47c7-ab4b-0c1c1fc36e7a", "MetaData": {}, "Input": { "Media": { "Image": { "S3Bucket": "sponsorship-1-miestack-1f9ac7cv61har-dataplane-n2lvkmvmtt1s", "S3Key": "public/upload/PXL_20221231_002325716.MP (1).jpg" } }, "MetaData": {} } }

build.sh fails with ValidationError due to yaml not well-formed in aws-content-analysis-use-existing-mie-stack.yaml

The file cloudformation/aws-content-analysis-use-existing-mie-stack.yaml is not well formed due to merge conflicts in lines 29, 64, 99, 171 and 205. As a consequence, build.sh fails when deploying with existing MIE stack with the following error message:

An error occurred (ValidationError) when calling the CreateStack operation: Template format error: YAML not well-formed. (line 30, column 1)

Failed workflows block other workflows from running

Failed workflows can get stuck in a "Started" state. This causes the concurrent execution queue to fill up with failed workflows that look like they're "Started" but are actually failed. You can see their real status if you look at their state machine in AWS Step Functions.

To clear the concurrent execution queue, open the content-analysisWorkflowExecution table in DynamoDB and remove one or more of the records for assets which are stuck in a Started status, like this:

Support automatically processing a wider range of MXF audio channel configurations

MXF could contain multiple channels that use different configurations for different purposes. In order to automatically process these inputs, MIE should use MediaInfo to analyze the configuration of the packaged media and then adjust the settings for the MediaConvert proxy encode to extract the correct audio for downstream content analysis.

So, here are a few common scenarios:

MIE works:

Channel 1 to 6 are 5.1 channels.  this case you would be ok.

MIE doesn't work:

Channel 1 & 2 are L & R; Channel 3 & 4 are Audio Description tracks
Channel 1 & 2 are English; Channel 3 &4 are Spanish  NG
I also came across file that contains test tone signal track (for timing, I believe)  NG

MediaConvert downmix channels into stereo by default. As you can imagine, you get very mixed up stereo output and that would affect the rest of the workflow like Transcribe and Comprehend.

GUI should implement elasticsearch paging in order to fetch more than 10,000 records

GUI should implement elasticsearch paging in order to fetch more than 10,000 records. Right now it's capped at 10,000, and it's not deterministic which 10,000 records ES will return.

For example, if you upload a long movie, like a 1 hour movie, then you'll see different result sets under label detection every time you refresh the analysis page.

Transcribe operator fails for videos without an audio track

Transcribe operator fails for videos without an audio track. This also causes workflows to fail.

aws-solutions / content-analysis-on-aws Goto Github PK

content-analysis-on-aws's People

Contributors

Stargazers

Watchers

Forkers

content-analysis-on-aws's Issues

Recommend Projects

Recommend Topics

Recommend Org

Jobs