slifty / tvkitchen-newsmax-implementation
An example implementation of a TVKitchen Countertop to scrape newsmax captions
License: GNU Affero General Public License v3.0
As mentioned in #5, there is a desire to have the NewsMax captions stored in 30-minute chunks and uploaded to S3.
There is an example of the file segmentation in the Center for Cooperative Media TVK implementation.
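One way to picture the 30-minute segmentation is to round each caption timestamp down to the start of its half-hour window and derive an object key from that. The sketch below is only an illustration: the function names, the `newsmax` prefix, and the `.txt` extension are assumptions, and it is not taken from the Center for Cooperative Media implementation.

```javascript
const SEGMENT_MS = 30 * 60 * 1000;

// Round a timestamp down to the start of its 30-minute window (UTC).
function segmentStart(date) {
  return new Date(Math.floor(date.getTime() / SEGMENT_MS) * SEGMENT_MS);
}

// Hypothetical key layout: <prefix>/<YYYY-MM-DDTHH-mm>.txt
function segmentKey(date, prefix = 'newsmax') {
  const stamp = segmentStart(date).toISOString().slice(0, 16).replace(':', '-');
  return `${prefix}/${stamp}.txt`;
}
```

Every caption whose timestamp falls in the same half hour maps to the same key, so a writer could append to one buffer per key and upload it once the window closes.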
On the off chance the HLS URL doesn't work, our TVK instance shouldn't crash.
TV Kitchen should in general get better at handling appliance errors, but in the meantime let's just make our appliance a bit more graceful.
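A minimal sketch of what "more graceful" could look like, assuming the failure surfaces as a rejected promise: wrap the start-up call so the error is logged and reported rather than thrown. `startSafely` and its arguments are hypothetical names and do not reflect the TV Kitchen appliance API.

```javascript
// Hypothetical helper: run an async start function and report failure
// instead of letting the rejection take down the process.
async function startSafely(start, log = console.error) {
  try {
    await start();
    return true;
  } catch (err) {
    log(`Ingestion failed to start: ${err.message}`);
    return false;
  }
}
```

This only covers errors that reject the promise. Errors emitted later on a stream would still need their own `error` listener.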
Facing this issue when invoking an AWS command:
/usr/src/app/node_modules/aws-sdk/lib/services/s3.js:711
tvkitchen-newsmax-implementation-newsmax-1 | resp.error = AWS.util.error(new Error(), {
tvkitchen-newsmax-implementation-newsmax-1 | ^
tvkitchen-newsmax-implementation-newsmax-1 |
tvkitchen-newsmax-implementation-newsmax-1 | RequestTimeTooSkewed: The difference between the request time and the current time is too large.
tvkitchen-newsmax-implementation-newsmax-1 | at Request.extractError (/usr/src/app/node_modules/aws-sdk/lib/services/s3.js:711:35)
tvkitchen-newsmax-implementation-newsmax-1 | at Request.callListeners (/usr/src/app/node_modules/aws-sdk/lib/sequential_executor.js:106:20)
tvkitchen-newsmax-implementation-newsmax-1 | at Request.emit (/usr/src/app/node_modules/aws-sdk/lib/sequential_executor.js:78:10)
tvkitchen-newsmax-implementation-newsmax-1 | at Request.emit (/usr/src/app/node_modules/aws-sdk/lib/request.js:686:14)
tvkitchen-newsmax-implementation-newsmax-1 | at Request.transition (/usr/src/app/node_modules/aws-sdk/lib/request.js:22:10)
tvkitchen-newsmax-implementation-newsmax-1 | at AcceptorStateMachine.runTo (/usr/src/app/node_modules/aws-sdk/lib/state_machine.js:14:12)
tvkitchen-newsmax-implementation-newsmax-1 | at /usr/src/app/node_modules/aws-sdk/lib/state_machine.js:26:10
tvkitchen-newsmax-implementation-newsmax-1 | at Request.<anonymous> (/usr/src/app/node_modules/aws-sdk/lib/request.js:38:9)
tvkitchen-newsmax-implementation-newsmax-1 | at Request.<anonymous> (/usr/src/app/node_modules/aws-sdk/lib/request.js:688:12)
tvkitchen-newsmax-implementation-newsmax-1 | at Request.callListeners (/usr/src/app/node_modules/aws-sdk/lib/sequential_executor.js:116:18) {
tvkitchen-newsmax-implementation-newsmax-1 | code: 'RequestTimeTooSkewed',
tvkitchen-newsmax-implementation-newsmax-1 | region: null,
tvkitchen-newsmax-implementation-newsmax-1 | time: 2022-08-10T02:00:52.544Z,
tvkitchen-newsmax-implementation-newsmax-1 | requestId: 'X41BC24VNWRT3901',
tvkitchen-newsmax-implementation-newsmax-1 | extendedRequestId: 'vswT/eism/4BKPLLEl0oOXYXuFEt4OCCOcHVn9HOrH+cDgFr5rM+ppcSqhsstclIndYZZ87vb9c=',
tvkitchen-newsmax-implementation-newsmax-1 | cfId: undefined,
tvkitchen-newsmax-implementation-newsmax-1 | statusCode: 403,
tvkitchen-newsmax-implementation-newsmax-1 | retryable: false,
tvkitchen-newsmax-implementation-newsmax-1 | retryDelay: 11.418787125254038
tvkitchen-newsmax-implementation-newsmax-1 | }
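S3 returns `RequestTimeTooSkewed` when the request timestamp differs from its own clock by more than 15 minutes, so the usual suspect is a drifted clock on the host or inside the container. The sketch below is a rough diagnostic, assuming a `Date` header taken from any recent HTTP response is available; the function names are made up for illustration.

```javascript
// S3 rejects requests more than 15 minutes off from its own clock.
const MAX_SKEW_MS = 15 * 60 * 1000;

// Signed difference (local minus server) in milliseconds.
function clockSkewMs(localDate, serverDateHeader) {
  const serverMs = Date.parse(serverDateHeader);
  if (Number.isNaN(serverMs)) {
    throw new Error(`Unparseable Date header: ${serverDateHeader}`);
  }
  return localDate.getTime() - serverMs;
}

function isSkewTooLarge(localDate, serverDateHeader) {
  return Math.abs(clockSkewMs(localDate, serverDateHeader)) > MAX_SKEW_MS;
}
```

If the clock itself cannot be fixed, version 2 of the AWS SDK for JavaScript also accepts a `correctClockSkew: true` option, which may be worth trying here.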
The TextReducerAppliance uses a pretty naive algorithm for splitting TEXT.CUE objects into atomic parts. It only looks at the last line of a cue, and it compares the last line of the current cue with the last line of the previous cue.
This is a problem if:
This is a tricky problem space, but once solved it will be a really useful tool for other TV Kitchen implementations as well.
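To make the last-line comparison concrete, here is a rough sketch of that kind of approach. It is an interpretation of the description above, not the actual TextReducerAppliance code, and `lastLine` and `newTextFromCue` are hypothetical names.

```javascript
// Last non-empty line of a cue's text, or '' if there is none.
function lastLine(text) {
  const lines = text.split('\n').filter((line) => line.trim() !== '');
  return lines.length > 0 ? lines[lines.length - 1] : '';
}

// Emit only what the current cue adds relative to the previous one,
// judging solely by the last line of each cue.
function newTextFromCue(previousCue, currentCue) {
  const prev = lastLine(previousCue);
  const curr = lastLine(currentCue);
  if (prev !== '' && curr.startsWith(prev)) {
    return curr.slice(prev.length);
  }
  return curr;
}
```

In this sketch, a cue that grows from `GOOD EVE` to `GOOD EVENING` yields `NING`, but a cue that adds two lines at once loses the first of them, since only the final line is ever inspected.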
The ultimate goal is to be able to run docker start and have this utility start spitting out caption files.
Our discussion last month yielded the following spec:
I believe the target destination is AWS in this case, as opposed to a local filesystem.
The AWS uploader appliance right now requires config to be passed to it (or rather, it defaults to an empty string).
AWS provides a few ways to supply configuration, and we shouldn't be so opinionated. For instance, there are some cases that require more to be set than what the appliance supports.
I'd like to modify the appliance to make the AWS auth config values optional, and then in this case not actually pass that config to the appliance so AWS can just use environment variables.
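A sketch of the optional-config idea, assuming the appliance builds an options object for the S3 client: drop any setting that was not provided, so the AWS SDK can fall back to its default credential chain (for example the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables). `buildS3Options` is a hypothetical helper, not part of the uploader appliance.

```javascript
// Hypothetical helper: keep only the settings that were actually provided,
// so empty strings never override the SDK's own credential lookup.
function buildS3Options(settings = {}) {
  const options = {};
  for (const [key, value] of Object.entries(settings)) {
    if (value !== undefined && value !== null && value !== '') {
      options[key] = value;
    }
  }
  return options;
}

// Intended use (not executed here): new AWS.S3(buildS3Options(settings))
```

With this shape, passing nothing at all yields an empty options object, which leaves every choice to the SDK.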
We want the following to be built into git workflows:
(We don't have tests, RIP)