w3c / mediacapture-worker
MediaStream with worker
License: Other
@ChiahungTai mentioned this issue and opened it for discussion here.
This property should not have a default value.
As suggested by @ChiahungTai in #19 (comment), the addVideoMonitor/Processor and removeVideoMonitor/Processor methods should return promises.
Merge ImageBitmap-extension into this specification.
The convention is that each event type comes with its own event handler. I think we should add onvideomonitor:
partial interface DedicatedWorkerGlobalScope {
attribute EventHandler onvideomonitor;
attribute EventHandler onvideoprocess;
};
This is the usual boilerplate specifications tend to use, adapted to this spec:
The following are the event handlers (and their corresponding event handler event types) that must be supported, as event handler IDL attributes, by objects implementing the DedicatedWorkerGlobalScope interface:
Event handler   | Event handler event type
onvideomonitor  | videomonitor
onvideoprocess  | videoprocess
HSV and Lab are usually defined in the floating-point domain, so I will change the data type of HSV and Lab from uint8 to float32.
@rocallahan suggested renaming ImageFormatPixelLayout to ImagePixelLayout because the original name is redundant.
@smaug---- suggested using a WebIDL dictionary for ChannelPixelLayout, so that Image(Format)PixelLayout becomes just sequence<ChannelPixelLayout> (perhaps via a typedef).
Since VideoProcessorEvent inherits from VideoMonitorEvent, VideoProcessorEvent shouldn't redefine the properties of VideoMonitorEvent.
@ChiahungTai pointed out a problem in the sample code: in the VideoProcessor case, VideoProcessorEvent::outputImageBitmap is set in the promise callback of the createImageBitmap() method, which means that by the time it is set, the VideoProcessorEvent has already been returned and the framework/User Agent got a NULL.
Looking at the WebAudio counterpart, the AudioWorker: its onaudioprocess event is restricted to synchronous processing.
That restriction leads to a simpler API and framework design; however, since we are going to enable processing on the main thread (#30), is such a restriction still suitable? Also, if we limit the processing to be synchronous, we need to redesign ImageBitmapFactories::createImageBitmap() into a synchronous flavor.
On the other hand, if we allow asynchronous processing, then we need a mechanism to let the browser know that VideoProcessorEvent::outputImageBitmap is ready for rendering; before that, the browser should not go further.
@rocallahan, @mhofman, @anssiko, may I know your opinions?
As the title says, the ImageBitmapFactory::createImageBitmap(FromBufferSource) method should throw if the given format is not supported by the user agent, just like the mechanism of ImageBitmap::mapDataInto().
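A sketch of the proposed check; the supported-format set and the helper name are made up for illustration and are not normative:

```javascript
// Illustrative sketch: reject the call when the requested ImageFormat is
// not supported by the user agent. The set below is an arbitrary example.
const SUPPORTED_FORMATS = new Set(["RGBA32", "BGRA32", "YUV420P"]);

function checkImageFormatSupported(format) {
  if (!SUPPORTED_FORMATS.has(format)) {
    // In the real API this would surface as a rejected promise from
    // createImageBitmap(), mirroring the mapDataInto() behavior.
    throw new TypeError("Unsupported ImageFormat: " + format);
  }
  return format;
}
```

So `checkImageFormatSupported("YUV420P")` passes through, while an unknown format throws before any decoding work starts.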
Before I can proceed with #23, I would like to introduce constructors for both ChannelPixelLayout and ImageFormatPixelLayout, because it is very likely that the formats of VideoProcessorEvent::inputImageBitmap and VideoProcessorEvent::outputImageBitmap are different. So, if users have to use ImageBitmapFactories::createImageBitmap(FromSourceBuffer), they need a way to create ChannelPixelLayout and ImageFormatPixelLayout objects. Thoughts? @anssiko and @ChiahungTai.
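For illustration only, assuming the dictionary-based shape suggested elsewhere in this spec, constructing layouts could look like the following. All field names here are hypothetical, not taken from the spec:

```javascript
// Hypothetical sketch of constructing ChannelPixelLayout values if the
// types become plain WebIDL dictionaries. Every field name below is an
// assumption for illustration purposes.
function makeChannelPixelLayout(offset, width, height, dataType, stride, skip) {
  return { offset, width, height, dataType, stride, skip };
}

// A 2x2 single-channel (e.g. grayscale) image described as a
// sequence<ChannelPixelLayout>:
const layout = [makeChannelPixelLayout(0, 2, 2, "uint8", 2, 0)];
```

With dictionaries, no constructor IDL is needed at all: users build plain objects like the one above and pass them straight to createImageBitmap(FromSourceBuffer).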
In issue #30, @mhofman proposed decoupling the monitor/processor from Worker, and this issue was discussed at the TPAC2015 mediacapture-worker ad-hoc meeting.
During the meeting, @padenot mentioned that WebAudio is considering adopting Worklet (a.k.a. IsolatedWorker and Processor) to redesign the AudioWorker, and he suggested that video processing also consider it.
To my understanding (correct me if I am wrong), CSS Worklet and WebAudio share the same requirements [1][2][3]:
(1) Lightweight. A WebWorker is too heavy to create and initialize.
(2) Thread-agnostic. The processing script should be able to run on any thread.
(3) Clean/safe API surface. To prevent dangerous API calls, e.g. setInterval().
(4) Hook-based callbacks. Not event-based; the processing script is not run continuously.
For WebAudio, there is also another requirement: (5) all nodes of the same AudioContext should run on the same thread, which prevents implementing the AudioWorker on top of a WebWorker. [3]
In issue #30, our target is (2); the other items, from my personal point of view, do not seem necessary for our scenario. Thoughts?
[1] TPAC2015-WebAudio-meeting-minutes-day-1
[2] WebAudio mailing list - New name for "AudioWorker"
[3] WebAudio/web-audio-api#532
It is not trivial to crop YUV422/YUV420 data if the cropping area starts at an odd x or y coordinate.
For example:
// Given YUV420P data with the following pixel layout.
.-------------------.-------------------.-------------------.-------------------.
| y(0, 0) | y(1, 0) | y(2, 0) | y(3, 0) |
| u(0, 0), v(0, 0) | | u(1, 0), v(1, 0) | |
+-------------------+-------------------+-------------------+-------------------+
| y(0, 1) | y(1, 1) | y(2, 1) | y(3, 1) |
| | | | |
+-------------------+-------------------+-------------------+-------------------+
| y(0, 2) | y(1, 2) | y(2, 2) | y(3, 2) |
| u(0, 1), v(0, 1) | | u(1, 1), v(1, 1) | |
+-------------------+-------------------+-------------------+-------------------+
| y(0, 3) | y(1, 3) | y(2, 3) | y(3, 3) |
| | | | |
+-------------------+-------------------+-------------------+-------------------+
// What if developers call createImageBitmap with the above data
// and also pass a cropping rectangle that starts at an odd x or y coordinate?
var p = createImageBitmap(yuv420PBuffer, ......, 1, 1, 2, 2);
Cropping the u/v-channel starting from an odd x or y coordinate is not trivial; here are two proposals:
(1) Via re-sampling.
We can first up-sample the u/v-channel to the same size as the y-channel, do the cropping, and then down-sample again back to the u/v-channel's cropped size. However, this way the newly created ImageBitmap's data is not the data passed by the caller. Also, there are plenty of re-sampling methods; should we explicitly define one in the spec?
(2) Avoid it.
We can explicitly specify that only cropping rectangles starting at even x and y coordinates are allowed for the YUV422P/YUV420P/NV12/NV21 formats.
I personally prefer the 2nd one. Thoughts? @jrmuizel, @rocallahan, @ChiahungTai and @anssiko.
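Proposal (2) can be sketched as a simple validation helper. This is a hypothetical check, not spec API; the subsampling factors follow the formats named above:

```javascript
// Sketch of proposal (2): reject crop rectangles that are not aligned to
// the chroma subsampling grid. The helper itself is illustrative only.
const SUBSAMPLING = {
  YUV422P: { x: 2, y: 1 }, // chroma halved horizontally
  YUV420P: { x: 2, y: 2 }, // chroma halved in both directions
  NV12:    { x: 2, y: 2 },
  NV21:    { x: 2, y: 2 },
};

function isValidCropRect(format, sx, sy, sw, sh) {
  const sub = SUBSAMPLING[format];
  if (!sub) return true; // non-subsampled formats: any rectangle is fine
  // Origin and size must both land on whole chroma samples.
  return sx % sub.x === 0 && sy % sub.y === 0 &&
         sw % sub.x === 0 && sh % sub.y === 0;
}
```

With this rule, the problematic call from the example above, `createImageBitmap(yuv420PBuffer, ......, 1, 1, 2, 2)`, would be rejected, while an aligned rectangle such as (0, 0, 2, 2) is accepted.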
The current example is not straightforward enough to express the main concept of this spec; we need a clearer one.
@smaug---- suggested that this enumeration should be called ChannelPixelLayoutDataType. The original name DataType is just too generic.
The events need a constructor.
Since both VideoMonitorEvent and VideoProcessorEvent are dispatched to worker only, we should add [Exposed=worker] to them.
@ehsan suggested that we change the VideoProcessorEvent.outputImageBitmap property to VideoProcessorEvent.setOutputImageBitmap(Promise<ImageBitmap> outputIB), which is a clearer name for developers to follow.
So, developers would use this API in the following way:
var processor = new VideoProcessor();
processor.onvideoprocess = (event) => {
  /* do some processing */
  event.setOutputImageBitmap(createImageBitmap( /* from the processed array buffer */ ));
};
We have a Note at the end of section 6 stating that the UA could skip frames if the users' scripts cannot consume frames in real time. However, we should describe this mechanism in more detail.
We should add a processing model section similar to http://padenot.github.io/web-audio-api/#processing-model (currently an early draft by @padenot). Consider at least threads, message passing, event loops, asynchronous operations. I think we share much of the common ground with the Web Audio API processing model, and should reuse and adapt the model and terminology for this spec when more settled.
Example 12 uses the setDataFrom() method, which is not defined in the spec. I believe this should be mapDataInto() instead; correct, @kakukogou?
// write back to event outputImageBitmap
outputBitmap.setDataFrom("RGBA32", rgbaBuffer, 0, rgbaBufferLength,
bitmap.width, bitmap.height, bitmapPixelLayout.channels[0].stride);
}
I'd like us to consider having a self-contained VideoProcessor/Monitor object that's not tied to a Worker object.
The idea is to have a processing model similar to OffscreenCanvas, where you have an object you can use either on the main thread or in a worker context.
Events would be dispatched on this VideoProcessor/Monitor object instead of extending the Worker interface. The VideoProcessor/Monitor would be Transferable to Workers using postMessage.
Those objects would be created using a createVideoProcessor/Monitor method on MediaStreamTracks. You could release them using a close method.
We'd need to figure out a way to get the output MediaStreamTrack of a VideoProcessor, but that's a minor detail.
@rocallahan suggested that ImageBitmap::FindOptimalFormat() should not return an empty string. Conversion should always be possible between any two formats, except to/from DEPTH, and we could throw while converting to/from the DEPTH format.
The use of contiguous IDL will improve the readability of the spec and make spec editing easier.
@mhofman mentioned this use case at TPAC and, indeed, we are not able to cover this use case now.
On second thought, I think this is a plausible use case for offline processing, but I am not sure it is needed for real-time processing.
Scenario:
Users shoot a 120 FPS slow-motion video and would like to re-encode it into a normal 30 FPS video.
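As a rough illustration of the arithmetic involved (names and approach are mine; the spec currently has no offline API for this), converting 120 FPS to 30 FPS amounts to keeping every 4th frame:

```javascript
// Illustrative sketch of the offline frame-rate conversion use case:
// pick every Nth frame, where N is the ratio of input to output rates.
// The function name and shape are hypothetical, not from the spec.
function selectFrames(frameTimestamps, inFps, outFps) {
  const step = Math.round(inFps / outFps); // 120 / 30 = 4
  return frameTimestamps.filter((_, i) => i % step === 0);
}
```

Note that such a pipeline cannot be driven by a real-time MediaStream: the whole point of the use case is to consume the decoded frames faster than real time, which is what motivates the offline-processing discussion below.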
Following the discussion at Mozilla Bugzilla bug 1141979 comment 278, @jrmuizel suggests dropping the cropping variant of the ImageBitmapFactory extensions:
[NoInterfaceObject, Exposed=(Window,Worker)]
partial interface ImageBitmapFactories {
// Keep this.
Promise<ImageBitmap> createImageBitmap(BufferSource buffer,
long offset,
long length,
ImageFormat format,
ImageFormatPixelLayout layout);
// Drop this.
Promise<ImageBitmap> createImageBitmap(BufferSource buffer,
long offset,
long length,
ImageFormat format,
ImageFormatPixelLayout layout,
long sx,
long sy,
long sw,
long sh);
};
Previously, the reason I chose to keep the cropping variant was to keep the same behavior as the existing API. However, after implementation and review, we have already filed two issues, #43 and #46, to add exceptions to it. So, the extension API now behaves differently from the existing API.
Personally, I agree with dropping the cropping variant for simplification; however, I would still like to hear more thoughts on it. @rocallahan, @anssiko, @ChiahungTai and @smaug----.
We should not process data from MediaStreams that come from a different origin.
This issue was mentioned in #30 and also discussed in TPAC2015.
Before that, we did have some preliminary ideas about this issue, which were to extend MediaStream with an offline property; please refer to OfflineMediaContext.
@padenot talked about WHATWG Streams, which are naturally offline. However, it seems they have no strong links to MediaStream and the other media APIs at the moment.
WHATWG Streams has been shipped partially in Chrome [1] and is under implementation in Gecko [2].
@mhofman suggested separating the real-time and offline interfaces. For example, in WebAudio, the offline-related interfaces use OfflineAudioContext and AudioBuffer and do not connect to any real-time-specific objects such as HTMLMediaElement and MediaStream.
I think something like a VideoBuffer is not possible due to memory-usage issues. One possibility is that, for the input part, we connect the video source URL directly to the Monitor/Processor without going through HTMLVideoElement and MediaStream, so that we can decode and consume the video frames in a non-real-time way. For the output part, @mhofman talked about something like a VideoGenerator, or utilizing the MediaRecorder API. I think #31 is also related to the output discussion.
I have summarized everything I collected from TPAC here and look forward to your thoughts!
[1] Intent to Ship: readable streams in Fetch API
[2] Bug 1128959 - Implement the WHATWG Streams spec
In issue #30, I mentioned we might be able to leverage the Canvas::captureMedia feature to generate a MediaStreamTrack after processing is done in an OffscreenCanvas fed by the monitoring part of this spec.
The issue of keeping A/V sync was raised, since the captureMedia API doesn't allow the input of frame timing.
I think it would be great if we could reconcile this feature gap and somehow avoid duplicated features for generating video tracks.
This is a meta issue to keep us aware that we should attempt to publish a First Public Working Draft (FPWD) of the spec in the near future, say by the end of June. FPWD is the first step on the formal Recommendation Track and has the following requirements from the W3C Process point of view:
I'll volunteer to fix the Pubrules Checker errors to make the document valid for publication, see:
The WHATWG spec allows users to pass a cropping area while creating an ImageBitmap.
interface ImageBitmapFactories {
Promise<ImageBitmap> createImageBitmap(ImageBitmapSource image);
Promise<ImageBitmap> createImageBitmap(ImageBitmapSource image, long sx, long sy, long sw, long sh);
};
And those pixels that are outside of the source image are filled as transparent black. From the spec:
If either sw or sh are negative, then the top-left corner of this rectangle will be to the left or above the (sx, sy) point. If any of the pixels on this rectangle are outside the area where the input bitmap was placed, then they will be transparent black in output.
This is reasonable since an ImageBitmap is meant to be drawn onto a canvas. However, we are now extending ImageBitmap to be accessible in several kinds of formats (via the mapDataInto method), and some formats do not support an alpha channel (for example, YUV444P), so we are not able to return "transparent black" in those formats.
@ChiahungTai proposes that we should throw if users call mapDataInto on an ImageBitmap that was created with a cropping area outside the original source image. With this error, users should then create another ImageBitmap with a cropping area that is completely inside the source image.
@rocallahan, @anssiko and @smaug----, may I have your comments on this bug?
@jrmuizel mentioned that the current design of the mapDataInto() method might lead to data-racing code such as:
var bitmap = createImageBitmap(......);
var format = bitmap.findOptimalFormat();
var length = bitmap.mappedDataLength(format);
var buffer = new ArrayBuffer(length);
var p = bitmap.mapDataInto(format, buffer, 0, length); <----.
                                                            |
var view = new Int32Array(buffer);                          | Data race.
view[0] = 100; <--------------------------------------------'
@jrmuizel proposes changing the API to use SharedArrayBuffer, so that the API itself explicitly says a data race can happen.
Note that mapDataInto() is designed to cooperate with asm.js applications, so that the API can fill data into the asm.js run-time heap. So, if we change to SharedArrayBuffer, the asm.js application should also be compiled so that its run-time buffer is a SharedArrayBuffer too.
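A minimal sketch of the proposed direction, assuming only that the destination buffer type changes; mapDataInto() itself is not modeled here:

```javascript
// Sketch of the proposal: back the mapped pixel data with a
// SharedArrayBuffer instead of an ArrayBuffer, so the API surface itself
// signals that concurrent access (and therefore races) can happen.
const length = 16; // would come from bitmap.mappedDataLength(format)
const buffer = new SharedArrayBuffer(length);
const view = new Int32Array(buffer);
view[0] = 100; // with SharedArrayBuffer, racy writes are explicit
```

The caller-visible change is small, but the semantics are clearer: writes to the view while the (hypothetical) mapping is still in flight are plainly shared-memory accesses rather than an undocumented race.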
Personally, I think this is a good suggestion.
@rocallahan, @anssiko and @smaug----, may I have your comments on this issue?