w3c / mediacapture-worker
MediaStream with worker
License: Other
@ChiahungTai mentioned this issue and opened it for discussion here.
This property should not have a default value.
As suggested by @ChiahungTai in #19 (comment), the addVideoMonitor/Processor and removeVideoMonitor/Processor methods should return promises.
Merge ImageBitmap-extension into this specification.
The convention is that each event type comes with its own event handler. I think we should add onvideomonitor:
partial interface DedicatedWorkerGlobalScope {
attribute EventHandler onvideomonitor;
attribute EventHandler onvideoprocess;
};
This is the usual boilerplate specifications tend to use, adapted to this spec:
The following are the event handlers (and their corresponding event handler event types) that must be supported, as event handler IDL attributes, by objects implementing the DedicatedWorkerGlobalScope interface:
Event handler   | Event handler event type
onvideomonitor  | videomonitor
onvideoprocess  | videoprocess
HSV and Lab are usually defined in the floating-point domain, so I will change the data type of HSV and Lab from uint8 to float32.
@rocallahan suggested renaming ImageFormatPixelLayout to ImagePixelLayout because the original name is redundant.
@smaug---- suggested using a WebIDL dictionary for ChannelPixelLayout, so that Image(Format)PixelLayout becomes just sequence<ChannelPixelLayout> (perhaps via a typedef).
Since VideoProcessorEvent inherits from VideoMonitorEvent, VideoProcessorEvent shouldn't redefine the properties of VideoMonitorEvent.
@ChiahungTai pointed out a problem in the sample code: in the VideoProcessor case, VideoProcessorEvent::outputImageBitmap is set in the promise callback of the createImageBitmap() method, which means that by the time it is set, the VideoProcessorEvent has already been returned and the framework/User Agent got a NULL.
Looking at the WebAudio counterpart, the AudioWorker: its onaudioprocess event is restricted to synchronous processing.
That restriction leads to a simpler API and framework design; however, since we are going to enable processing on the main thread (#30), is such a restriction still suitable? Also, if we limit the processing to be synchronous, we need to redesign ImageBitmapFactories::createImageBitmap() into a synchronous flavor.
On the other hand, if we allow asynchronous processing, then we need a mechanism to let the browser know that VideoProcessorEvent::outputImageBitmap is ready for rendering; before that, the browser should not go further.
@rocallahan, @mhofman, @anssiko, may I know your opinions?
As the title says, the ImageBitmapFactory::createImageBitmap(FromBufferSource) method should throw if the given format is not supported by the user agent, just like the mechanism of ImageBitmap::mapDataInto().
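A sketch of the proposed check; the supported-format set and the helper name are made up for illustration and are not normative:

```javascript
// Illustrative sketch: reject the call when the requested ImageFormat is
// not supported by the user agent. The set below is an arbitrary example.
const SUPPORTED_FORMATS = new Set(["RGBA32", "BGRA32", "YUV420P"]);

function checkImageFormatSupported(format) {
  if (!SUPPORTED_FORMATS.has(format)) {
    // In the real API this would surface as a rejected promise from
    // createImageBitmap(), mirroring the mapDataInto() behavior.
    throw new TypeError("Unsupported ImageFormat: " + format);
  }
  return format;
}
```

So `checkImageFormatSupported("YUV420P")` passes through, while an unknown format throws before any decoding work starts.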
Before I can proceed with #23, I would like to introduce constructors for both ChannelPixelLayout and ImageFormatPixelLayout, because it is very likely that the formats of VideoProcessorEvent::inputImageBitmap and VideoProcessorEvent::outputImageBitmap are different. So, if users have to use ImageBitmapFactories::createImageBitmap(FromSourceBuffer), they need a way to create ChannelPixelLayout and ImageFormatPixelLayout objects. Thoughts? @anssiko and @ChiahungTai.
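For illustration only, assuming the dictionary-based shape suggested elsewhere in this spec, constructing layouts could look like the following. All field names here are hypothetical, not taken from the spec:

```javascript
// Hypothetical sketch of constructing ChannelPixelLayout values if the
// types become plain WebIDL dictionaries. Every field name below is an
// assumption for illustration purposes.
function makeChannelPixelLayout(offset, width, height, dataType, stride, skip) {
  return { offset, width, height, dataType, stride, skip };
}

// A 2x2 single-channel (e.g. grayscale) image described as a
// sequence<ChannelPixelLayout>:
const layout = [makeChannelPixelLayout(0, 2, 2, "uint8", 2, 0)];
```

With dictionaries, no constructor IDL is needed at all: users build plain objects like the one above and pass them straight to createImageBitmap(FromSourceBuffer).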
In issue #30, @mhofman proposed decoupling the monitor/processor from Worker, and this issue was discussed at the TPAC2015 mediacapture-worker ad-hoc meeting.
During the meeting, @padenot mentioned that WebAudio is considering adopting Worklet (a.k.a. IsolatedWorker and Processor) to redesign the AudioWorker, and he suggested that video processing also consider it.
To my understanding (correct me if I am wrong), CSS Worklet and WebAudio share the same requirements [1][2][3]:
(1) Lightweight. A WebWorker is too heavy to create and initialize.
(2) Thread-agnostic. The processing script should be able to run on any thread.
(3) Clean/safe API surface. To prevent dangerous API calls, e.g. setInterval().
(4) Hook-based callbacks. Not event-based; the processing script is not run continuously.
For WebAudio, there is also another requirement: (5) all nodes of the same AudioContext should run on the same thread, which prevents implementing the AudioWorker on top of a WebWorker. [3]
In issue #30, our target is (2); the other items, from my personal point of view, do not seem necessary for our scenario. Thoughts?
[1] TPAC2015-WebAudio-meeting-minutes-day-1
[2] WebAudio mailing list - New name for "AudioWorker"
[3] WebAudio/web-audio-api#532
It is not trivial to crop YUV422/YUV420 data if the cropping area starts at an odd x or y coordinate.
For example:
// Given YUV420P data with the following pixel layout.
.-------------------.-------------------.-------------------.-------------------.
| y(0, 0) | y(1, 0) | y(2, 0) | y(3, 0) |
| u(0, 0), v(0, 0) | | u(1, 0), v(1, 0) | |
+-------------------+-------------------+-------------------+-------------------+
| y(0, 1) | y(1, 1) | y(2, 1) | y(3, 1) |
| | | | |
+-------------------+-------------------+-------------------+-------------------+
| y(0, 2) | y(1, 2) | y(2, 2) | y(3, 2) |
| u(0, 1), v(0, 1) | | u(1, 1), v(1, 1) | |
+-------------------+-------------------+-------------------+-------------------+
| y(0, 3) | y(1, 3) | y(2, 3) | y(3, 3) |
| | | | |
+-------------------+-------------------+-------------------+-------------------+
// What if developers call createImageBitmap with the above data
// and also pass a cropping rectangle that starts at an odd x or y coordinate?
var p = createImageBitmap(yuv420PBuffer, ......, 1, 1, 2, 2);
Cropping the u/v-channel starting from an odd x or y coordinate is not trivial; here are two proposals:
(1) Via re-sampling.
We can first up-sample the u/v-channel to the same size as the y-channel, do the cropping, and then down-sample again back to the u/v-channel's cropped size. However, this way the newly created ImageBitmap's data is not the data passed by the caller. Also, there are plenty of re-sampling methods; should we explicitly define one in the spec?
(2) Avoid it.
We can explicitly specify that only cropping rectangles starting at even x and y coordinates are allowed for the YUV422P/YUV420P/NV12/NV21 formats.
I personally prefer the 2nd one. Thoughts? @jrmuizel, @rocallahan, @ChiahungTai and @anssiko.
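Proposal (2) can be sketched as a simple validation helper. This is a hypothetical check, not spec API; the subsampling factors follow the formats named above:

```javascript
// Sketch of proposal (2): reject crop rectangles that are not aligned to
// the chroma subsampling grid. The helper itself is illustrative only.
const SUBSAMPLING = {
  YUV422P: { x: 2, y: 1 }, // chroma halved horizontally
  YUV420P: { x: 2, y: 2 }, // chroma halved in both directions
  NV12:    { x: 2, y: 2 },
  NV21:    { x: 2, y: 2 },
};

function isValidCropRect(format, sx, sy, sw, sh) {
  const sub = SUBSAMPLING[format];
  if (!sub) return true; // non-subsampled formats: any rectangle is fine
  // Origin and size must both land on whole chroma samples.
  return sx % sub.x === 0 && sy % sub.y === 0 &&
         sw % sub.x === 0 && sh % sub.y === 0;
}
```

With this rule, the problematic call from the example above, `createImageBitmap(yuv420PBuffer, ......, 1, 1, 2, 2)`, would be rejected, while an aligned rectangle such as (0, 0, 2, 2) is accepted.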
The current example is not straightforward enough to express the main concept of this spec; we need a clearer one.
@smaug---- suggested that this enumeration should be called ChannelPixelLayoutDataType. The original name DataType is just too generic.
The events need a constructor.
Since both VideoMonitorEvent and VideoProcessorEvent are dispatched to worker only, we should add [Exposed=worker] to them.
@ehsan suggested that we change the VideoProcessorEvent.outputImageBitmap property to VideoProcessorEvent.setOutputImageBitmap(Promise<ImageBitmap> outputIB), which is a clearer name for developers to follow.
So, developers would use this API in the following way:
var processor = new VideoProcessor();
processor.onvideoprocess = (event) => {
  /* do some processing */
  event.setOutputImageBitmap(createImageBitmap( /* from the processed array buffer */ ));
};
We have a Note at the end of section 6 stating that the UA could skip frames if the users' scripts cannot consume frames in real time. However, we should describe this mechanism in more detail.
We should add a processing model section similar to http://padenot.github.io/web-audio-api/#processing-model (currently an early draft by @padenot). Consider at least threads, message passing, event loops, asynchronous operations. I think we share much of the common ground with the Web Audio API processing model, and should reuse and adapt the model and terminology for this spec when more settled.
Example 12 uses the setDataFrom() method, which is not defined in the spec. I believe this should be mapDataInto() instead; correct, @kakukogou?
// write back to event outputImageBitmap
outputBitmap.setDataFrom("RGBA32", rgbaBuffer, 0, rgbaBufferLength,
bitmap.width, bitmap.height, bitmapPixelLayout.channels[0].stride);
}
I'd like us to consider having a self-contained VideoProcessor/Monitor object that's not tied to a Worker object.
The idea is to have a processing model similar to OffscreenCanvas, where you have an object you can use either on the main thread or in a worker context.
Events would be dispatched on this VideoProcessor/Monitor object instead of extending the Worker interface. The VideoProcessor/Monitor would be Transferable to Workers using postMessage.
Those objects would be created using a createVideoProcessor/Monitor method on MediaStreamTracks. You could release them using a close method.
We'd need to figure out a way to get the output MediaStreamTrack of a VideoProcessor, but that's a minor detail.
@rocallahan suggested that ImageBitmap::FindOptimalFormat() should not return an empty string. Conversion should always be possible between any two formats, except to/from DEPTH, and we could throw while converting to/from the DEPTH format.
The use of contiguous IDL will improve the readability of the spec and make spec editing easier.
@mhofman mentioned this use case at TPAC and, indeed, we are not able to cover this use case now.
On second thought, I think this is a plausible use case for offline processing, but I am not sure it is needed for real-time processing.
Scenario:
Users shoot a 120 FPS slow-motion video and would like to re-encode it into a normal 30 FPS video.
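As a rough illustration of the arithmetic involved (names and approach are mine; the spec currently has no offline API for this), converting 120 FPS to 30 FPS amounts to keeping every 4th frame:

```javascript
// Illustrative sketch of the offline frame-rate conversion use case:
// pick every Nth frame, where N is the ratio of input to output rates.
// The function name and shape are hypothetical, not from the spec.
function selectFrames(frameTimestamps, inFps, outFps) {
  const step = Math.round(inFps / outFps); // 120 / 30 = 4
  return frameTimestamps.filter((_, i) => i % step === 0);
}
```

Note that such a pipeline cannot be driven by a real-time MediaStream: the whole point of the use case is to consume the decoded frames faster than real time, which is what motivates the offline-processing discussion below.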
Following the discussion at Mozilla Bugzilla bug 1141979 comment 278, @jrmuizel suggests dropping the cropping variant of the ImageBitmapFactory extensions:
[NoInterfaceObject, Exposed=(Window,Worker)]
partial interface ImageBitmapFactories {
// Keep this.
Promise<ImageBitmap> createImageBitmap(BufferSource buffer,
long offset,
long length,
ImageFormat format,
ImageFormatPixelLayout layout);
// Drop this.
Promise<ImageBitmap> createImageBitmap(BufferSource buffer,
long offset,
long length,
ImageFormat format,
ImageFormatPixelLayout layout,
long sx,
long sy,
long sw,
long sh);
};
Previously, the reason I chose to keep the cropping variant was to keep the same behavior as the existing API. However, after implementation and review, we have already filed two issues, #43 and #46, to add exceptions to it. So, the extension API now behaves differently from the existing API.
Personally, I agree with dropping the cropping variant for simplification; however, I would still like to hear more thoughts on it. @rocallahan, @anssiko, @ChiahungTai and @smaug----.
We should not process data from MediaStreams that come from a different origin.
This issue was mentioned in #30 and also discussed in TPAC2015.
Before that, we did have some preliminary ideas about this issue, which were to extend MediaStream with an offline property; please refer to OfflineMediaContext.
@padenot talked about WHATWG Streams, which are naturally offline. However, it seems they have no strong links to MediaStream and the other media APIs at the moment.
WHATWG Streams has been shipped partially in Chrome [1] and is under implementation in Gecko [2].
@mhofman suggested separating the real-time and offline interfaces. For example, in WebAudio, the offline-related interfaces use OfflineAudioContext and AudioBuffer and do not connect to any real-time-specific objects such as HTMLMediaElement and MediaStream.
I think something like a VideoBuffer is not possible due to memory-usage issues. One possibility is that, for the input part, we connect the video source URL directly to the Monitor/Processor without going through HTMLVideoElement and MediaStream, so that we can decode and consume the video frames in a non-real-time way. For the output part, @mhofman talked about something like a VideoGenerator, or utilizing the MediaRecorder API. I think #31 is also related to the output discussion.
I have summarized everything I collected from TPAC here and look forward to your thoughts!
[1] Intent to Ship: readable streams in Fetch API
[2] Bug 1128959 - Implement the WHATWG Streams spec
In issue #30, I mentioned we might be able to leverage the Canvas::captureMedia feature to generate a MediaStreamTrack after processing is done in an OffscreenCanvas fed by the monitoring part of this spec.
The issue of keeping A/V sync was raised, since the captureMedia API doesn't allow the input of frame timing.
I think it would be great if we could reconcile this feature gap and somehow avoid duplicated features for generating video tracks.
This is a meta issue to keep us aware that we should attempt to publish a First Public Working Draft (FPWD) of the spec in the near future, say by the end of June. FPWD is the first step on the formal Recommendation Track and has the following requirements from the W3C Process point of view:
I'll volunteer to fix the Pubrules Checker errors to make the document valid for publication, see:
The WHATWG spec allows users to pass a cropping area while creating an ImageBitmap.
interface ImageBitmapFactories {
Promise<ImageBitmap> createImageBitmap(ImageBitmapSource image);
Promise<ImageBitmap> createImageBitmap(ImageBitmapSource image, long sx, long sy, long sw, long sh);
};
And those pixels that are outside of the source image are filled as transparent black. From the spec:
If either sw or sh are negative, then the top-left corner of this rectangle will be to the left or above the (sx, sy) point. If any of the pixels on this rectangle are outside the area where the input bitmap was placed, then they will be transparent black in output.
This is reasonable since an ImageBitmap is meant to be drawn onto a canvas. However, we are now extending ImageBitmap to be accessible in several kinds of formats (via the mapDataInto method), and some formats do not support an alpha channel (for example, YUV444P), so we are not able to return "transparent black" in those formats.
@ChiahungTai proposes that we should throw if users call mapDataInto on an ImageBitmap that was created with a cropping area outside the original source image. With this error, users should then create another ImageBitmap with a cropping area that is completely inside the source image.
@rocallahan, @anssiko and @smaug----, may I have your comments on this bug?
@jrmuizel mentioned that the current design of the mapDataInto() method might lead to data-racing code such as:
var bitmap = createImageBitmap(......);
var format = bitmap.findOptimalFormat();
var length = bitmap.mappedDataLength(format);
var buffer = new ArrayBuffer(length);
var p = bitmap.mapDataInto(format, buffer, 0, length); <----.
                                                            |
var view = new Int32Array(buffer);                          | Data race.
view[0] = 100; <--------------------------------------------'
@jrmuizel proposes changing the API to use SharedArrayBuffer, so that the API itself explicitly says a data race can happen.
Note that mapDataInto() is designed to cooperate with asm.js applications, so that the API can fill data into the asm.js run-time heap. So, if we change to SharedArrayBuffer, the asm.js application should also be compiled so that its run-time buffer is a SharedArrayBuffer too.
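A minimal sketch of the proposed direction, assuming only that the destination buffer type changes; mapDataInto() itself is not modeled here:

```javascript
// Sketch of the proposal: back the mapped pixel data with a
// SharedArrayBuffer instead of an ArrayBuffer, so the API surface itself
// signals that concurrent access (and therefore races) can happen.
const length = 16; // would come from bitmap.mappedDataLength(format)
const buffer = new SharedArrayBuffer(length);
const view = new Int32Array(buffer);
view[0] = 100; // with SharedArrayBuffer, racy writes are explicit
```

The caller-visible change is small, but the semantics are clearer: writes to the view while the (hypothetical) mapping is still in flight are plainly shared-memory accesses rather than an undocumented race.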
Personally, I think this is a good suggestion.
@rocallahan, @anssiko and @smaug----, may I have your comments on this issue?