mediacapture-worker's Issues

Should there be an onvideomonitor event handler too?

The convention is that each event type comes with its own event handler. I think we should add onvideomonitor:

partial interface DedicatedWorkerGlobalScope {
                attribute EventHandler onvideomonitor;
                attribute EventHandler onvideoprocess;
};

This is the usual boilerplate specifications tend to use, adapted to this spec:

The following are the event handlers (and their corresponding event handler event types) that must be supported, as event handler IDL attributes, by objects implementing the DedicatedWorkerGlobalScope interface:

Event handler   | Event handler event type
onvideomonitor  | videomonitor
onvideoprocess  | videoprocess

Support asynchronous processing in the VideoProcessor case?

@ChiahungTai pointed out a problem in the sample code: in the VideoProcessor case, VideoProcessorEvent::outputImageBitmap is set in the promise callback of the createImageBitmap() method, which means that by the time VideoProcessorEvent::outputImageBitmap is actually set, the VideoProcessorEvent has already been returned and the framework/User Agent got a null.
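
A minimal sketch of the problematic pattern, assuming the handler and attribute names from the current draft (the buffer, format, and layout arguments are illustrative):

self.onvideoprocess = function (event) {
  // createImageBitmap() returns a promise; its callback runs only after
  // this handler has returned.
  createImageBitmap(processedBuffer, 0, processedBuffer.byteLength,
                    "RGBA32", pixelLayout).then(function (bitmap) {
    // Too late: the event has already been handed back to the framework,
    // so the User Agent observed outputImageBitmap as null.
    event.outputImageBitmap = bitmap;
  });
  // The handler returns here, before outputImageBitmap is set.
};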

Looking at the counterpart in WebAudio, the AudioWorker: its onaudioprocess event is restricted to doing its processing synchronously.

The restriction to synchronous processing leads to simpler APIs and framework design. However, since we are going to enable processing on the main thread (#30), is such a restriction still suitable? Also, if we limit the processing to be synchronous, we need to redesign ImageBitmapFactories::createImageBitmap() into a synchronous flavor.

On the other hand, if we allow asynchronous processing, then we need a mechanism to let the browser know when VideoProcessorEvent::outputImageBitmap is ready for rendering; before that, the browser should not proceed.

@rocallahan, @mhofman, @anssiko, may I know your opinions?

Add constructors to both ChannelPixelLayout and ImageFormatPixelLayout

Before I can proceed with #23, I would like to introduce constructors to both ChannelPixelLayout and ImageFormatPixelLayout, because it is very likely that the formats of VideoProcessorEvent::inputImageBitmap and VideoProcessorEvent::outputImageBitmap are different. So, if users have to use ImageBitmapFactories::createImageBitmap(FromSourceBuffer), they need a way to create ChannelPixelLayout and ImageFormatPixelLayout objects. Thoughts? @anssiko and @ChiahungTai.
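
A hypothetical sketch of what such constructors could enable from script; the constructor signatures and field order are assumptions, not part of the current draft:

// Hypothetical constructors; the arguments are illustrative only.
var channelLayout = new ChannelPixelLayout(/* offset, width, height, dataType, stride, skip */);
var formatLayout  = new ImageFormatPixelLayout([channelLayout /* one entry per channel */]);

// With these objects, users could build the layout required by the
// createImageBitmap(FromSourceBuffer) extension themselves.
var promise = createImageBitmap(buffer, 0, buffer.byteLength, "YUV420P", formatLayout);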

Consider leveraging Worklet to decouple monitor/processor from WebWorker.

In issue #30, @mhofman proposed decoupling the monitor/processor from Worker, and this issue was discussed at the TPAC 2015 mediacapture-worker ad-hoc meeting.

During the meeting @padenot mentioned that WebAudio is considering adopting Worklet (a.k.a. IsolatedWorker and Processor) to redesign the AudioWorker, and he suggested that video processing consider it as well.

To my understanding (correct me if I am wrong), CSS Worklet and WebAudio share the same requirements[1][2][3], which are:
(1) Lightweight. A WebWorker is too heavy to create and initialize.
(2) Thread-agnostic. The processing script should be able to run on any thread.
(3) Clean/safe API surface. To prevent dangerous API calls, e.g. setInterval().
(4) Hook-based callbacks. Not event-based; the processing script is not run continuously.

For WebAudio there is also another requirement: (5) all the nodes of the same AudioContext should run on the same thread, which prevents them from implementing the AudioWorker on top of a WebWorker.[3]

In issue #30, our target is (2); the other items, from my personal point of view, seem unnecessary for our scenario. Thoughts?

[1] TPAC2015-WebAudio-meeting-minutes-day-1
[2] WebAudio mailing list - New name for "AudioWorker"
[3] WebAudio/web-audio-api#532

Crop YUV422/YUV420 data?

It is not trivial to crop YUV422/YUV420 data if the cropping area starts at an odd x or y coordinate.
For example:

// Give an YUV420P data with the following pixel layout.
.-------------------.-------------------.-------------------.-------------------.
| y(0, 0)           | y(1, 0)           | y(2, 0)           | y(3, 0)           |
| u(0, 0), v(0, 0)  |                   | u(1, 0), v(1, 0)  |                   |
+-------------------+-------------------+-------------------+-------------------+
| y(0, 1)           | y(1, 1)           | y(2, 1)           | y(3, 1)           |
|                   |                   |                   |                   |
+-------------------+-------------------+-------------------+-------------------+
| y(0, 2)           | y(1, 2)           | y(2, 2)           | y(3, 2)           |
| u(0, 1), v(0, 1)  |                   | u(1, 1), v(1, 1)  |                   |
+-------------------+-------------------+-------------------+-------------------+
| y(0, 3)           | y(1, 3)           | y(2, 3)           | y(3, 3)           |
|                   |                   |                   |                   |
+-------------------+-------------------+-------------------+-------------------+

// What if developers call createImageBitmap with the above data 
// and also pass a cropping rectangle that starts at an odd x or y coordinate?
var p = createImageBitmap(yuv420PBuffer, ......, 1, 1, 2, 2);

Cropping the u/v channels starting from an odd x or y coordinate is not trivial; here are two proposals:

(1) Via re-sampling.
We can first up-sample the u/v channels to the same size as the y channel, do the cropping, and then down-sample again back to the u/v channel cropping size. However, this way the newly created ImageBitmap's data is not the data passed by the caller. Also, there are plenty of re-sampling methods; should we explicitly define one in the spec?

(2) Avoid it.
We can specifically define that, for the YUV422P/YUV420P/NV12/NV21 formats, only cropping rectangles that start at even x and y coordinates are allowed.
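
A minimal sketch of what proposal (2) could mean inside the createImageBitmap(FromSourceBuffer) implementation; the function name and error choice are illustrative, not from the spec:

// Hypothetical validation step for formats whose chroma planes are
// subsampled by 2 (YUV422P horizontally, YUV420P/NV12/NV21 in both axes).
function validateCropOrigin(format, sx, sy) {
  var subsampled = ["YUV422P", "YUV420P", "NV12", "NV21"];
  if (subsampled.indexOf(format) !== -1 && (sx % 2 !== 0 || sy % 2 !== 0)) {
    // Rejecting here avoids having to define a re-sampling method in the spec.
    throw new RangeError("Cropping area must start at even coordinates for " + format);
  }
}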

I personally prefer the 2nd one. Thoughts? @jrmuizel, @rocallahan, @ChiahungTai and @anssiko.

Change the VideoProcessorEvent.outputImageBitmap to VideoProcessorEvent.setOutputImageBitmap()

@ehsan suggested that we change the VideoProcessorEvent.outputImageBitmap property to a VideoProcessorEvent.setOutputImageBitmap(Promise<ImageBitmap> outputIB) method, which is a clearer name for developers to follow.

So, developers would use this API in the following way:

var processor = new VideoProcessor();
processor.onvideoprocess = (event) => {
    /* do some processing */
    event.setOutputImageBitmap(createImageBitmap(/* from the processed array buffer */));
};

Elaborate the backpressure handling

We have a Note at the end of section 6 saying that the UA may skip frames if the user's script cannot consume frames in real time. However, we should describe this mechanism in more detail.

Example 12 uses an undefined setDataFrom() method

Example 12 uses a setDataFrom() method which is not defined in the spec. I believe this should be mapDataInto() instead; correct, @kakukogou?

// write back to event outputImageBitmap
  outputBitmap.setDataFrom("RGBA32", rgbaBuffer, 0, rgbaBufferLength,
                           bitmap.width, bitmap.height, bitmapPixelLayout.channels[0].stride);
}

Decouple VideoProcessor from Workers

I'd like us to consider having a self contained VideoProcessor/Monitor object that's not tied to a Worker object.

The idea is to have a processing model similar to OffscreenCanvas, where you have an object you can use either on the main thread or in a worker context.

Events would be dispatched on this VideoProcessor/Monitor object instead of extending the Worker interface. The VideoProcessor/Monitor would be Transferable to Workers using postMessage.
Those objects would be created using a createVideoProcessor/Monitor method on MediaStreamTracks. You can release them using a close method.

We'd need to figure out a way to get the output MediaStreamTrack of a VideoProcessor, but that's a minor detail.
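
A hypothetical sketch of how the decoupled design could look from script; the method names (createVideoProcessor, close) follow this proposal and are not part of the current draft:

// Create a self-contained processor from a MediaStreamTrack.
var processor = track.createVideoProcessor();

// Events are dispatched on the processor itself, usable on the main thread.
processor.onvideoprocess = function (event) { /* process event.inputImageBitmap */ };

// Alternatively, hand it to a worker (Transferable):
//   worker.postMessage({ processor: processor }, [processor]);

// Release the underlying resources when done.
processor.close();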

[UseCases] Re-sample the video frame rate.

@mhofman mentioned this use case at TPAC and, indeed, we are not able to cover this use case now.

On second thought, I think this is a plausible use case for offline processing, but I am not sure whether it is needed in real-time processing.

Scenario:
Users shoot a 120 FPS slow-motion video and would like to re-encode it into a normal 30 FPS video.

Drop the cropping variant of creating ImageBitmap from an ArrayBuffer.

Following the discussion in Mozilla Bugzilla bug1141979, comment 278.

@jrmuizel suggests dropping the cropping variant of the ImageBitmapFactories extension:

[NoInterfaceObject, Exposed=(Window,Worker)]
partial interface ImageBitmapFactories {
    // Keep this.
    Promise<ImageBitmap> createImageBitmap(BufferSource buffer,
                                           long offset,
                                           long length,
                                           ImageFormat format,
                                           ImageFormatPixelLayout layout);

    // Drop this.
    Promise<ImageBitmap> createImageBitmap(BufferSource buffer,
                                           long offset,
                                           long length,
                                           ImageFormat format,
                                           ImageFormatPixelLayout layout,
                                           long sx,
                                           long sy,
                                           long sw,
                                           long sh);
};

The reason I originally chose to keep the cropping variant was to keep the same behavior as the existing API. However, after the implementation and review, we have already filed two issues, #43 and #46, to add exceptions to it. So the extension API now behaves differently from the existing API.

Personally, I agree with dropping the cropping variant for simplification; however, I would still like to hear more thoughts on it. @rocallahan, @anssiko, @ChiahungTai and @smaug----.

Processing in an offline context.

This issue was mentioned in #30 and also discussed in TPAC2015.

Before that, we had some preliminary ideas about this issue, namely extending MediaStream with an offline property; please refer to OfflineMediaContext.

@padenot talked about WHATWG Streams, which are naturally offline. However, it seems they have no strong link to MediaStream and the other media APIs yet.
WHATWG Streams have been partially shipped in Chrome[1] and are under implementation in Gecko[2].

@mhofman suggested separating the real-time and offline interfaces. For example, in WebAudio the offline-related interfaces use OfflineAudioContext and AudioBuffer and do not connect to any real-time-specific objects such as HTMLMediaElement and MediaStream.
I think something like a VideoBuffer is not feasible due to memory usage. One possibility for the input part is connecting the video source URL directly to the Monitor/Processor, without going through HTMLVideoElement and MediaStream, so that we can decode and consume the video frames in a non-realtime way. For the output part, @mhofman talked about something like a VideoGenerator or utilizing the MediaRecorder API. I think #31 is also related to the output discussion.

I have summarized everything I collected from TPAC here and look forward to your thoughts!

[1] Intent to Ship: readable streams in Fetch API
[2] Bug 1128959 - Implement the WHATWG Streams spec

Consider leveraging `Canvas::captureMedia` for generating content

In issue #30, I mentioned we might be able to leverage the Canvas::captureMedia feature to generate a MediaStreamTrack after processing is done in an OffscreenCanvas fed by the monitoring part of this spec.

The issue of keeping A/V sync was raised, since the captureMedia API doesn't accept input frame timing.
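
A hypothetical sketch of that pipeline; it assumes, for illustration only, that monitored frames are available to a main-thread handler (per #30), that the event exposes an inputImageBitmap attribute, and that a captureStream()-style canvas capture API is used in place of Canvas::captureMedia:

// Illustrative only: draw each monitored frame into a canvas, then capture
// the canvas as a new video track.
var canvas = document.createElement("canvas");
var ctx = canvas.getContext("2d");

// Attach this wherever videomonitor events are delivered.
function onMonitoredFrame(event) {
  ctx.drawImage(event.inputImageBitmap, 0, 0);
}

// The output track's frame timing is driven by the requested capture rate
// (30 fps here), not by the input frames' timestamps, which is the A/V sync
// concern raised above.
var outputTrack = canvas.captureStream(30).getVideoTracks()[0];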

I think it would be great if we could reconcile this feature gap and somehow avoid duplicated features for generating video tracks.

Publish First Public Working Draft

This is a meta issue to keep us aware of the fact that we should attempt to publish a First Public Working Draft (FPWD) of the spec in the near future, say by the end of June. FPWD is the first step on the formal Recommendation Track and has the following requirements from the W3C Process point of view:

  • must record the group's decision to request advancement.
    • Chairs @stefhak and/or @alvestrand to send Call for Consensus to publish a FPWD to the group's mailing list
  • must obtain Director approval.
    • The Chairs (or Team Contact) sends a transition request to the Domain Lead(s) responsible for the group(s) publishing the document.
  • must provide public documentation of all substantive changes to the technical report since the previous publication.
    • (not relevant, since this is the first publication)
  • must formally address all issues raised about the document since the previous maturity level.
    • (not relevant, since this is the first publication)
  • must provide public documentation of any Formal Objections.
  • should provide public documentation of changes that are not substantive.
    • (not relevant, since this is the first publication)
  • should report which, if any, of the Working Group's requirements for this document have changed since the previous step.
    • (not relevant, since this is the first publication)
  • should report any changes in dependencies with other groups.
    • (not relevant, since this is the first publication)
  • should provide information about implementations known to the Working Group.

I'll volunteer to fix the Pubrules Checker errors to make the document valid for publication, see:

Call mapDataInto() on an ImageBitmap which was created with a cropping area that is outside of the source image.

The WHATWG spec allows users to pass a cropping area when creating an ImageBitmap.

interface ImageBitmapFactories {
  Promise<ImageBitmap> createImageBitmap(ImageBitmapSource image);
  Promise<ImageBitmap> createImageBitmap(ImageBitmapSource image, long sx, long sy, long sw, long sh);
};

Pixels that are outside of the source image are filled with transparent black. From the spec:

If either sw or sh are negative, then the top-left corner of this rectangle will be to the left or above the (sx, sy) point. If any of the pixels on this rectangle are outside the area where the input bitmap was placed, then they will be transparent black in output.

This is reasonable since ImageBitmap is meant to be drawn onto a canvas. However, we are now extending ImageBitmap to be accessible in several kinds of formats (via the mapDataInto() method), and some formats do not support an alpha channel (for example, YUV444P), so we are not able to return "transparent black" in those formats.

@ChiahungTai proposes that we should throw if users call mapDataInto() on an ImageBitmap which was created with a cropping area that is outside the original source image. With this error, users should then create another ImageBitmap with a cropping area that is completely inside the source image.
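
A minimal sketch of how that would surface in script, assuming the crop and format values below and that the failure is reported through the mapDataInto() promise:

createImageBitmap(source, -10, -10, 100, 100)   // crop partly outside the source
  .then(function (bitmap) {
    var length = bitmap.mappedDataLength("YUV444P");
    var buffer = new ArrayBuffer(length);
    // Proposed behavior: this rejects (throws) because YUV444P has no alpha
    // channel in which to express the "transparent black" padding pixels.
    return bitmap.mapDataInto("YUV444P", buffer, 0, length);
  })
  .catch(function (err) {
    // Recover by creating another ImageBitmap whose cropping area is
    // completely inside the source image.
  });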

@rocallahan, @anssiko and @smaug----, may I have your comments on this bug?

Use SharedArrayBuffer as the parameter type of mapDataInto() to indicate that data racing might happen.

@jrmuizel mentioned that the current design of the mapDataInto() method might lead to data-racing code such as:

var bitmap = createImageBitmap(......);
var format = bitmap.findOptimalFormat();
var length = bitmap.mappedDataLength(format);
var buffer = new ArrayBuffer(length);
var p = bitmap.mapDataInto(format, buffer, 0, length);  // starts filling `buffer` asynchronously

var view = new Int32Array(buffer);   // data race: the script writes into `buffer`
view[0] = 100;                       // while mapDataInto() may still be writing to it

@jrmuizel proposes changing the parameter type to SharedArrayBuffer so that the API itself explicitly says that data racing can happen.
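
A minimal sketch of the proposed change, assuming the rest of the call sequence stays as in the snippet above:

var format = bitmap.findOptimalFormat();
var length = bitmap.mappedDataLength(format);
// SharedArrayBuffer instead of ArrayBuffer: the type itself signals that the
// buffer may be written by the UA while the script also has access to it.
var sharedBuffer = new SharedArrayBuffer(length);
var p = bitmap.mapDataInto(format, sharedBuffer, 0, length);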

Note that mapDataInto() is designed to cooperate with asm.js applications so that the API can fill data into the asm.js run-time heap. So, if we change to SharedArrayBuffer, the asm.js application should also be compiled so that its run-time buffer is a SharedArrayBuffer too.

Personally, I think this is a good suggestion.
@rocallahan, @anssiko and @smaug----, may I have your comments on this issue?
