
Decouple from webrtc stuff (pion/mediadevices issue, closed)

pion commented on August 19, 2024
Decouple from webrtc stuff

from mediadevices.

Comments (27)

lherman-cs commented on August 19, 2024

I think we should instead use the sources and sinks concept, https://w3c.github.io/mediacapture-main/#the-model-sources-sinks-constraints-and-settings. Then webrtc and rtp can be the sinks.
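
A minimal sketch of the sources-and-sinks idea in Go. Every name here (Frame, Source, Sink, Pipe, sliceSource, countingSink) is hypothetical and not taken from mediadevices or pion/webrtc; the fake source stands in for a device and the fake sink stands in for an RTP or WebRTC sender.

```go
package main

import (
	"errors"
	"fmt"
)

// Frame is a hypothetical unit of raw media data.
type Frame []byte

// ErrEnded signals that a Source has no more frames.
var ErrEnded = errors.New("source ended")

// Source produces frames (for example a camera or microphone).
type Source interface {
	Read() (Frame, error)
}

// Sink consumes frames (for example an RTP or WebRTC sender).
type Sink interface {
	Write(Frame) error
}

// sliceSource replays a fixed list of frames, standing in for a device.
type sliceSource struct {
	frames []Frame
	next   int
}

func (s *sliceSource) Read() (Frame, error) {
	if s.next >= len(s.frames) {
		return nil, ErrEnded
	}
	f := s.frames[s.next]
	s.next++
	return f, nil
}

// countingSink records how much it received, standing in for a network sink.
type countingSink struct {
	frames, bytes int
}

func (c *countingSink) Write(f Frame) error {
	c.frames++
	c.bytes += len(f)
	return nil
}

// Pipe copies frames from src to dst until the source ends.
func Pipe(src Source, dst Sink) error {
	for {
		f, err := src.Read()
		if errors.Is(err, ErrEnded) {
			return nil
		}
		if err != nil {
			return err
		}
		if err := dst.Write(f); err != nil {
			return err
		}
	}
}

func main() {
	src := &sliceSource{frames: []Frame{{1, 2}, {3}}}
	dst := &countingSink{}
	if err := Pipe(src, dst); err != nil {
		fmt.Println("pipe failed:", err)
		return
	}
	fmt.Println(dst.frames, dst.bytes)
}
```

With this shape, the source never needs to know which kind of sink is attached.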

lherman-cs commented on August 19, 2024

I've created a small design draft, https://github.com/pion/mediadevices/wiki/Source-Sink-Design.

@at-wat would you mind taking a look?

at-wat commented on August 19, 2024

Putting GetUserMedia directly under the mediadevices package sounds reasonable.

I'm not sure who should own the video/audio encoder.
In the new design draft, PeerConnection seems to own the encoder, but I think encoding is outside PeerConnection's role.
We may want to move the extwebrtc features into pion/webrtc/v3, so wouldn't it be better to keep PeerConnection separate from media encoding?

lherman-cs commented on August 19, 2024

I'm definitely 100% with you. The encoding part has been really awkward. I'm not really sure where to put it either...

If we put it alongside GetUserMedia, it would feel weird, since GetUserMedia is supposed to return a source based on the constraints, and the result is supposed to be transformable (if it's already encoded, I don't know how users could transform it).

On the other hand, if we put it in ext/webrtc, users can easily transform the source from GetUserMedia, GetUserMedia feels less awkward, and we have all the information necessary to pick the right codec from PeerConnection. But, like you said, this will make it really hard to merge into the main webrtc package, and it's probably not PeerConnection's responsibility.

So, I think the main problem is where to put the logic that decides which codec should be used.

Maybe there's a middle approach? Not in GetUserMedia and not in PeerConnection, but a new API?

lherman-cs commented on August 19, 2024

After reading the Chromium codebase, it looks like they put the encoder factories in their media engine, https://chromium.googlesource.com/external/webrtc/+/refs/heads/master/media/engine/webrtc_media_engine.h, which I think should be similar to Pion's mediaengine.

The main difference is that the create function for Chromium's media engine accepts a MediaEngineDependencies, which also holds the video/audio encoder factories. So maybe we should try adding a new parameter to our media engine to do the same? (For now, we can probably play around in ext/webrtc.)

I will update the design draft to see if it feels good.

at-wat commented on August 19, 2024

Sounds good.

So, if we have them in MediaEngine, RTPCodec could hold the video/audio encoder/decoder interfaces, and Track.ReadBlob/WriteBlob (maybe Blob should be replaced by Frame or something?) could be used if an encoder/decoder is specified.

lherman-cs commented on August 19, 2024

I'm not sure which RTPCodec interface you were referring to. Did you mean the interfaces from V3, https://github.com/pion/webrtc/wiki/PlanningV3?

Yeah, I thought about having a Blob concept too; in the end it became a simple writer/reader with some metadata. I'm not sure what kind of interfaces we should pass to the encoder factory/builder. The thing is, the main webrtc package needs to understand frames (for video) and chunks (for audio) so that it can use the encoder to encode the frames and chunks from tracks.

Now the question is: should the webrtc package depend on the mediadevices package? Or the other way around? Or should one of them define its own interfaces and the other implement them implicitly? (Similar to packages that implement io.Reader and io.Writer implicitly.)

The problem with the last approach is that the interface should only use primitive data types; otherwise there will be a dependency on the data structure itself, which would force a direct dependency between the packages.
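
To illustrate the last option, here is a rough sketch of an implicitly satisfied interface whose signature uses only built-in types. SampleReader, fakeEncoder, and Drain are hypothetical names; in a real layout the interface would live in the consuming package and the implementation in the other, with neither importing the other.

```go
package main

import (
	"fmt"
	"io"
)

// SampleReader is what a hypothetical webrtc-side package could declare.
// Its signature uses only built-in types, so implementers need not import it.
type SampleReader interface {
	// ReadSample fills p with one encoded sample and reports its length
	// and its duration in nanoseconds.
	ReadSample(p []byte) (n int, durationNanos int64, err error)
}

// fakeEncoder is what a hypothetical mediadevices-side package could provide.
// It never mentions SampleReader; it satisfies the interface implicitly.
type fakeEncoder struct {
	samples [][]byte
}

func (e *fakeEncoder) ReadSample(p []byte) (int, int64, error) {
	if len(e.samples) == 0 {
		return 0, 0, io.EOF
	}
	n := copy(p, e.samples[0])
	e.samples = e.samples[1:]
	return n, 33000000, nil // pretend each sample lasts about 33 ms
}

// Drain reads until io.EOF and returns the total number of payload bytes.
func Drain(r SampleReader) (int, error) {
	buf := make([]byte, 1500)
	total := 0
	for {
		n, _, err := r.ReadSample(buf)
		if err == io.EOF {
			return total, nil
		}
		if err != nil {
			return total, err
		}
		total += n
	}
}

func main() {
	total, err := Drain(&fakeEncoder{samples: [][]byte{{1, 2, 3}, {4, 5}}})
	fmt.Println(total, err)
}
```

As soon as the method needed to return something like a frame struct, both sides would have to share that type, which is the dependency problem described above.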

lherman-cs commented on August 19, 2024

@at-wat just updated the draft again. PTAL ^^

In the new draft, I refined it so that it gets the encoder builder from the media engine instead. I also changed the encoder builder interface to accept a Track interface and return a LocalTrack interface.

The main idea is that the main webrtc package doesn't need to know what format is coming from Track. Since the Track and EncoderBuilder implementations come from the users (mediadevices in our case), the users have full control over how to handle their own data.
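
A rough sketch of that shape, under the assumption that the media engine only stores builders and hands tracks to them. The method sets, the "fake" codec, and all type names are invented for illustration and are not the draft's actual definitions.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// Track is opaque to the engine: it never inspects the media format.
type Track interface {
	ID() string
}

// LocalTrack is a Track that yields already-encoded payloads.
type LocalTrack interface {
	Track
	ReadEncoded() ([]byte, error)
}

// EncoderBuilder is supplied by the user and turns a Track into a LocalTrack.
type EncoderBuilder interface {
	Codec() string
	BuildEncoder(Track) (LocalTrack, error)
}

// MediaEngine only stores builders and picks one by codec name.
type MediaEngine struct {
	builders []EncoderBuilder
}

func (m *MediaEngine) AddEncoderBuilders(b ...EncoderBuilder) {
	m.builders = append(m.builders, b...)
}

func (m *MediaEngine) Encode(t Track, codec string) (LocalTrack, error) {
	for _, b := range m.builders {
		if b.Codec() == codec {
			return b.BuildEncoder(t)
		}
	}
	return nil, fmt.Errorf("no encoder builder registered for %q", codec)
}

// rawTrack is the user's own track type holding raw frames.
type rawTrack struct {
	id     string
	frames [][]byte
}

func (t *rawTrack) ID() string { return t.id }

// fakeBuilder knows the user's track type, so it can reach the raw frames.
type fakeBuilder struct{}

func (fakeBuilder) Codec() string { return "fake" }

func (fakeBuilder) BuildEncoder(t Track) (LocalTrack, error) {
	raw, ok := t.(*rawTrack)
	if !ok {
		return nil, errors.New("fakeBuilder: unsupported track type")
	}
	return &fakeLocalTrack{src: raw}, nil
}

type fakeLocalTrack struct {
	src  *rawTrack
	next int
}

func (l *fakeLocalTrack) ID() string { return l.src.id }

// ReadEncoded "encodes" a frame by prefixing a marker byte.
func (l *fakeLocalTrack) ReadEncoded() ([]byte, error) {
	if l.next >= len(l.src.frames) {
		return nil, io.EOF
	}
	f := l.src.frames[l.next]
	l.next++
	return append([]byte{0xFF}, f...), nil
}

func main() {
	engine := &MediaEngine{}
	engine.AddEncoderBuilders(fakeBuilder{})
	local, err := engine.Encode(&rawTrack{id: "cam0", frames: [][]byte{{1, 2}}}, "fake")
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	payload, err := local.ReadEncoded()
	fmt.Println(local.ID(), payload, err)
}
```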

at-wat commented on August 19, 2024

The draft looks nice.
Maybe mediadevices.Track will have ReadBlob() and be wrapped by the encoder to provide ReadRTP()?

type Track interface {
  ID() string
  Stop()
  OnEnded(func(error))
  ReadBlob() ([]byte, error)
  // or, alternatively:
  // ReadBlob([]byte) (int, error)
}

lherman-cs commented on August 19, 2024

That's definitely possible. But I'm not sure about the use case for having ReadBlob in Track. In my head, people would probably use the decoded version, which they can get by casting Track to either VideoTrack or AudioTrack, or the encoded version, which they would get from the encoder.

lherman-cs commented on August 19, 2024

@at-wat I just created a POC here, https://github.com/pion/mediadevices/blob/refractor/examples/simple/main.go. Please feel free to drop feedback!

The example basically reads frames from the camera, generates MJPEGs, and serves them through HTTP. The port is 1313.

I think it really shows:

  1. It's a lot easier to use
  2. It can be used for non-WebRTC
  3. It's pretty easy to broadcast
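
For readers who cannot open the POC, here is a standard-library-only sketch of the MJPEG-over-HTTP part. It is not the code from the linked example: FrameFunc, MJPEGHandler, solidFrames, and fetchFrames are hypothetical names, and generated gray images stand in for camera frames.

```go
package main

import (
	"fmt"
	"image"
	"image/jpeg"
	"io"
	"mime"
	"mime/multipart"
	"net/http"
	"net/http/httptest"
	"net/textproto"
)

// FrameFunc returns the next frame, or ok=false when the stream has ended.
type FrameFunc func() (img image.Image, ok bool)

// MJPEGHandler streams frames as a multipart/x-mixed-replace response,
// one JPEG per part.
func MJPEGHandler(next FrameFunc) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mw := multipart.NewWriter(w)
		defer mw.Close()
		w.Header().Set("Content-Type", "multipart/x-mixed-replace; boundary="+mw.Boundary())
		header := textproto.MIMEHeader{"Content-Type": {"image/jpeg"}}
		for {
			img, ok := next()
			if !ok {
				return
			}
			part, err := mw.CreatePart(header)
			if err != nil {
				return
			}
			if err := jpeg.Encode(part, img, nil); err != nil {
				return
			}
			if f, ok := w.(http.Flusher); ok {
				f.Flush() // push each frame to the client immediately
			}
		}
	})
}

// solidFrames yields n small gray images, standing in for camera frames.
func solidFrames(n int) FrameFunc {
	i := 0
	return func() (image.Image, bool) {
		if i >= n {
			return nil, false
		}
		img := image.NewGray(image.Rect(0, 0, 16, 16))
		for p := range img.Pix {
			img.Pix[p] = uint8(i * 40)
		}
		i++
		return img, true
	}
}

// fetchFrames runs the handler in memory and counts the decodable JPEG parts.
func fetchFrames(h http.Handler) (int, error) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	_, params, err := mime.ParseMediaType(rec.Header().Get("Content-Type"))
	if err != nil {
		return 0, err
	}
	mr := multipart.NewReader(rec.Body, params["boundary"])
	n := 0
	for {
		part, err := mr.NextPart()
		if err == io.EOF {
			return n, nil
		}
		if err != nil {
			return n, err
		}
		if _, err := jpeg.Decode(part); err != nil {
			return n, err
		}
		n++
	}
}

func main() {
	n, err := fetchFrames(MJPEGHandler(solidFrames(3)))
	fmt.Println(n, err)
	// To serve over the network instead, something like
	// http.ListenAndServe(":1313", MJPEGHandler(frames)) would work.
}
```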

lherman-cs commented on August 19, 2024

It seems there's a lot of work to do and a lot of things to remove. So, I'll create a new branch called redesign.

at-wat commented on August 19, 2024

Sorry, I just saw https://github.com/pion/mediadevices/blob/refractor/examples/simple/main.go.
Seems very nice!

lherman-cs commented on August 19, 2024

Glad you like it too!

lherman-cs commented on August 19, 2024

I have another commit that adds the webrtc extension, 21fd8c9. I also added a new webrtc example with the new design.

It definitely still has some rough edges, especially around the cleanup for the RTP reader. Please let me know what you think. Also, feel free to submit a PR to propose a change to that branch. I'll be very happy to look.

lherman-cs commented on August 19, 2024

@at-wat, I've updated the design doc. The design now covers the common Pion use case, where tracks always provide the encoded version of the media data. While covering this, I also made sure that encoding is not part of the track.

So, for WebRTC use cases, it'll look like the following:

package main

import (
	"github.com/pion/mediadevices"
	"github.com/pion/mediadevices/pkg/codec/x264"
	"github.com/pion/webrtc/v2"
)

func main() {
	...
	s, err := mediadevices.GetUserMedia(mediadevices.MediaStreamConstraints{
		Audio: func(c *mediadevices.MediaTrackConstraints) {},
		Video: func(c *mediadevices.MediaTrackConstraints) {
			c.Width = 640
			c.Height = 480
		},
	})

	x264Params, _ := x264.NewParams()
	// Proposed wrapper: pc is the existing PeerConnection.
	enhancedPC := EnhancePeerConnection(pc, &x264Params)

	for _, tracker := range s.GetTracks() {
		_, err = enhancedPC.EnhancedAddTransceiverFromTrack(tracker,
			webrtc.RtpTransceiverInit{
				Direction: webrtc.RTPTransceiverDirectionSendonly,
			},
		)
	}
}

As you can see here, we can add our methods to PeerConnection by prefixing them with "Enhanced". This way, we can bring our own data structure while still providing clean APIs.

What do you think?

at-wat commented on August 19, 2024

https://github.com/pion/mediadevices/wiki/Source-Sink-Design#design

// step 1: find common supported codecs from pc.opt and mediaengine
// note 1.1: if there are multiple codecs as the result, try to build in sequential order; if one fails, use the next one.

// step 2: create a local track using the encoder builder

// step 3: create a new RTPSender

I'm not sure that we can create a new RTPSender outside the webrtc package. RTPSender requires the SRTP master key, which comes from the DTLS session.

// step 4: replace the RTPSender's local track from step 3 with the local track from step 2
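
The first step and its note (intersect the codecs, then try the builders in order until one succeeds) could look roughly like the sketch below. All names are hypothetical; the working and failing helpers exist only to simulate builders that succeed or fail.

```go
package main

import (
	"errors"
	"fmt"
)

// Encoder is a placeholder for a built encoder.
type Encoder interface {
	Codec() string
}

// EncoderBuilder pairs a codec name with a function that may fail to build it.
type EncoderBuilder struct {
	Codec string
	Build func() (Encoder, error)
}

type namedEncoder string

func (n namedEncoder) Codec() string { return string(n) }

// working returns a builder that always succeeds.
func working(codec string) EncoderBuilder {
	return EncoderBuilder{Codec: codec, Build: func() (Encoder, error) {
		return namedEncoder(codec), nil
	}}
}

// failing returns a builder that always fails.
func failing(codec string) EncoderBuilder {
	return EncoderBuilder{Codec: codec, Build: func() (Encoder, error) {
		return nil, errors.New("encoder unavailable")
	}}
}

// CommonCodecs keeps the local builders whose codec the remote side also
// supports, preserving local preference order.
func CommonCodecs(local []EncoderBuilder, remote []string) []EncoderBuilder {
	offered := make(map[string]bool, len(remote))
	for _, c := range remote {
		offered[c] = true
	}
	var common []EncoderBuilder
	for _, b := range local {
		if offered[b.Codec] {
			common = append(common, b)
		}
	}
	return common
}

// BuildFirst tries each candidate in order and returns the first encoder
// that builds; if none does, it returns the last error seen.
func BuildFirst(candidates []EncoderBuilder) (Encoder, error) {
	lastErr := errors.New("no common codec")
	for _, b := range candidates {
		enc, err := b.Build()
		if err == nil {
			return enc, nil
		}
		lastErr = fmt.Errorf("%s: %w", b.Codec, err)
	}
	return nil, lastErr
}

func main() {
	local := []EncoderBuilder{failing("h264"), working("vp8"), working("vp9")}
	enc, err := BuildFirst(CommonCodecs(local, []string{"vp8", "h264"}))
	if err != nil {
		fmt.Println("no encoder:", err)
		return
	}
	fmt.Println(enc.Codec())
}
```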

lherman-cs commented on August 19, 2024

I think it's possible. I'm thinking of just calling AddTransceiverFromTrack, which should give us indirect access to the DTLS transport. (When I said to create an RTPSender, I meant through AddTransceiverFromTrack, not through NewRTPSender.)

When v3 is ready, the track will become an interface. But, this is okay, since we can provide a proxy implementation.

I think the idea here is only to hide the codec matching and encoding, but the rest will still use existing APIs since they can handle encoded data well already.

at-wat commented on August 19, 2024

I feel that if it can be implemented as a track wrapper, the interface would be a bit more elegant.
Does it require wrapping PeerConnection?

func main() {
	...
	s, err := mediadevices.GetUserMedia(mediadevices.MediaStreamConstraints{
		Audio: func(c *mediadevices.MediaTrackConstraints) {},
		Video: func(c *mediadevices.MediaTrackConstraints) {
			c.Width = 640
			c.Height = 480
		},
	})

	x264Params, _ := x264.NewParams()
	rtpTracker, _ := mediadevices.NewRTPTracker(x264Params, ...(more codecs))

	for _, tracker := range s.GetTracks() {
		// Wrap the track instead of wrapping PeerConnection.
		_, err = pc.AddTransceiverFromTrack(rtpTracker.Track(tracker),
			webrtc.RtpTransceiverInit{
				Direction: webrtc.RTPTransceiverDirectionSendonly,
			},
		)
	}
}

lherman-cs commented on August 19, 2024

I'm not sure if this is possible. Currently, it's not clear whether the new track interface will have access to the negotiated codec. Also, we need to register our codecs so that PeerConnection can negotiate the available codecs with the other peer.

at-wat commented on August 19, 2024

Umm, I see. It seems this isn't only a mediadevices problem.

https://github.com/pion/webrtc/blob/master/examples/broadcast/main.go
For example, the broadcast example assumes that subscribers always support the codec used by the broadcaster, but it should also have codec negotiation logic. (In this case, it should just validate whether the broadcast codec is supported.)
Payload type (number) determination is also part of codec negotiation, which in v2 was done by application code via MediaEngine.
This seems like the webrtc package's role.
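
As a rough illustration of the validation described here, the sketch below checks whether a subscriber supports the broadcaster's codec and, if so, returns the payload type the subscriber uses for it. The Codec struct and PayloadTypeFor are hypothetical and are not pion/webrtc types; the payload type numbers are arbitrary examples.

```go
package main

import (
	"fmt"
	"strings"
)

// Codec is a simplified, hypothetical description of a negotiated codec.
type Codec struct {
	MimeType    string
	ClockRate   uint32
	PayloadType uint8
}

// PayloadTypeFor validates that the subscriber supports the broadcaster's
// codec and returns the payload type number the subscriber assigned to it.
func PayloadTypeFor(broadcast Codec, subscriber []Codec) (uint8, error) {
	for _, c := range subscriber {
		if strings.EqualFold(c.MimeType, broadcast.MimeType) && c.ClockRate == broadcast.ClockRate {
			return c.PayloadType, nil
		}
	}
	return 0, fmt.Errorf("subscriber does not support %s/%d", broadcast.MimeType, broadcast.ClockRate)
}

func main() {
	broadcast := Codec{MimeType: "video/VP8", ClockRate: 90000, PayloadType: 96}
	subscriber := []Codec{
		{MimeType: "audio/opus", ClockRate: 48000, PayloadType: 111},
		{MimeType: "video/vp8", ClockRate: 90000, PayloadType: 100},
	}
	pt, err := PayloadTypeFor(broadcast, subscriber)
	fmt.Println(pt, err)
}
```

Note that the broadcaster and the subscriber may use different payload type numbers for the same codec, which is why the number has to come out of negotiation rather than being hard-coded.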

lherman-cs commented on August 19, 2024

Right, that's why I'm thinking of wrapping PeerConnection so that we can register our codecs. AFAIK, we're trying to remove MediaEngine in v3. So I assume it'll somehow get exposed in PeerConnection?

I don't think the new track interface will expose codec registration, since that needs to be done at initialization time.

Also, we can't really put the encoding part in the webrtc package, since it assumes media data comes in and out in encoded form. But it's possible if we can come up with a way to support both cases.

at-wat commented on August 19, 2024

In the Web API, this seems to be done at the RTCRtpReceiver and RTCRtpSender layer.

https://developer.mozilla.org/en-US/docs/Web/API/RTCRtpReceiver#Methods

RTCRtpReceiver.getParameters()

Returns an RTCRtpParameters object which contains information about how the RTC data is to be decoded.

https://developer.mozilla.org/en-US/docs/Web/API/RTCRtpSender#methods

RTCRtpSender.getParameters()

Returns a RTCRtpParameters object describing the current configuration for the encoding and transmission of media on the track.

RTCRtpSender.setParameters()

Applies changes to parameters which configure how the track is encoded and transmitted to the remote peer.

Parameters given to setParameters contain encoding information.
https://developer.mozilla.org/en-US/docs/Web/API/RTCRtpSendParameters/encodings

lherman-cs commented on August 19, 2024

Wow! I didn't know this existed. With this, we can put the negotiation part in webrtc! Borrowing your idea, the flow should look like this:

func main() {
	...
	s, err := mediadevices.GetUserMedia(mediadevices.MediaStreamConstraints{
		Audio: func(c *mediadevices.MediaTrackConstraints) {},
		Video: func(c *mediadevices.MediaTrackConstraints) {
			c.Width = 640
			c.Height = 480
		},
	})

	x264Params, _ := x264.NewParams()
	rtpTracker, _ := mediadevices.NewRTPTracker(x264Params, ...(more codecs))

	for _, tracker := range s.GetTracks() {
		// PeerConnection should also add codec availability under the hood
		// for this particular track by using RTCRtpSender.getParameters().
		_, err = pc.AddTransceiverFromTrack(rtpTracker.Track(tracker),
			webrtc.RtpTransceiverInit{
				Direction: webrtc.RTPTransceiverDirectionSendonly,
			},
		)
	}

	// PeerConnection should negotiate with the other peer.
	// PeerConnection should set the negotiated codec with RTCRtpSender.setParameters().
}

at-wat commented on August 19, 2024

Umm, in my understanding, RTPSender is created by PeerConnection in the Web API. In this case, we need to customize RTPSender to receive the negotiated codec information. It could be specified via RTPTransceiverInit as a Pion extension.
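
One way to picture such an extension is a callback carried in the init struct and invoked once negotiation finishes. This is only a sketch of the idea: TransceiverInit, OnCodecNegotiated, CompleteNegotiation, and the other names below are hypothetical stand-ins, not pion/webrtc APIs.

```go
package main

import "fmt"

// Direction is a simplified stand-in for a transceiver direction.
type Direction int

const DirectionSendonly Direction = 1

// NegotiatedCodec is what the sender learns after offer/answer completes.
type NegotiatedCodec struct {
	MimeType    string
	PayloadType uint8
}

// TransceiverInit mirrors the idea of RTPTransceiverInit. OnCodecNegotiated
// is the hypothetical extension field discussed above.
type TransceiverInit struct {
	Direction         Direction
	OnCodecNegotiated func(NegotiatedCodec)
}

// Sender stands in for an RTPSender created by the PeerConnection.
type Sender struct {
	init  TransceiverInit
	codec NegotiatedCodec
}

// Codec returns the codec stored on the sender after negotiation.
func (s *Sender) Codec() NegotiatedCodec { return s.codec }

// PeerConnection owns the senders, as in the Web API.
type PeerConnection struct {
	senders []*Sender
}

// AddTransceiver creates the sender inside the PeerConnection.
func (pc *PeerConnection) AddTransceiver(init TransceiverInit) *Sender {
	s := &Sender{init: init}
	pc.senders = append(pc.senders, s)
	return s
}

// CompleteNegotiation simulates the end of offer/answer: every sender stores
// the codec and notifies the application so it can configure its encoder.
func (pc *PeerConnection) CompleteNegotiation(c NegotiatedCodec) {
	for _, s := range pc.senders {
		s.codec = c
		if s.init.OnCodecNegotiated != nil {
			s.init.OnCodecNegotiated(c)
		}
	}
}

func main() {
	pc := &PeerConnection{}
	var chosen string
	sender := pc.AddTransceiver(TransceiverInit{
		Direction:         DirectionSendonly,
		OnCodecNegotiated: func(c NegotiatedCodec) { chosen = c.MimeType },
	})
	pc.CompleteNegotiation(NegotiatedCodec{MimeType: "video/H264", PayloadType: 102})
	fmt.Println(chosen, sender.Codec().PayloadType)
}
```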

at-wat commented on August 19, 2024

pion/webrtc-v3-design#10

lherman-cs commented on August 19, 2024

Implemented in 2f5e4ee
