
Comments (60)

richtr avatar richtr commented on August 12, 2024

That would be great @foolip!


marcoscaceres avatar marcoscaceres commented on August 12, 2024

So, writing the readme I tried to consolidate the use cases from:

  1. @richtr's proposal.
  2. Mozilla's input + some old threads on this topic.
  3. @borismus' article.
  4. @tabatkins and @domenic's feedback over on Rich's repo about extensible web.

If I missed anything, let me know. Let's move to doing PRs from this moment forward.


foolip avatar foolip commented on August 12, 2024

I've created a new team for this repo and added @richtr. Rich, do you want to take a stab at integrating our use cases? I think that having detailed descriptions of the desired end-user experience in some order of priority helps.


domenic avatar domenic commented on August 12, 2024

@marcoscaceres looks good regarding the extensibility section---I like the presenter use case especially. Thanks.


richtr avatar richtr commented on August 12, 2024

Unless it's possible to obtain remote control events when no audio is playing out then I suspect the presenter use case may become out of scope.

On iOS you must have audio playing out to receive remote control events. There are of course clever workarounds (e.g. https://stackoverflow.com/questions/10885047/receive-remote-control-events-without-audio) which could work equally well on the web platform for the presenter use case.


marcoscaceres avatar marcoscaceres commented on August 12, 2024

On Thursday, February 5, 2015, Rich Tibbett wrote:

Unless it's possible to obtain remote control events when no audio is playing out then I suspect the presenter use case may become out of scope.

It's why I said, "when possible". It probably wouldn't make sense in the context of iOS, for instance - but makes total sense in MacOS or Windows.

On iOS you must have audio playing out to receive remote control events. There are of course clever workarounds (e.g. stackoverflow.com/questions/10885047/receive-remote-control-events-without-audio) which could work equally well on the web platform for the presenter use case.


marcoscaceres avatar marcoscaceres commented on August 12, 2024

OK, so another reason this should not be exclusively hard-bound to audio and video right now is that the following sites use Flash for media playback:

  • Spotify
  • SoundCloud
  • reverbnation.com

Having the ability to route the events to an independent object would be a start.

Also, it seems jPlayer, and the sites that rely on it, would benefit tremendously from this functionality. It is used on:

  • Pandora
  • BBC
  • player.fm

jPlayer can compose playlists of both audio and video. I've pinged @maboa to comment.


foolip avatar foolip commented on August 12, 2024

I've been arguing for this position with @richtr: if the "focus-holder" is a standalone object, it will be easier to implement and possible to make work with the Web Audio API and even Flash. However, there are two hurdles:

  • It seems like iOS requires media playback to begin before granting focus. This is at odds with completely detaching the audio focus concept from any audio-producing object. I will email some folks at Apple and ask them if they have any ideas for how an API should work.
  • If Web apps are at liberty to handle a pause event by e.g. gently fading out or playing out the current sound effect, then we'll have a very hard time enforcing silence. Do we give up that possibility?


marcoscaceres avatar marcoscaceres commented on August 12, 2024
  • It seems like iOS requires media playback to begin before granting focus. This is at odds with completely detaching the audio focus concept from any audio-producing object. I will email some folks at Apple and ask them if they have any ideas for how an API should work.

It would be fantastic if you could reach out to some Apple folks. To be clear, what @domenic and @tabatkins suggested with regards to extensibility is that the API be able to work with both imperative and declarative models. This could mean that, for instance, HTMLMediaController implements or extends FooMediaKeyReceiver, but it is also possible to create an instance of FooMediaKeyReceiver directly.

I'm totally speculating here, but it might be that if you are building an audio/video player, you should be expected to hand over an HTMLMediaElement to the OS through the remoteControls attribute (and that element becomes the event target/focused-media-event receiver). At the same time, if you are building something like RevealJS, you instantiate a FooMediaKeyReceiver and, I guess, hope for the best. Events are never guaranteed anywhere - but for, say, desktop OSes that could support it, it could work.
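Purely to illustrate that layering (every name below, such as FooMediaKeyReceiver, remoteControls and the event names, is a placeholder from this thread rather than a real API), the split might look roughly like this:

```js
// Speculative sketch only: FooMediaKeyReceiver, remoteControls and the event
// names are placeholders from this discussion, not an existing API.

// Imperative piece: a standalone receiver, e.g. for a RevealJS-style deck.
const receiver = new FooMediaKeyReceiver();                    // hypothetical constructor
receiver.addEventListener('nexttrack', () => advanceSlide());  // advanceSlide() is a stand-in app function

// Declarative piece: a media element that implements/extends the receiver, so
// handing it to the OS makes it the focused-media event target.
const video = document.querySelector('video');
video.remoteControls = true;                                   // hypothetical attribute
video.addEventListener('pause', () => {
  console.log('paused, possibly via a hardware media key');
});
```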

  • If Web apps are at liberty to handle a pause event by e.g. gently fading out or playing out the current sound effect, then we'll have a very hard time enforcing silence. Do we give up that possibility?

No, I don't think we should give that up (as above) - and I have no strong opinions right now; I just want to make sure we explore many options (fail fast, land at something awesome). I don't yet have a coherent model in my head of how this is going to work - but I think it's good to see if we can sensibly generalize this... if OSes won't let us, then we can live with that. But I think there is enough evidence and demand for general remote control events that we should consider it.

Having said that, we may again throw the problem at DOM3 (as they really are key events), but that's not yielded much happiness so far.


foolip avatar foolip commented on August 12, 2024

I've sent email to @eric-carlson and @jernoble directing them to this issue.


foolip avatar foolip commented on August 12, 2024

I don't think these are really key events in every aspect. In particular, when one media player starts playing and causes another to pause, some event is delivered to that other player which isn't caused by input on that page.


marcoscaceres avatar marcoscaceres commented on August 12, 2024

I don't think these are really key events in every aspect. In particular, when one media player starts playing and causes another to pause,

Note that this isn't the case if I'm running both Spotify and iTunes on MacOS. We probably wouldn't expect the browser to pause Spotify or iTunes. Note I'm talking about desktop browsers here.

However, it makes total sense on an OS that has a kind of centralized media player - such as those on Android and iOS.

some event is delivered to that other player which isn't caused by input on that page.

Sorry, I'm not following this bit; can you please clarify what you mean?


foolip avatar foolip commented on August 12, 2024

OK, different platforms have different conventions. It is possible to play audio in multiple apps at the same time in Android, so it isn't just a question of a platform limitation, though.

The example, on any platform you like: I'm listening to music in a browser tab. Then I start to watch a movie or join a video conference in another tab. When that starts, which may not involve me touching any input device, the music should pause. In order to pause it must either be forcibly paused or some event must be delivered kindly asking it to pause.


foolip avatar foolip commented on August 12, 2024

Actually, not should pause, that doesn't always make sense. It should be possible to have it pause when it is no longer the audio focusee.


marcoscaceres avatar marcoscaceres commented on August 12, 2024

I was going to say that in a browsing environment we have a choice about how this should work (@tobie has some opinions about this too). It might be that if a tab opts into receiving the media-key events, part of the API contract is that you will get paused if some other tab gains the media focus.


mahemoff avatar mahemoff commented on August 12, 2024

Excited to see this spec!

In general, I'm satisfied with iOS and there's really not much to do here. It might be different if apps are using Advanced Audio, but for normal video and audio tags, the low-hanging fruit is Android and desktop OSes (and probably other mobile OSes). You'd have to do a lot of work on those before it's worth even considering iOS, IMO.

I should note that there are problems in general with media keys on standard platforms, control devices, and streaming protocols like Google Cast/Chromecast and AirPlay. The most glaring omission is the typical lack of a skip feature, i.e. you can switch to the next/previous track but not move within the track. The omission probably comes from the fact that these standards were born around music, and it becomes a lot more of an issue with videos and podcasts. Other missing controls include subtitle toggle and speed control.

Having noted those anomalies for the sake of completeness, my suggestion would be to ignore them, and instead focus on letting web apps do what native apps can do rather than trying to leapfrog native capability. Also, as I mentioned, a lot of this is baked in at the hardware level: headphone controls, keyboards and steering wheels simply lack those extra buttons or any ability to customise. It could certainly be supported in some cases, e.g. using function keys, but again, not the low-hanging fruit for web apps.

(Just to clarify the jPlayer comment above: Player FM no longer uses jPlayer. It did its job well for a long time, but the main value proposition for us initially was Flash fallback, which is no longer a major concern, so we opted to build our own player when it came time for a site redesign. There's an internal player API which detaches view from model and would make it trivial to support a media keys standard. I hope to do so.)


foolip avatar foolip commented on August 12, 2024

@mahemoff, what do you mean about iOS, does it already work such that you can have Web apps playing music respond to the headphone button and pause when something else starts playing?


Jon889 avatar Jon889 commented on August 12, 2024

@foolip When something else starts playing, it will take over the iOS media controls. Unlike on desktop OSes, you can't play multiple things at once.


tobie avatar tobie commented on August 12, 2024

I like the idea of a media focus concept distinct from and independent of the main focus. I'd naively define it as: the focused browsing context, unless there's a background browsing context which has captured the media focus (e.g. by playing media). There can, of course, only be one media-focused browsing context at a time.


foolip avatar foolip commented on August 12, 2024

Thanks @Jon889. I suppose that enforced pausing must also be incorporated somehow then; a spec cannot say "if this event is not handled, playback will continue in parallel with the new audio stream" or similar.


Jon889 avatar Jon889 commented on August 12, 2024

@foolip, what do you mean by enforced pausing?


mahemoff avatar mahemoff commented on August 12, 2024

@foolip @Jon889 Yeah, it would probably surprise a lot of web developers, but iOS has had excellent support for web media for a long time as it's treated the same way as native. Lock-screen, Control Center, and Bluetooth controls work seamlessly. If other OSs did it, this spec wouldn't be needed.


mahemoff avatar mahemoff commented on August 12, 2024

About media focus - it feels like that may be beyond the scope of this spec. It's a complicated area. Apps can declare themselves to play in the background or not. Audio focus can be transient or not. Even within transient audio events, some can indicate "ducking" (playing in parallel) is okay and some can insist on playing exclusively. Is all this in scope for a spec on media controls?


foolip avatar foolip commented on August 12, 2024

@Jon889, I mean stopping playback regardless of what the page does with whatever events are delivered to it.


foolip avatar foolip commented on August 12, 2024

@mahemoff, if what iOS does is in fact good and sufficient, then we should stop right now and have this fixed in operating systems and browsers without introducing any new APIs to the Web. For simple playing and pausing in response to button presses I agree that one can do it, and in fact Opera for Android has such a feature. Forward and back buttons can't work without the cooperation of the page, however.

I think it's necessary to include some concept of audio focus, because without it, where do we deliver the events when nothing is playing? The last thing that was playing should continue playing.


tobie avatar tobie commented on August 12, 2024

The last thing that was playing should continue playing.

  • Or should the page which has visual focus start playing instead if it has registered handlers?
  • Or should there be some time-lapse during which media focus is kept by a background tab?
  • Or other heuristics to determine whether media focus is kept by the background tab or lost?
  • Should such heuristics be implementation-specific?

Also, what exactly does "playing" mean in the context of, e.g., a slideshow or a slide deck while presenting? (Hence the suggested notion of media focus capture.)


marcoscaceres avatar marcoscaceres commented on August 12, 2024

@mahemoff, I agree that what iOS does gets us most of the way there - and works quite nicely. However, as @foolip said, forward/back still needs to work in coordination with the page (they obviously don't work in the example below). Also, currently, it's a bit sad what ends up on the lock screen (i.e., the URL, instead of a title of media, no artwork, etc.):

[screenshot 2015-02-05 23 51 06]

Where the equivalent in the web app is:

[screenshot 2015-02-05 23 54 04]

So while play/stop are perhaps things the OS can already handle on its own, forward and back are not.

So, I also agree with @mahemoff that just getting parity with native would be great - we might not need to do anything with regards to media focus on mobile if it's mostly just handled at the OS level.


richtr avatar richtr commented on August 12, 2024

As @foolip says, we support media notification features in Opera Mobile for Android today. I don't think we've explored this feature as far as Apple have in iOS but there is obviously a lot we can do already without additional hooks or APIs being added to the web platform.

We could try to enable that iOS behavior everywhere, but that assumes (a) all A/V content is created equally and (b) we will enforce only one media element playing out at any given time across platforms. To balance this, we are therefore proposing to let web developers opt in to that behavior by declaring remoteControls against the HTML media content they want to be subject to this kind of behavior.

We could add lower-level APIs here later on, but web developers would probably appreciate us exposing consistent, simple media key behavior across platforms. That's why we drafted HTML Media Focus.
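A minimal sketch of what that opt-in might look like from a page, assuming remoteControls is a content attribute on media elements (the exact shape in the HTML Media Focus draft may differ):

```js
// Hypothetical opt-in: only media content that declares remoteControls is
// subject to the platform media-key / single-playback behaviour.
const player = document.querySelector('#episode-player');
player.setAttribute('remotecontrols', '');     // hypothetical attribute name

// Stand-in app function (hypothetical application logic).
function updatePlayButton(isPlaying) { /* update the in-page UI */ }

// The page keeps reacting through ordinary media events, which the UA could
// now also fire in response to hardware keys or lock-screen controls.
player.addEventListener('play', () => updatePlayButton(true));
player.addEventListener('pause', () => updatePlayButton(false));
```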

So there's also a question of what the scope of this work should be. Would it be useful if we could initially support something very similar to the iOS behavior across platforms or should we target a lower level integration from the start?


mahemoff avatar mahemoff commented on August 12, 2024

@marcoscaceres Yes - again, it's a question of scope, if you consider media metadata to be part of this. It's certainly a lot less complex than audio focus and I'd be happy to support it.

@richtr This is a very sensible approach: let developers opt into it and support it automagically. Great wins - no implementation effort required and all web apps act consistently.

The hard part, standardising the media API, is already done, so it should be easy enough to build on that.


marcoscaceres avatar marcoscaceres commented on August 12, 2024

@richtr the thing to consider is that there is already web content relying on the iOS behavior. The current behavior needs to work irrespective of any remoteControls attribute on HTML elements.


marcoscaceres avatar marcoscaceres commented on August 12, 2024

@foolip, @richtr, what do you guys think? Is metadata in scope? My vote from Mozilla is "yes".


richtr avatar richtr commented on August 12, 2024

@marcoscaceres wrote:

it's a bit sad what ends up on the lock screen (i.e., the URL, instead of a title of media, no artwork, etc.)

If the web page sets a title attribute on the HTMLMediaElement then that gets displayed instead of the URL. If you play e.g. a YouTube video you'll get the video's title attribute being shown there. We could also use sensible fallback metadata like the document's title too, of course.
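For example, under the behaviour described above (only the title attribute here is web-visible; what the lock screen actually renders is up to the platform):

```js
// Setting a title on the element gives iOS something friendlier than the page
// URL to show on the lock screen, per the comment above.
const video = document.querySelector('video');
video.title = 'Big Buck Bunny (trailer)';

// Sensible fallback metadata suggested above, if the element has no title:
const fallbackTitle = document.title;
```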

@marcoscaceres wrote:

@richtr the thing to consider is that there is already web content relying on the iOS behavior. The current behavior needs to work irrespective of any remoteControls attribute on HTML elements.

The platform would have an intent from the web developer for a certain type of functionality. Whether it respects that intent is up to the platform, but the presence of that intent would not degrade the current iOS user experience.

@foolip, @richtr, what do you guys think? Is metadata in scope? My vote from Mozilla is "yes".

If we are targeting notification bars, home screen controls and lock screen controls then I would also vote 'yes' on having metadata in scope.


foolip avatar foolip commented on August 12, 2024

@tobie, I don't think we should go for implementation-specific heuristics unless there's something where there can be reasonable but different implementations of the same thing. "Playing" in my example could mean that the media element's paused attribute is false and the playing event is fired, or something entirely different depending on what the model is.

I think it's important to start from use cases expressed in terms of what an end-user is doing and experiencing, in some order of importance. https://github.com/richtr/html-media-focus/blob/gh-pages/USECASES.md is what I came up with in an email to @richtr originally.


domenic avatar domenic commented on August 12, 2024

@marcoscaceres

Note that this isn't the case if I'm running both Spotify and iTunes on MacOS. We probably wouldn't expect the browser to pause Spotify or iTunes. Note I'm talking about desktop browsers here.

FWIW, on Windows 8 I would expect the exact opposite. When I am playing Spotify and have the Windows Music app open and focused, and I press the play/pause key on my keyboard, Spotify pauses and Music starts playing. I would expect that if I had a web page that had registered as "I would like media focus" open while Spotify was playing, pressing play/pause should pause Spotify and start the web page.


foolip avatar foolip commented on August 12, 2024

Sounds like a detailed analysis of the behavior on each platform would be in order. And checking what the native APIs for all this look like. This will probably constrain us a lot.


foolip avatar foolip commented on August 12, 2024

As for metadata, my podcast player (DoggCatcher) has a lock screen UI where the back and forward buttons actually skip back or forward 10 seconds, and a notification UI with the same, plus a "mark episode as done and skip to next" button. I'm very skeptical that anything less than full control over the rendering of those UIs and communication with the original page is going to be good enough. Metadata would amount to a declarative solution and I don't think it can be at parity with native platforms.


tobie avatar tobie commented on August 12, 2024

@foolip re use cases: filed richtr/html-media-focus#4


mahemoff avatar mahemoff commented on August 12, 2024

@foolip Whatever you see on the Android lock-screen is standard OS UI. There's an artwork image and title associated with the current episode, which is what Android renders. Your comment about skipping within and skipping between tracks makes me think you're referring to the app's own full-screen player, as it's not possible to do both on Android. The app can respond to events (so it can mark an episode played, for example) and interpret them, but it can't change the UI.

In other words, a browser can automatically implement Android lock-screen controls for any audio/video element as long as the following hold (a sketch follows the list below):

  • It can get hold of metadata. The obvious way to do this is extend the audio/video element definitions to include metadata such as standard artwork, album title, track title. Doing so is a more general solution which could be used for other features too.
  • It passes events back to the playing web page (and in the process, ensures the web page remains open - not just copying the URL and streaming it directly from the cloud).
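A purely illustrative sketch of those two requirements; none of the property or event names below are specced, they only stand in for "metadata the UA can surface" and "events routed back to the page":

```js
// Hypothetical metadata + event routing for an <audio> element.
const audio = document.querySelector('audio');

// Stand-in app functions (hypothetical application logic).
function markEpisodePlayed() { /* ... */ }
function playNextEpisode() { /* ... */ }

// 1. Metadata the UA could hand to the OS lock screen / notification UI.
audio.mediaMetadata = {                           // hypothetical property
  title: 'Episode 12',
  album: 'Example Podcast',
  artwork: 'https://example.com/artwork.png',
};

// 2. Events routed back to the still-open page, so the app can interpret them
//    (e.g. mark the episode as played) rather than the OS streaming the URL.
audio.addEventListener('remotenexttrack', () => { // hypothetical event name
  markEpisodePlayed();
  playNextEpisode();
});
```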


foolip avatar foolip commented on August 12, 2024

OK, maybe the lock screen layout is standard. It's apparently up to the app how to handle the previous/next events, though. If the iOS lock screen is equally declarative then it sounds like having metadata would be the only way to deal with it.

The notification tray UI is not standard. I know this because we had to write our own in Opera for Android. This is also how DoggCatcher can have a special button in this UI that isn't in the lock screen UI.

Not sure what the app's full-screen player would be, AFAICT Doggcatcher never enters fullscreen.


jernoble avatar jernoble commented on August 12, 2024

A brief clarification on the iOS model. Generally speaking, each app has a single shared instance of an AVAudioSession. That shared session has a category describing how it interacts with other apps' audio. Some categories allow audio to play in multiple apps simultaneously (a.k.a. "are mixable"), such as AVAudioSessionCategoryAmbient. Others will cause playback to interrupt other non-mixable sessions, like AVAudioSessionCategorySoloAmbient and AVAudioSessionCategoryPlayback.

iOS WebKit uses AVAudioSessionCategoryPlayback for <audio> & <video> and AVAudioSessionCategoryAmbient for Web Audio. So starting playback of a <video> element will interrupt Spotify and Music. But starting a Web Audio context will not.

These AVAudioSessions are kept in a system-wide "most-recently-active" list.

There is another per-app object called MPRemoteCommandCenter which manages what remote commands the app supports. iOS WebKit supports playCommand, pauseCommand, togglePlayPauseCommand, seekBackwardCommand, and seekForwardCommand. Other interesting commands are nextTrackCommand, previousTrackCommand, likeCommand, and dislikeCommand. Commands are sent to the app with the most-recently-active AVAudioSession. iOS WebKit keeps its own list of most-recently-active <audio> and <video> elements, and further routes the commands to the most-recently-active one.
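To make the routing part concrete, here is a toy JavaScript model of that "most-recently-active" bookkeeping; it is purely illustrative and not how WebKit is actually implemented:

```js
// Toy model of "route remote commands to the most-recently-active element".
const recentlyActive = [];                 // media elements, newest first

function noteActivation(el) {
  const i = recentlyActive.indexOf(el);
  if (i !== -1) recentlyActive.splice(i, 1);
  recentlyActive.unshift(el);              // move (or add) to the front
}

document.querySelectorAll('audio, video').forEach((el) => {
  el.addEventListener('playing', () => noteActivation(el));
});

// Called when the platform delivers a remote command (play, pause, seek...).
function routeRemoteCommand(command) {
  const target = recentlyActive[0];
  if (!target) return;                     // nothing has played yet
  if (command === 'play') target.play();
  else if (command === 'pause') target.pause();
  // togglePlayPause, seekForward, seekBackward, etc. would follow the same pattern.
}
```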


mahemoff avatar mahemoff commented on August 12, 2024

@foolip

"It's apparently up to the app how to handle the previous/next events, though"
Agree. For example, this lets a number of podcast apps have a setting to assign << and >> to seeking within track instead of jumping to next/previous.

"The notification tray UI is not standard."
Agree. Rich notifications on Android are custom UI with custom (simple) interactions.

"Not sure what the app's full-screen player would be"
A lot of music and podcast apps have their own full-screen player mode which usually launches when you start playing something. Not really relevant to media keys, just wondered if that's what you meant (nope).


mahemoff avatar mahemoff commented on August 12, 2024

@jernoble That's really interesting that iOS supports parallel audio and video like that. I think in most cases that would not be desirable behaviour; it would usually be better to treat them as competing for the same focus.


jernoble avatar jernoble commented on August 12, 2024

@mahemoff And usually, it does treat them as "competing for focus". Most media playback is going to be simple video or music playback, which would get an AVAudioSessionCategoryPlayback category, and those playback sessions would interrupt previous ones.


jernoble avatar jernoble commented on August 12, 2024

@foolip

OK, maybe the lock screen layout is standard.

To continue with my iOS model explanation, the lock screen layout on iOS is not "fixed" per se. If the app with the most-recently-active session signs up to handle "togglePlayPause", "seekForward", and "skipBackwards" commands, the lock screen will display "⟲►↠" (i.e., a podcast-like set of controls). If instead it supports "dislike", "like", "togglePlayPause", and "nextTrack", it will display "★►→" (i.e., a Spotify-like set of controls).


yvg avatar yvg commented on August 12, 2024

Hey, thanks for launching this initiative and pointing me to it, much appreciated.
So I've been reading through your comments and the initial discussion, and would argue that the simpler, and the more decoupled from the Audio/Video specs, the better.
While content providers try to migrate all their content to HTML5 Audio/Video, @marcoscaceres already pointed out that some applications still rely on 3rd-party technologies such as Flash, Silverlight, etc.; there are many reasons for that, stretching from technology to legal issues. A solution that doesn't allow interaction with content from such sources would fail immediately, since almost every content provider still relies on these technologies, lightly or heavily, to let people stream audio and/or video, especially on platforms such as desktops/laptops and TVs.

Before releasing a draft and/or MVP we need to answer these questions imho:

  • What happens when no app has registered to control the keys on mobile, on desktop, or on TV, but you have opened an app that could potentially play, and you hit the play/pause media key?
  • Do we need to trigger playback manually by touching/clicking a button to register an application when another app is already playing? (I personally would be in favour of this; it is what Android & iOS do. The WP8 approach might be very confusing in a desktop environment when all you wanted to do was pause an already-playing track from another service.)
  • Do we want to queue tracks in order to take advantage of prev/next? And if so, exclusively from the service that controls playback?

Metadata, UI customisations, and implementation details will be easier to consider once we have consensus on a basic set of user interactions like these.


jernoble avatar jernoble commented on August 12, 2024

@yvg

...3rd-party technologies such as Flash, Silverlight...

Flash and Silverlight already have access to their platform's remote commands by virtue of being plugins. There's nothing stopping Flash from adding support for hardware media controls and providing it as API within their own runtime.


marcoscaceres avatar marcoscaceres commented on August 12, 2024

Flash and Silverlight already have access to their platform's remote commands by virtue of being plugins.

@jernoble, I was not aware of this. I tried finding information about this API (for flash), but couldn't find anything :( Do you have a link?

There's nothing stopping Flash from adding support for hardware media controls and providing it as API within their own runtime.

This statement holds true. However, given the decline of both technologies, and commitments by both Microsoft and Adobe to the Web (+end of life of Silverlight), I'm betting this would be unlikely. Also, if we standardize this in browsers, it might encourage more media apps that rely on Flash to move to the Web.


foolip avatar foolip commented on August 12, 2024

Thanks for that lock screen explanation @jernoble, it sounds like even though it isn't fixed it's still far simpler than doing a custom Web-based UI. BTW, are you happy with this model, or would you want something different if you were to implement a media keys/focus API in WebKit for iOS?

If supporting the iOS lock screen model is a goal, then there aren't too many ways of designing an API. We would need (1) a way to sign up to handle a set of lock screen actions (play/pause/skip/like/etc.) (2) some place to deliver the events when those actions are taken and possibly (3) a way to query which actions are supported, for feature detection.


yvg avatar yvg commented on August 12, 2024

@jernoble

Flash and Silverlight already have access to their platform's remote commands by virtue of being plugins. There's nothing stopping Flash from adding support for hardware media controls and providing it as API within their own runtime.

Yes, my point was to think about providing a mechanism that is independent from playback technology, which acts as a bridge and provides means to application authors to handle playback themselves with the technology they use. Using an existing event based API, e.g. adding new key codes or providing a new API would be sufficient to achieve this.


richtr avatar richtr commented on August 12, 2024

@yvg

my point was to think about providing a mechanism that is independent from playback technology, which acts as a bridge and provides means to application authors to handle playback themselves with the technology they use.

Is it important for the UA to be able to enforce media playback, pausing, seeking and resuming or should we expect web developers will always do the right thing and pause, seek and resume their media when they are told to pause, seek and resume via a decoupled API?

A strongly-bound API would enable a UA to enforce logical behavior. A loosely-bound API is likely to lead to a poor and inconsistent user experience across different web pages.


jernoble avatar jernoble commented on August 12, 2024

@marcoscaceres

I was not aware of this. I tried finding information about this API (for flash), but couldn't find anything :( Do you have a link?

Oh, I don't believe they've added this API yet, but they could. On OS X, at least, Flash & Silverlight run as native plugins, with all the same access to system APIs as the browser. (Modulo sandboxing, of course.) So they have the same access to system-level remote control commands as browser vendors do.

This may not be the case with Chrome and their PPAPI-based plugins, however.

However, given the decline of both technologies, and commitments by both Microsoft and Adobe to the Web (+end of life of Silverlight), I'm betting this would be unlikely.

I agree, but that also makes the case against including plugins in the scope of this feature. They're not going to do the work necessary to support this feature anyway.


jernoble avatar jernoble commented on August 12, 2024

@yvg

Yes, my point was to think about providing a mechanism that is independent from playback technology, which acts as a bridge and provides means to application authors to handle playback themselves with the technology they use. Using an existing event based API, e.g. adding new key codes or providing a new API would be sufficient to achieve this

Except that authors who use Flash are going to want this support inside their Flash application, not out in the DOM in JavaScript.

Flash does not currently support a "play()" or "pause()" API on Flash objects from JavaScript. I'm doubtful they'd add one just for this feature.


jernoble avatar jernoble commented on August 12, 2024

@foolip

If supporting the iOS lock screen model is a goal, then there aren't too many ways of designing an API. We would need (1) a way to sign up to handle a set of lock screen actions (play/pause/skip/like/etc.) (2) some place to deliver the events when those actions are taken and possibly (3) a way to query which actions are supported, for feature detection.

Assuming for the moment that is a goal, wouldn't this be as simple as (1) defining a new message for each remote command we want to support? (2) The event would be fired at the "focused" media element. (3) I'm not sure this is necessary if we're just using event names. You would listen for all the events you would like to handle, and the UA would decide what subset of those to display on the lock screen. Alternatively, clients could do feature detection by checking, e.g., typeof(video.onremotecontrolplaycommand) === 'undefined', if we defined EventListener properties for these events on HTMLMediaElements.
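From a page's point of view, that event-per-command shape could look something like the sketch below; the event and handler names, such as remotecontrolplaycommand, follow jernoble's illustration and are not specced:

```js
// Sketch only: event names follow the illustration above and are not specced.
const video = document.querySelector('video');

// (3) Feature detection via the presence of an EventListener property.
const supportsRemotePlay =
  typeof video.onremotecontrolplaycommand !== 'undefined';

// (1) + (2): listen for the commands the page wants to handle; the UA could
// use the registered set to decide which lock-screen controls to surface.
if (supportsRemotePlay) {
  video.addEventListener('remotecontrolplaycommand', () => video.play());
  video.addEventListener('remotecontrolpausecommand', () => video.pause());
}
```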


yvg avatar yvg commented on August 12, 2024

@richtr

A strongly-bound API would enable a UA to enforce logical behavior. A loosely-bound API is likely to lead to a poor and inconsistent user experience across different web pages.

As much as I would love this to be true, the reality is that the biggest content providers still rely on Flash and other technologies for playback; providing an API that doesn't allow authors to trigger playback on such technologies would lead to an unusable implementation, and authors would most certainly ignore it.
This is true for platforms like SoundCloud, YouTube, Rdio, Spotify, Deezer, etc., where Flash is used side by side with HTML5 and is not going to disappear any time soon.
For example, in SoundCloud's architecture, triggering a play event is performed on a source wrapper, which acts as a bridge to different technologies.


jernoble avatar jernoble commented on August 12, 2024

@yvg

As much as I would love this to be true, the reality is that the biggest content providers still rely on Flash and other technologies for playback

This is to support legacy browsers which do not support native <audio>, <video>, and Web Audio features. And these legacy browsers are never going to add APIs for remote control events.

This is true for platforms like SoundCloud, YouTube, Rdio, Spotify, Deezer

SoundCloud, Youtube, and Rdio all currently work without Flash installed (In Safari on OS X). Spotify is not a web-app, it's a native app. And Deezer is not available in the US (edit: so I can't check whether it works without Flash installed).


yvg avatar yvg commented on August 12, 2024

This is to support legacy browsers which do not support native <audio>, <video>, and Web Audio features. And these legacy browsers are never going to add APIs for remote control events.

Not really, this is to support features that browsers aren't providing yet, or with which authors aren't satisfied yet, e.g. adaptive streaming, live streaming, protected content, etc

SoundCloud, Youtube, and Rdio all currently work without Flash installed (In Safari on OS X).

They do partially, YouTube still serves Flash as a default to Firefox, it also relies on Flash for live streaming, Flash is currently used by SoundCloud to support content over RTMP, etc

Spotify is not a web-app, it's a native app. And Deezer is not available in the US.

Spotify has a webapp: play.spotify.com, and Deezer is available in the US, not to mention that this isn't an argument to begin with, as the Web crosses borders and a spec we come up with must work independently of your geographical position.


jernoble avatar jernoble commented on August 12, 2024

@yvg

Not really, this is to support features that browsers aren't providing yet, or with which authors aren't satisfied yet, e.g. adaptive streaming, live streaming, protected content, etc

Those features are all present in the web platform. We shouldn't spec additional features to work around browsers who haven't implemented existing features.

They do partially, YouTube still serves Flash as a default to Firefox, it also relies on Flash for live streaming, Flash is currently used by SoundCloud to support content over RTMP, etc

YouTube uses Media Source Extensions to serve live streaming video on platforms which support it. So the solution is for UAs to implement (or improve) those existing features.

Spotify has a webapp: play.spotify.com, and Deezer is available in the US, not to mention that this isn't an argument to begin with, as the Web crosses borders and a spec we come up with must work independently of your geographical position.

Ah and I see play.spotify.com requires Flash. Well, as I said earlier, they can convince Adobe to add remote control support to the Flash runtime. As I explained in my edit, my only point about Deezer was that I couldn't verify whether it worked without Flash installed.


foolip avatar foolip commented on August 12, 2024

@jernoble, yes, I think it's about that simple and that sounds fine, constraining the possible design isn't necessarily a bad thing :) We would have to have a look at how lock screen controls on Android work. At the very least it also supports adding album art as the background, but I don't know if it's a declarative thing or if everything is custom UI.


foolip avatar foolip commented on August 12, 2024

As for Flash, I previously argued with @richtr that, all else being equal, it would be good to have an API that stands on its own and could be used to control media elements, Web Audio or even Flash. However, all else does not seem to be equal, as at least iOS seems to require that playback begins before granting audio focus, if that's the right terminology.

I don't know how to blend that with Flash, where all the browser can tell is that audio is being produced. If that were sufficient, then it would be sufficient for non-Flash content as well and we wouldn't need any new API surface.

@jernoble, do you think tying into the HTMLMediaElement playing event or similar would be a good match for implementing on top of iOS, or how would you deal with this?


marcoscaceres avatar marcoscaceres commented on August 12, 2024

Closing this issue, as the use cases discussed here are covered in the README. Please send further use cases in the form of bugs or pull requests.

