iiif / iiif-av Goto Github PK

The International Image Interoperability Framework (IIIF) Audio/Visual (A/V) Technical Specification Group aims to extend to A/V the benefits of interoperability and the growing ecosystem of clients and servers that IIIF provides for images. This repository contains user stories and mockups for interoperable A/V content – contributions are welcome.

Home Page: http://iiif.io/community/groups/av/

License: Apache License 2.0

iiif-av's People

Contributors

Stargazers

Watchers

Forkers

edsilv anasghrab sshyran

iiif-av's Issues

Virtual timeline to annotate across media resources

Use Cases

Split up the media into separate files, but should be played back to be perceived as a single stream
Playback of multiple resources overlapping in time, such as music and a video at the same time.

Proposed Solutions

Use a SC Canvas extended with time to associate the resources with. The time dimension on the Canvas provides the virtual timeline.

Additional Background

Source: BL workshop notes
Interest: 100%

Custom media players for file types/delivery platforms that aren't directly supported

Description

Custom media players for file types that aren't directly supported, or subhosted via youtube. Something distinct from just a video file, but actually an iframe.

Variation(s)

Annotations on custom content (e.g. annotate youtube) -- requires enough API to get time update events
Control the embedded player from surrounding UI (eg, side-by-side linking to common points in two alternate videos) -- requires enough API to start/stop and seek. Description: Annotate content where the content can't be exposed to the annotation tool. Control of a player that has the bits, and annotation part can't see it. Esp for commercial content providers. [related to iframe player]
Player doesn't understand the content, but can be annotated.

Additional Background

Source: BL workshop notes
Interest: 90%

Technical discussion

Issue: postMessage() to communicate with iFrame; maybe standardize interface? Would be super valuable. Look at youtube API. Likely need a way to describe the interface supported. This may need a separate spec; see quick survey of state of art.
Initial implementations could probably hardcode use of a few site-specific APIs such as YouTube’s to demonstrate basic effectiveness.

[If folks are interested in further followup on iframe embedding standardization, see https://github.com/iframe-player/iframe-api-docs where @brion will be collecting details and trying to organize something. Feel free to add ideas there as issues. Will try to make sure editable pages are up or can add people.]

Use of profile for the API to use, e.g. for youtube?
Multiple remote services, like multiple formats. Toss in a YouTube link, and a local MP4, and let the player decide which to load based on availability.

Scraping YouTube for the raw mp4 would be so much easier in theory… but unreliable, and won’t work with content protection, etc. Would also violate their ToS.

Stability of iframe URLs, parameters? (Seem to mostly be stable for YouTube and Vimeo.)
Extra requirements and oddities -- youtube says “must” be 200x200px (see https://developers.google.com/youtube/iframe_api_reference) What’s that mean? :D

Comparing with http://wellcomelibrary.org/iiif/b16748967/manifest example:

         "rendering": [
            {
              "@id": "https://s3-eu-west-1.amazonaws.com/wdl-video-open/mp4/80e62970-d26a-403e-8aa8-480de6d22495.mp4",
              "format": "video/mp4"
            },
            {
              "@id": "https://s3-eu-west-1.amazonaws.com/wdl-video-open/webm/80e62970-d26a-403e-8aa8-480de6d22495.webm",
              "format": "video/webm"
            }
          ],

Let’s imagine extending that to also include a remote link and an API reference, something like:

            {
              "@id": "https://www.youtube.com/embed/dQw4w9WgXcQ?enablejsapi=1",
              "format": "text/html",
              "profile": "https://developers.google.com/youtube/iframe_api_reference",
            }


Or more simply with a fake type, where the player would know how to add the api-specific iframe parameters:

            {
              "@id": "https://www.youtube.com/embed/dQw4w9WgXcQ",
              "format": "x-video/youtube"
            }

We might want to look to the service pattern to embed the YouTube @id & profile as above (see image annotation resources in the Presentation API). [This could avoid needing a separate info.json for every youtube video you want to embed?]

Note use case given where YouTube video was used preferably but a local file needed to be provided for China. Providing both the embed link and the .mp4, in preferred order, should work same as including an .mp4 and a .webm: player can know that YouTube won’t work locally and skip over it.

John Dyer of MediaElement.js was working on an HTML5 Video Wrapper for the YouTube and Vimeo embedded player APIs (seems to still be maintained as part of MediaElement.js). This is very similar data model, taking the embed URL in a element just like a raw mp4 or webm file reference. Blog post: http://johndyer.name/html5-video-wrapper-for-youtube-and-vimeo-api-mediaelement-js/, Example: http://mediaelementjs.com/examples/?name=youtube
The shim itself: https://github.com/johndyer/mediaelement/blob/master/src/js/me-shim.js

Popcorn.js has something similar: http://popcornjs.org/popcorn-docs/media-wrappers/

Client feature: Display of audio as waveform or sonogram

Description

Client feature: Display of audio as waveform or sonogram. For example, speech and music can be easily seen in waveform. Sonogram for wildlife. [Generalize along with audio postprocessing?] Web audio APIs let you generate this. Way to refer to a waveform file that was pregenerated? https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API

Proposed Solutions

client feature only? Question as to how to implement it efficiently. Could have special service / seeAlso block that delivers the data for the waveform, rather than calculating it from the audio stream. Can something be identified and adopted for this?
Soundcloud API for waveforms: https://developers.soundcloud.com/blog/waveforms-let-s-talk-about-them
http://www.waveformjs.org
http://www.bbc.co.uk/rd/blog/2013/10/audio-waveforms
https://github.com/bbcrd/waveform-data.js/blob/master/README.md

Additional Background

Source: BL workshop notes
Interest: 50%

Content protection

Description

Content protection. View the content but don't need to authenticate. Mostly for IPR issues. HTML5 video player, solution was to use a unique token every time. Play content once with the token. Can go forwards and backwards. But on a download manager you can't get it. Stronger than Image API protection. Would need to webscrape javascript.
Need to pass token to the streaming environment. Could be in URL params, cookie, header etc.

Proposed Solutions

Tags: download
IIIF: Auth API

Issue: Question as to whether a redirecting video/audio element src redirects might cause problems? Otherwise can have a IIIF implementation of the Auth API, either as a shim on top of existing system or as a new system.

Auth API looks like it should work with HTML5 video/audio as long as resource is on the same domain as auth server (cookies), but still allows downloading the local file from the browser once authenticated.

Solution: Specify token service at level 3?
MPEG-DASH and HLS can make it hard to download.

IIIF Tile Harvester in Python:
https://github.com/azaroth42/iiif-harvester/blob/master/tilescraper.py

Additional Background

Source: BL workshop notes
Interest: 60%

Audio manifest only

Description

Audio manifest only. No x,y at all, just duration.

Proposed Solutions

Need to make x,y dims for canvas optional (iff t given)

Additional Background

Source: BL workshop notes
Interest: 100%

Non-linear time alignment

Description

Non-linear time alignment. Different tempos of recording of music. Align points, but not in time. Same score, different audio times. (http://beethovens9thsymphony.touchpress.com/ is a realisation of this use case). See also NYPL Labs video web tool to combine multiple recordings (http://digitalcollections.nypl.org/tools/video/)
Wikipedia use cases. Music / speeches where would want to mark up points from one alternative version to another.
Might be able to be done automatically. Good to look at NYPL offering.
Refer to multiple points at the same time, and be able to state they're the same abstract thing.

Variation(s)

Annotations on video, with multi-camera. Diff positions based on version of video being watched

Proposed Solutions

Annotation that tags multiple canvases to indicate aligned points in the different canvases.

Additional Background

Source: BL workshop notes
Interest: 65%

Refer back to original source from segment

Variation(s)

need to not lose rights statement on source for the segment

Proposed Solutions

Video API @id, can reference info.json for rights. In Presentation API, can have link to service.

Additional Background

Source: BL workshop notes
Interest: 60%

Request (for download) only a particular audio stream from a video.

Proposed Solutions

Level 4! :) Put name of track into the media API slot. Very high in the fruit tree.

Additional Background

Source: BL workshop notes
Interest: 60%

Ability to describe a master file that's not available

Description

Ability to describe a master file that's not available. Might have only delivery of different formats from it. Do publish metadata about it. Want consistent time codes between master and delivery copy. Behind the scenes implementation detail that user doesn't care about. e.g. someone annotated interesting part of derivative video, can then go back to master to re-extract more high quality version.

Scope: ?

Implementation detail? System needs to know the background mappings (e.g. jp2 -> png in Image API, without making jp2 available).

Give access to full quality, but only small segments. Max but for temporal region, rather than size.
Describe the image qualities of a single frame extracted from the underlying video source.
Turn video/canvas+time into image api identifier, and fetch info.json for it. -> relates to Download still frame from video as image use case and how that service is described

Additional Background

Source: BL workshop notes
Interest: 60%

Slideshow view of images with audio

Description

Slideshow view of images with audio. Paint the canvas over time with image annotations.
E.g. Audio podcast about painting, bring up image of painting. Audio with slideshow.
[Related to missing scene] woord.nl

Proposed Solutions

One canvas with images painted for particular time ranges (existing case of no time range means always present over whole duration), alternate solution is to have a sequence with one image each and a segment of the audio.

Additional Background

Related IIIF: Annotation, Canvas

Source: BL workshop notes
Interest: 100%

Tagging of content. e.g. link to streetview of the scenes

Description

(Geo)tagging of content. e.g. link to streetview of the scenes. Movie locations. Or tagging people with identities such as ancestry or orcid. [Related to tagging]

Proposed Solutions

Annotations, motivation: tagging.

Additional Background

Source: BL workshop notes
Interest: 70%

Ability to control A/V playback

Description

Ability to control playback. Should be able to play, pause, skip ahead/back, change volume, adjust balance (over channels).

Variation(s)

Turn off user's ability to control playback.
include this support for remote iframes. (iframe API issue survey)
Know when resource is ready to play. Event when enough downloaded to start playing. Then get new events when playback begins. (HTML5 video provides all this, as does YouTube API.)

Proposed Solutions

For case of HTML5 all of these facilities are supported. Events will be different for iframe scenario. LevelA? LevelImpossible? For now be able to say “this iframe supports youtube/whatever api” for extended clients to support if they want? In the future try to standardize it, but will have to get Google on board probably (eek!). [NB I think it would be extremely useful to have a spec for this that academic-based media providers could implement, even if YouTube is not on board. -Jon D.]

Additional Background

Interest: 100%
Source BL workshop notes

Describe media type, aspect ratio and resolution

Description

Describe media type, aspect ratio and resolution. e.g. if it's 4/3 or 16/9 aspect ratio

Variation(s)

Pixel dimensions. Display aspect ratio can be different from encoded.
Support for different file formats and codecs. At BL MPEG family are most common (MPEG-1 layer 3 audio, and MPEG-4). Some like Wikipedia don't use MPEG: webm for video, ogg for audio. Most things work in most browsers most of the time...
User might care about format for download. Analysis of content would want full bandwidth.
Describe what you have and then let the client choose.
Choosing which compression/format is used. Manual selection might be niche case vs auto-negotiation most common for playback. Cases: spoken word vs music vs audio for analysis.

Proposed Solutions

Properties on info.json.

Additional Background

Source: BL workshop notes
Interest: 100%

Useful to surface the time code that's currently played

Description

Useful to surface the time code that's currently played. Want to trigger events based on time point being played in the browser. HTML5 video can be queried for time points. Popcorn. Offset from start point? Multiple items playing (need global clock)? Custom iframe players (youtube assets in iframe, etc) - how does player get the low-level time?

Proposed Solutions

Easy with regular video stream, as get it from HTML5. Hard with iframe, needs spec.

Additional Background

Source: BL workshop notes
Interest: 100%

Internationalization

Description

Internationalization: Audio streams in welsh / english, along with subtitles in Welsh/English.

Proposed Solutions

Embedded textual content handled via JSON-LD. Options:

Single annotation with a Choice of different resource, by language, as the body.
Layer per-language with annotations with a single body.
Refer to subtitles resource in existing format as the body of an annotation
?? expose existing srt/vtt files and burned-in captions in info.json? (or TTML)

For situations when there are more than one video associated with a canvas, it might be useful to associate the subtitles (etc) with the canvas as individual annotations.
For situations when there is a single media stream, it would be easier to use existing subtitle resources and have the player render them. → Reference from info.json to different language subtitle files (outside of presentation API)
Subtitles that should be rendered by the viewing application in a separate window/panel should be annotations on the canvas rather than a subtitle file.
Transcription (with speaker identification) != captioning
SRT “spec” http://srt-subtitles.com/ & https://en.wikipedia.org/wiki/SubRip - very simple entries: entry number, time range, text (with limited HTML)

Use case for zones: associate timestamps of subtitles with the zone, and then associate zone with the time frame of multiple canvases.
Can associate annotations with the canvas.
Link from video-info.json to a zone.

Additional Background

Source: BL workshop notes
Interest: 100%

Get access to a full video/audio stream and play it

Description

Get access to a full video/audio stream and play it.

Proposed Solutions

Media API equivalent to Image API:
Level0 equivalent is standalone content where either the server doesn’t support byte ranges (!!) or the format makes this useless anyway (one has to download file)
Level1 as more dynamic. Allow range-based seeking within file. (.mp4 or .webm on apache/nginx/etc, or HLS or MPEG-DASH) - expected to be the common baseline
Level2 provide downloads of rewritten subsets of media content on demand.
LevelA use of existing APIs (e.g. youtube/vimeo/soundcloud) -- need different information in the service description for how to interact with the API. Might a different service? info.json equivalent could be also translation table to other existing APIs. (iframe API issue survey)

Additional Background

NB Can seek within a video stream from the client more easily than "seeking" into image. If need to play from A to B in video, only have to load the header and then seek to point A via HTTP ‘Range’.
Note also that some file types don't support seeking. Need to expose seekable/unseekable as metadata (supports: …)?

Source BL workshop

Playback (audio and video) at different speeds

Description

Playback (audio and video) at different speeds. Multiplier on the speed. Open question as to whether its client or server.

Issue: If there are multiple video/audio on a single canvas at the same time, then the multiplier applies to the canvas not the stream, otherwise the time point of the canvas is wrong.

Variation(s)

Whether to resample audio or not. (control in scope?)

Proposed Solutions

Client application concern, not interop.

Additional Background

Source: BL workshop notes
Interest: 100%

Viewer level: user level comparison of video

Description

Viewer level: user level comparison of video (two manifests/sequences or two canvases). (2-up) [does this need synchronized playback between both instances? maybe?]

Proposed Solutions

client feature simple matter of lines of code :)

Additional Background

Source: BL workshop notes
Interest: 75%

Use range/track as target of send to playlist

Description

[See reference use case]

Proposed Solutions

client feature, to send URI of range / track to the playlist thing

Additional Background

Source: BL workshop notes
Interest: 50%

Full text search of transcribed speech

Description

Full text search of transcribed speech. from speech to text, rather than OCR or manual.

Proposed Solutions

Search API for annotations. If it's not annotations, then it's out of scope

Additional Background

Source: BL workshop notes
Interest: 100%

Annotate (refer to) the audio and video channels separately

Description

Annotate (refer to) the audio and video channels separately, even if displayed with both at once and part of a single representation (video stream with audio). Example of annotating the instruments vs technique vs music against different streams. Layers of annotations.

Variation(s)

Allow any level of granularity (left channel vs right channel, welsh/english tracks, audio vs video, musical instrument separation, etc) as part of the description of the target of the annotation.

Proposed Solutions

Use media fragment #track= in the URI and then list the available track names in the video.json. Default name of "video".

"tracks": [
  {"value": "audio", "language": "en", "type": "Audio", "label": {"en": "The English Audio Feed"},
  {"value": "subtitles", ...},
  {"value": "video", ...}
]

Additional Background

Issue: Hard! :)
Source: BL workshop notes
Interest: 80%

Be able to describe navigation and hierarchy

Description

Be able to describe navigation and hierarchy. Tracks /song within album regardless of carrier. Logical structure.

Variation(s)

Description of different segments.
Be able to describe physical carrier structure. Disk, Side/Side. Disk2, Side/Side. DSotM. Ethnographic audio recordings.

Proposed Solutions

Related IIIF: metadata on Range

Additional Background

Source: BL workshop notes
Interest: 100%

Embeddability in twitter et al. Player cards.

Description

Challenge: Twitter needs content to come from a whitelisted host. (A particular presentation player, such as Universal Player, could expose the ‘player card’ interface and be whitelisted, which should allow any AV resource to be playable as long as linked through there.)
https://dev.twitter.com/cards/types/player
Facebook Open Graph:
https://developers.facebook.com/docs/sharing/webmasters#media
https://www.jwplayer.com/blog/publish-your-videos-to-facebook-with-a-jw-player/

Proposed Solutions

Issue: Tricky to get it to work. Consider also sharing segments of video (e.g. segmentation in media api). Support for cards at the API level okay, but need to be approved.

Not clear that there is anything needed for specifications here.

Additional Background

Source: BL workshop notes
Interest: 65%

Auto-play across annotations on canvas

Description

As a lay user, I want to control when the audio stream starts and stops at the individual page level, rather than line by line or note by note, in order to not have to click many times and to experience the full page worth of music.

Variation(s)

(do you know of, or can you imagine, similar use cases?)

Functionality

It should be possible to progress from one audio annotation to the next on a single canvas, in the same as progressing from canvas to canvas with or without a gap as per #12. Thus auto-play annotation two after playing annotation one.

Proposed Solutions

Annotation List could provide a hint as to the continuous-ness of the annotations. Lists are ordered already, so it would be a case of playing from the beginning of the list.

If there are multiple points to break at, this could be multiple lists each with the hint, or the connection could be at the annotation level rather than the list level.

Additional Background

From Shared Canvas demo2, Worlde's Blisce.

Different streams, with a gap, and describing the gap

Description

Related IIIF: Sequence (with some other viewingHint, and maybe some other prop. for length)

If the controller should reset the time to zero, then you need a new canvas in a sequence. If the controller should continue in the same time then it should be a zone on a single canvas.

Canvas for the gap with short duration. viewingHint to say "gap".
For single canvas case, might either have a zone or just leave the time space empty.

Variation(s)

(do you know of, or can you imagine, similar use cases?)

Proposed Solutions

(any ideas about how your use case might be supported)

Additional Background

Source: BL workshop notes
Interest: 100%

Download still frame from video as image

Description

Download still frame from video as image. Or series of frames as images.
Level 3 (unless implemented in client). Output is a different type (image) which distinguishes this use case from the selection of a segment of video. Media fragment spec provides way to specify point in time / frame to extract.
For bonus points: create Image API service for the frame
put video identifier + region/frame as identifier of image API ?

Additional Background

Source: BL workshop notes

Build diagrams for describing scenarios

The use cases should have diagrams where possible to help explain the issue and proposed solution.
Valuable to do together at The Hague, if not before if time allows.

Diagrams to be managed in the repository, and linked from the individual issues. This will let us build a use case document as we progress from the issue descriptions and images.

Refer to a spatial bounding box of video (by area)

Description

Refer to a spatial bounding box of video (by area)

Proposed Solutions

xywh media fragments (same as for images)

Related IIIF: Canvas # Fragments

Additional Background

Source: BL workshop notes
Interest: 100%

Gapless playback of A/V -- serial delivery, multiple files for a single recording

Description

Implication: Assume default behavior is user actuation of control to step from canvas to canvas. Use continuous hint to change behavior. Note that need to queue up next canvases resources before finishing the first one (and audio processing issues to get nice continuation of one track to another). [Sample-accurate gapless playback is possible via Web Audio. There are libraries existing to help with this which a player can use.]

Note alternate option of putting multiple tracks on one canvas. Here the amount of overlap, gaplessness, or gaps would be determined by positions on the shared canvas.

Related IIIF: Sequence w/ e.g. viewingHint=”continuous”

Additional Background

Source: BL workshop notes
Interest: 100%

Access restrictions: prevent replay of part of a recording, e.g. redacting

Description

Access restrictions: prevent replay of part of a recording, e.g. redacting. By time code. On site vs external access. Geo-locking for out of copyright in UK but still in for US.

Variation(s)

Annotation of the restriction

Proposed Solutions

pre-constructed versions, with redirects. Not a client side thing, as clients could simply ignore it. So can't expect client to not play something it has access to.

Additional Background

IIIF: Auth
Source: BL workshop notes
Interest: 90%

Missing or lost scene

Description

Missing or lost scene. Virtual reconstruction. Blank time space that can be annotated.

Additional Background

Source: BL workshop notes
Interest: 40%

Increasing / decreasing audio delay

Description

Increasing / decreasing audio delay. Same with subtitles.
Could be variation of synchronization use cases.

Proposed Solutions

Q: Requires subframe accuracy?
Seems like a client feature to provide a control, or not.

Additional Background

Source: BL workshop notes
Interest: 50%

Request a snippet be created for later download

Description

As it's expensive to create snippets (either by time, region or both) it would be useful to request a snippet, and then be notified later that the content is available for download. This would only be for download, rather than for playback.

Proposed Solutions

Additional Background

From #4 via @glenrobson

Segmentation by time for download

Description

Client should be able to request an arbitrary segment of a resource by time for download.

This is distinct from:

segmentation by region
segmentation by time for playback via "streaming"

Proposed Solutions

Allow temporal dimensions by importing the media fragment spec (SMPTE) into the Media API.
See: https://www.w3.org/TR/media-frags/#naming-time

Thus the URL pattern might be something like:
identifier/t1,t2/xywh/size/?/default.avi

It has implications on an info.json equivalent, where pre-configured temporal "tiles" might be available for download, either by track/chapter/semantic section or simply by subdividing the content evenly.

Additional Background

Difficult to do on the fly, so likely a "level 3" feature

Source: BL workshop notes

Thumbnails of audio/video

Description

Thumbnails of audio/video -- image that represents the video, then sub-thumbnails for time based ranges. For video, and for ranges / different shots.

Variation(s)

Thumbnail as frame vs thumbnail as separate image. [related to summaries]
Animated gif as a thumbnail
Summaries with key-shots etc. [related to thumbnails]

Proposed Solutions

Use existing mechanisms for association of thumbnails with Manifest, Sequence, Canvas, Range etc. Question of how to deal with more structured sets of thumbnails for key-frames etc - use annotation list of images, might need paging for many images (e.g. every 5s for a few hours).

Additional Background

Source: BL workshop notes
Interest: 100%

Display rights / license / attribution along with content

Proposed Solutions

rights fields from image/prezi api

Additional Background

Source: BL workshop notes
Interest: 100%

Mock-up use of Selectors in Presentation API for video and audio transformations

Description

Based on the discussion at the Hague working group meeting, we discussed some potential solutions for requesting transformations to videos that mirrored the kinds of transformations that can be done with the Image API. The set of possible transformations for video (and even audio) is potentially very large. Before we begin trying to specify what an A/V bitstream API might look like and because there is a large number of videos currently delivered, it would be good to get a better sense of what is most important. We can begin experimenting with transformations and possible parameters before trying to specify a bitstream API by giving information to clients through the Presentation API. One proposed solution which appeared to be a productive direction is the use of Selectors for manipulations on a video or audio file.

We could use a mock-up of what selectors might look like for video and audio. The example(s) would be best if they're within the context of a full manifest.

Variation(s)

video
audio

Additional Background

Image API Selectors:
http://iiif.io/api/annex/openannotation/#iiif-image-api-selector

Draw on the video

Description

Draw on the video. - do it yourself football punditry

Proposed Solutions

Paint svg / image on to a timed segment of the video/canvas

Additional Background

Source: BL workshop notes

A/V Viewer facilities needed to meet accessibility requirements

Description

Viewer facilities needed to meet accessibility requirements. Eg, annotate descriptive audio tracks [Talk to DAISY about audio books]

Proposed Solutions

Viewing Hints (and other metadata) to make things more accessible?
See also WAI-ARIA https://www.w3.org/WAI/intro/aria .
? https://www.w3.org/WAI/WCAG20/quickref/#media-equiv ?
Ability to turn on/off multiple audio/video tracks for visual/auditory impaired (e.g. for Descriptive Video Service audio track).
Annotations with oa:describing motivation.

Additional Background

Source: BL workshop notes
Interest: 100%

Segmentation by region for download [video only]

Description

Be able to request an arbitrary region of the entire video by xywh, rather than by time. This could be used for:

Cropping out sidebars/letterboxes
Multi-screen display

This is distinct from:

Segmentation by time
Reference to region by Annotation

Implication is for download, and that the server will construct a full stand alone video representation.

Proposed Solutions

Incorporate region into the AV API, in the same way as for the Image API, using the Media Fragment syntax.

Has implications on an info.json equivalent for the AV API where different "tiles" of the video could be pre-generated up front. Would need to be distinct from the entire video at different resolutions.

Additional Background

Probably a "level 3" feature due to complexity.

Source: BL workshop notes
Interest: 100%

Zoom into videos

Description

Zoom into videos. CSI "enhance" feature or Bladerunner :) Perhaps presentation issue. Load high resolution version of video, rather than cropping and tiling of the video.

Additional Background

Source: BL workshop notes
Interest: 100%
Scope: Version >1

Set of content that includes audio (mp3) plus JPG plus PDF etc.

Description

Set of content that includes audio (mp3) plus JPG plus PDF etc. Might be scanned physical carriers. [related to show the cover etc]

Variation(s)

Virtual mixing board -- music along side a silent film
Multiple videos in a single display space. [related to multiscreen, virtual timeline]
Mash up of images (of sleeve) and audio of content. Same hierarchy as a book. including ,e.g., Image of the disk.

Proposed Solutions

Collection of manifests, either with or without multi-part. Or a manifest with sequence of timed / non-timed canvases

Additional Background

Source: BL workshop notes
Interest: 100%

Refer to a point or range in time of the content

Description

Ability to refer to a point in time. Standard syntax for addressing points. Want to say this chord at this point. SMPTE time codes. Might also need sample level access on audio files -- this 10 sample window. Frequency content of the window. SMPTE, Samples and Time. HTML5 video not great at supporting sample accuracy, get nearest key frame based on time. Could annotate with sample accuracy, but just works as metadata.

Variation(s)

References between multiple representations: image, audio, video, music notation.
As a student, I want to cite a particular time range of a video for my paper.
Segmentation of video, audio by time. Play range of time from video/audio
Variation: Limiting client to playing a certain range (e.g. avoid students being distracted by whole movie when you want them to see a segment for a course)

Additional Background

Related IIIF: Image API Region, info.json, # Fragment on Canvases
Media fragments support NPT, SMPTE (https://www.w3.org/TR/media-frags/#naming-time) and note no problem combining t=123.45&xywh=1,2,3,4 (https://www.w3.org/TR/media-frags/#processing-name-value-lists )
Media fragment spec does not specify mandatory or canonical order of parameters. If using in IIIF we’d probably want to mandate (or at least recommend a canonical order) to avoid the creation of duplicate URIs for the same thing.

Film folks need frame specific references. Can build a more advanced player that supports it. Frames would need to be relative to original scan of the film. Transcoding gets … "exciting".
e.g. freeze on a frame where the original physical film has a scratch. Instead of referring to a point in film by time, we can do it by frame as well

SMPTE allows reference to a frame in the format: hh:mm:ss:ff
eg: 00:01:30:02 is the third frame in 1 minute 30 seconds. (SMPTE spec is closed, but https://en.wikipedia.org/wiki/SMPTE_timecode describes)

Source: BL workshop notes
Interest: 100%

Use Cases

I have an A/V resources. I would like to say something about the first five and a half seconds of that resource.
I have an A/V resources. I would like to say something about what happens at exactly second 5.5
I have an A/V resources. During the first five and a half seconds of that resource I want to draw an annotation, perhaps on the same resource, perhaps another (such as a canvas).

Control of multi-channel audio

Description

Control of multi-channel audio, such as spatial information as to where the audio should come from. BL have a considerable number of ambisonic recordings (https://en.wikipedia.org/wiki/Ambisonics )

Proposed Solutions

Media-fragment spec has track fragment, no notion of channel. Could track be sanely used as channel? There appears to be explicit alignment between tracks in the webaudio spec and tracks in media-fragment spec so it would seem wrong to try to make a map to channels.
Level 3 server could separate them into individual streams.
Web Audio API supports arbitrary number of channels already. Could redirect/remix as necessary.
Might be able to work from video element (?)
http://webaudio.github.io/web-audio-api/#MediaElementAudioSourceNode
Requires suitable CORS headers on the source video
HTML5: https://www.w3.org/TR/html5/embedded-content-0.html#media-resources-with-multiple-media-tracks

Workaround is to encode channels as separate tracks and list them as such.

Additional Background

Source: BL workshop notes
Interest: 50%

Prototype info.jsons

Description

We would like to see some proposals for what folks think an info.json for A/V might look like so that we can make further decisions on what we need in an information package about a video or audio recording.

Variation(s)

video
audio

Proposed Solutions

Yes! Propose solutions. 😄

Additional Background

Notes from the Hague meeting that can help to inform potentially productive directions.
https://docs.google.com/document/d/1cnkOPm7rC9uKeSxorFpu004ZzJxolVLWex5NdxMKnHY/edit

Authenticated access to video/audio

Proposed Solutions

Scope: Auth API
Issue: some video servers require putting a token in the URL, rather than a cookie.

Additional Background

Source: BL workshop notes
Interest: 100%

Synchronization of video/audio stream and subtitles

Description

Requirements will vary depending on whether the subtitles are stored with a video/audio or whether they are separate (e.g. crowdsourced)

Additional Background

Related IIIF: Annotations (Layers and List), Canvas # Fragments

Source: BL workshop notes
Interest: 100%

Play multiple (synchronized) videos

Description

Play multiple (synchronized) videos. Multi-camera and mic recordings in parallel. For example: Use of multi-camera work for analysis of user behaviour. Or capture of slides and audio and video for a lecture.

Easy case of multiple video choice on a single canvas is easy (Choice). Harder case with multiple synchronized devices is the same as the previous use case.

Variation

Multiple: multi-screen presentation. Big screen for video, and separate screen for separate videos to be synchronized. Museum exhibits. e.g. world's fair. http://www.eamesoffice.com/the-work/think/ is the uber-AV use case - can we reproduce this? (there’s a 3D aspect to this too).
Difference: Presentation of the content. Multi camera might have a timecode to sync across different streams. Presentation mode should (hopefully) be consistent.
Requirement: Do not assume that all content is rendered by the same device.

Proposed Solutions

(any ideas about how your use case might be supported)

Additional Background

Source: BL workshop notes
Interest: 60%

Access to A/V via Mobile (responsive), or embedded on a page.

Description

Access via Mobile (responsive), or embedded on a page. Fully featured vs slimline player. Note that Apple require HLS for video over 10 minutes in iOS apps. [Not universally enforced but it’s always a concern.]

Proposed Solutions

Yes :) Different viewing applications can be built.

Additional Background

Source: BL workshop notes
Interest: 100%

Annotation of non API AV content

Description

Need to be able to control playback (volume, time, etc) of non AV-API content, such as Youtube.

Proposed Solutions

Add a service block to an AV resource in the Annotation that associates it with the canvas:

{
  "type": "Annotation",
  "target": "http://example.org/canvas/1",
  "body": {
    "id": "http://youtube.com/video",
    "service": {
      "id": "http://youtube.com/video/api",
      "profile": "uri-for-youtube-api",
      "more": ["params", "go","here", "too"]
    }
  }
}

Additional Background

Expands on #2