readium / webpub-manifest

📜 A JSON based Web Publication Manifest format used at the core of the Readium project

License: BSD 3-Clause "New" or "Revised" License

HTML 100.00%
epub specification json json-ld publications ebooks webpub manifest readium audiobooks

webpub-manifest's Introduction

Readium Web Publication Manifest

The Readium Web Publication Manifest is a JSON-based document meant to represent and distribute publications over HTTPS.

It is the primary exchange format used in the Readium Architecture and serves as the main building block for OPDS 2.0.

Editors:

  • Hadrien Gardeur

Example

{
  "@context": "https://readium.org/webpub-manifest/context.jsonld",
  
  "metadata": {
    "@type": "http://schema.org/Book",
    "title": "Moby-Dick",
    "author": "Herman Melville",
    "identifier": "urn:isbn:978031600000X",
    "language": "en",
    "modified": "2015-09-29T17:00:00Z"
  },

  "links": [
    {"rel": "self", "href": "https://example.com/manifest.json", "type": "application/webpub+json"},
    {"rel": "alternate", "href": "https://example.com/publication.epub", "type": "application/epub+zip"},
    {"rel": "search", "href": "https://example.com/search{?query}", "type": "text/html", "templated": true}
  ],
  
  "readingOrder": [
    {"href": "https://example.com/c001.html", "type": "text/html", "title": "Chapter 1"}, 
    {"href": "https://example.com/c002.html", "type": "text/html", "title": "Chapter 2"}
  ],

  "resources": [
    {"rel": "cover", "href": "https://example.com/cover.jpg", "type": "image/jpeg", "height": 600, "width": 400},
    {"href": "https://example.com/style.css", "type": "text/css"}, 
    {"href": "https://example.com/whale.jpg", "type": "image/jpeg"}, 
    {"href": "https://example.com/boat.svg", "type": "image/svg+xml"}, 
    {"href": "https://example.com/notes.html", "type": "text/html"}
  ]
}

1. Introduction

1.1. Goals

While the Web is the largest collection of interlinked documents ever created, it lacks a mechanism for expressing how a collection of resources, when grouped together, can represent a publication.

Publication formats such as EPUB or CBZ/CBR group these documents together using a container format, making them easier to archive or transmit as a whole. But they also break an important promise of the Web: the resources of a publication are not available through HTTP to any client that would like to access them.

W3C has recently provided a definition for a Web Publication:

A Web Publication (WP) is a collection of one or more constituent resources, organized together in a uniquely identifiable grouping, and presented using standard Open Web Platform technologies.

It also provides a definition for a manifest in the context of a Web Publication:

[...] manifest refers to an abstract means to contain information necessary to the proper management, rendering, and so on, of a publication. This is opposed to metadata that contains information on the content of the publication like author, publication date, and so on. The precise format of how such a manifest is stored is not considered in this document.

The Readium Web Publication Manifest is an attempt to standardize a JSON-based manifest format that follows both definitions.

To facilitate interoperability between EPUB and Web Publications, this document also defines a number of extension points to fully support EPUB-specific features.

1.2. Terminology

Collection
A grouping of some variable number of data items. In the context of this specification, a collection is defined as a grouping of metadata, links and sub-collections.
Full Collection
A collection that contains two or more data items among metadata, links and sub-collections.
Compact Collection
A collection that contains one or more links, but doesn't contain any metadata or sub-collections.
Manifest
A manifest is a full collection that represents structured information about a publication.

1.3. Abstract Model

The Readium Web Publication Manifest is based on a hypermedia model inspired by Atom, HAL, Siren and Collection+JSON among others.

Every Readium Web Publication Manifest is a collection that must contain:

  • metadata
  • links
  • sub-collections

2. Syntax

2.1. Sub-Collections

This specification defines two collection roles that are the building blocks of any manifest:

Role | Definition | Compact Collection? | Required?
readingOrder | Identifies a list of resources in reading order for the publication. | Yes | Yes
resources | Identifies resources that are necessary for rendering the publication. | Yes | No

Both collections are compact collections, which means that they contain one or more Link Objects.

All additional collection roles are defined in the Collection Roles registry.

Extensions that are not registered in the registry must use a URI for their role.

A manifest must contain a readingOrder sub-collection.

Other resources that are required to render the publication should be listed in resources.

All resources listed in readingOrder and resources must indicate their media type using type.

Example 1: Reading order and list of resources

{
  "readingOrder": [
    {"href": "/chapter1", "type": "text/html"},
    {"href": "/chapter2", "type": "text/html"}
  ],
  "resources": [
    {"href": "/style.css", "type": "text/css"},
    {"href": "/image1.jpg", "type": "image/jpeg"}
  ]
}
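The requirements above (a mandatory readingOrder, and a type on every item of readingOrder and resources) can be sketched as a small check. This is a minimal illustration rather than a reference validator; the function name check_sub_collections and the wording of the messages are assumptions made for this example.

```python
def check_sub_collections(manifest: dict) -> list[str]:
    """Return a list of problems found in readingOrder and resources."""
    problems = []

    # A manifest must contain a readingOrder with one or more Link Objects.
    reading_order = manifest.get("readingOrder")
    if not isinstance(reading_order, list) or not reading_order:
        problems.append("readingOrder is missing or empty")

    # Every item in readingOrder and resources must declare href and type.
    for role in ("readingOrder", "resources"):
        items = manifest.get(role)
        if not isinstance(items, list):
            continue
        for index, link in enumerate(items):
            if not isinstance(link, dict):
                problems.append(f"{role}[{index}] is not a Link Object")
                continue
            if not link.get("href"):
                problems.append(f"{role}[{index}] has no href")
            if not link.get("type"):
                problems.append(f"{role}[{index}] has no type")
    return problems
```

An empty list means the two sub-collections satisfy the rules stated in this section; it says nothing about the rest of the manifest.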

2.2. Metadata

JSON-LD provides an easy and standard way to extend an existing JSON document: through the addition of a context, we can associate keys in a document to Linked Data elements from various vocabularies.

The Web Publication Manifest relies on JSON-LD to provide a context for the metadata object of the manifest.

While JSON-LD is very flexible and allows the context to be defined in-line (local context) or referenced (URI), the Web Publication Manifest restricts context definition strictly to references (URIs) at the top-level of the document.

The Web Publication Manifest defines an initial registry of well-known context documents, which currently includes the following references:

Name | URI | Description | Required?
Default Context | https://readium.org/webpub-manifest/context.jsonld | Default context definition used in every Web Publication Manifest. | Yes

Context documents are all defined and listed in the Context Documents registry.

The Readium Web Publication Manifest has a single requirement in terms of metadata: all publications must include a title.

In addition all publications should include a @type key to describe the nature of the publication.

Example 2: Minimal metadata

"metadata": {
  "@type": "http://schema.org/Book",
  "title": "Test Publication"
}
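The two metadata rules (title is required, @type is recommended) map naturally onto an error and a warning. The sketch below assumes a hypothetical helper named check_metadata that returns both lists; it does not check the JSON-LD context.

```python
def check_metadata(manifest: dict) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for the metadata object of a manifest."""
    errors: list[str] = []
    warnings: list[str] = []

    metadata = manifest.get("metadata")
    if not isinstance(metadata, dict):
        return ["metadata is missing"], warnings

    # "must": every publication includes a title.
    if not metadata.get("title"):
        errors.append("metadata.title is required")

    # "should": @type describes the nature of the publication.
    if "@type" not in metadata:
        warnings.append("metadata.@type is recommended")

    return errors, warnings
```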

2.3. Links

Links are expressed using the links key that contains one or more Link Objects.

A manifest must contain at least one link using the self relationship where href is an absolute URI to the canonical location of the manifest.

Example 3: Link to the canonical location of a manifest

"links": [
  {
    "rel": "self",
    "href": "http://example.org/manifest.json",
    "type": "application/webpub+json"
  }
]

A manifest may also contain other links, such as an alternate link to an EPUB 3.2 version of the publication.

Link relations that are currently used in this specification and its extensions are listed in the Link Relations registry.
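As an illustration of the self requirement, the following sketch looks for a link whose rel includes self and whose href is absolute. The helper names rels_of and find_self_href are assumptions for this example, and "absolute" is approximated here as "has a URI scheme".

```python
from typing import Optional
from urllib.parse import urlsplit


def rels_of(link: dict) -> list[str]:
    """Normalize rel, which may hold one relation or a list of relations."""
    rel = link.get("rel", [])
    return [rel] if isinstance(rel, str) else list(rel)


def find_self_href(manifest: dict) -> Optional[str]:
    """Return the href of the first self link with an absolute URI, if any."""
    for link in manifest.get("links", []):
        if "self" not in rels_of(link):
            continue
        href = link.get("href", "")
        # Approximation: treat any href that carries a scheme as absolute.
        if urlsplit(href).scheme:
            return href
    return None
```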

2.4. The Link Object

The Link Object is a core component of the Readium Web Publication Manifest JSON syntax.

It represents a link to a resource along with a set of metadata associated with that resource.

This specification defines the following keys for this JSON object:

Key | Definition | Format | Required?
href | URI or URI template of the linked resource | URI or URI template | Yes
templated | Indicates that href is a URI template | Boolean, defaults to false | Only when href is a URI template
type | Media type of the linked resource | MIME Media Type | No
title | Title of the linked resource | String | No
rel | Relation between the resource and its containing collection | One or more Link Relations | No
properties | Properties associated to the linked resource | Properties Object | No
height | Height of the linked resource in pixels | Integer | No
width | Width of the linked resource in pixels | Integer | No
duration | Duration of the linked resource in seconds | Float | No
bitrate | Bit rate of the linked resource in kilobits per second | Float | No
language | Expected language of the linked resource | One or more BCP 47 Language Tag | No
alternate | Alternate resources for the linked resource | One or more Link Objects | No
children | Resources that are children of the linked resource, in the context of a given collection role | One or more Link Objects | No
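One way to model the Link Object in code is a small data class that enforces the only required key (href), applies the documented default for templated, and normalizes rel to a list. The class below is a sketch under those assumptions: the name Link, the extra field that collects the remaining keys, and the choice to model only a subset of the keys are all illustrative.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Link:
    href: str
    templated: bool = False          # defaults to false per the table above
    type: Optional[str] = None
    title: Optional[str] = None
    rel: list[str] = field(default_factory=list)
    children: list["Link"] = field(default_factory=list)
    extra: dict[str, Any] = field(default_factory=dict)  # height, width, etc.

    @classmethod
    def from_dict(cls, data: dict) -> "Link":
        if not isinstance(data.get("href"), str):
            raise ValueError("a Link Object requires an href")
        rel = data.get("rel", [])
        known = {"href", "templated", "type", "title", "rel", "children"}
        return cls(
            href=data["href"],
            templated=bool(data.get("templated", False)),
            type=data.get("type"),
            title=data.get("title"),
            # rel holds one or more link relations; store it as a list.
            rel=[rel] if isinstance(rel, str) else list(rel),
            children=[cls.from_dict(child) for child in data.get("children", [])],
            extra={key: value for key, value in data.items() if key not in known},
        )
```

Keeping unknown keys in extra rather than rejecting them leaves room for the extension points described later in this document.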

3. Resources in the Reading Order

The readingOrder of a manifest may contain references to any text, image, video or audio resource that can be opened in a Web browser.

4. Media Type

This specification introduces a dedicated media type value to identify the Readium Web Publication Manifest: application/webpub+json.

All HTTP responses for the manifest must indicate this media type in their headers.

5. Discovering a Manifest

The Readium Web Publication Manifest may be referenced by resources included in its readingOrder or resources using a link.

Such links must include:

  • application/webpub+json as the media type of the manifest
  • manifest as the relation of the link

Example 4: Link in HTML to a manifest

<link href="manifest.json" rel="manifest" type="application/webpub+json">

Example 5: Link in HTTP headers to a manifest

Link: <http://example.org/manifest.json>; rel="manifest";
         type="application/webpub+json"
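A client holding one of the publication's HTML resources could discover the manifest roughly as follows. This sketch uses only the Python standard library; the names ManifestLinkFinder and discover_manifest are assumptions for illustration, and the HTTP header case from Example 5 is not covered.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class ManifestLinkFinder(HTMLParser):
    """Record the href of the first <link> that points to a manifest."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        media_type = (attributes.get("type") or "").strip().lower()
        # Both the relation and the media type must match.
        if "manifest" in rels and media_type == "application/webpub+json":
            self.href = attributes.get("href")


def discover_manifest(html: str, base_url: str) -> Optional[str]:
    """Return the absolute URL of the manifest linked from html, if any."""
    finder = ManifestLinkFinder()
    finder.feed(html)
    return urljoin(base_url, finder.href) if finder.href else None
```

Requiring the media type as well as the relation avoids confusing this manifest with other documents that also use rel="manifest".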

6. Table of Contents

A Readium Web Publication Manifest may contain a reference to a table of contents.

In order to represent a table of contents in the manifest, this specification introduces an additional collection role:

Role | Definition | Compact Collection? | Required?
toc | Identifies the collection that contains a table of contents. | Yes | No

Example 6: Partial TOC for an audiobook

"toc": [
  {
    "href": "track1.mp3#t=71",
    "title": "Part 1 - This World",
    "children": [
      {
        "href": "track1.mp3#t=80",
        "title": "Section 1 - Of the Nature of Flatland"
      },
      {
        "href": "track1.mp3#t=415",
        "title": "Section 2 - Of the Climate and Houses in Flatland"
      },
      {
        "href": "track1.mp3#t=789",
        "title": "Section 3 - Concerning the Inhabitants of Flatland"
      }
    ]
  }
]
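Because children nests Link Objects to an arbitrary depth, a consumer usually walks the toc recursively. The generator below is a minimal sketch; the name walk_toc and the (depth, title, href) tuple shape are choices made for this example.

```python
def walk_toc(entries: list, depth: int = 0):
    """Yield (depth, title, href) for every entry, parents before children."""
    for entry in entries:
        yield depth, entry.get("title"), entry.get("href")
        # Recurse into nested entries, one level deeper.
        yield from walk_toc(entry.get("children", []), depth + 1)
```

For instance, a reading app could indent each title by its depth when rendering the list.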

As a fallback mechanism, a Readium Web Publication Manifest may identify an HTML or XHTML resource in readingOrder or resources as a table of contents using the contents link relation.

Example 7: Reference to an HTML resource containing a TOC

{
  "rel": "contents", 
  "href": "contents.html", 
  "type": "text/html"
}

A User Agent may also rely on the title key included in each Link Object of the readingOrder to extract a minimal table of contents.
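The title-based fallback described above might look like the following sketch. The function name minimal_toc is an assumption, and the HTML contents fallback is deliberately left out because it requires fetching and parsing another resource.

```python
def minimal_toc(manifest: dict) -> list[dict]:
    """Return the toc if present, else one entry per titled readingOrder item."""
    toc = manifest.get("toc")
    if toc:
        return toc
    # Fallback: build a flat list from readingOrder items that carry a title.
    return [
        {"href": link["href"], "title": link["title"]}
        for link in manifest.get("readingOrder", [])
        if link.get("href") and link.get("title")
    ]
```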

The EPUB profile also defines additional collection roles for embedding navigation directly in the manifest.

7. Cover

A Readium Web Publication Manifest may contain a reference to a cover.

Link Objects in readingOrder, resources or links can be identified as such using the cover link relation.

All Link Objects containing the cover link relation must reference an image directly. They should include a height and width to facilitate how they are processed by User Agents.

This specification recommends using one of the following media types: image/jpeg, image/png, image/gif, image/webp or image/svg+xml.

Example 8: Reference to a cover

{
  "rel": "cover", 
  "href": "cover.jpg", 
  "type": "image/jpeg", 
  "height": 600, 
  "width": 400
}
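A User Agent could locate the cover and apply the recommendations above along these lines. This is a sketch: find_covers, cover_warnings and the warning texts are hypothetical names introduced for this example.

```python
RECOMMENDED_COVER_TYPES = {
    "image/jpeg", "image/png", "image/gif", "image/webp", "image/svg+xml",
}


def find_covers(manifest: dict) -> list[dict]:
    """Collect Link Objects carrying the cover relation."""
    covers = []
    for role in ("readingOrder", "resources", "links"):
        for link in manifest.get(role, []):
            rel = link.get("rel", [])
            rels = [rel] if isinstance(rel, str) else rel
            if "cover" in rels:
                covers.append(link)
    return covers


def cover_warnings(link: dict) -> list[str]:
    """Report deviations from the recommendations for cover links."""
    warnings = []
    if link.get("type") not in RECOMMENDED_COVER_TYPES:
        warnings.append("cover media type is not one of the recommended image types")
    if "height" not in link or "width" not in link:
        warnings.append("cover should include height and width")
    return warnings
```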

8. Extensibility

The manifest provides multiple extension points:

In addition to these extension points, this specification defines both a profile registry and a module registry.

The initial registry contains the following profiles:

Name | Description
EPUB Profile | A profile for EPUB content transformed to Web Publications.
Audiobook Profile | A profile for Audiobooks.
Divina Profile | A profile for Digital Visual Narrative publications (comics, manga and bandes dessinées).
PDF Profile | A profile for PDF documents integrated into Web Publications.

9. Packaging

A Readium Web Publication may be distributed unpackaged on the Web, but it may also be packaged for easy distribution as a single file. To achieve this goal, this specification defines the Readium Packaging Format (RPF).

Appendix A. JSON Schema

A JSON Schema is available under version control at https://github.com/readium/webpub-manifest/tree/master/schema

For the purpose of validating a Readium Web Publication Manifest, use the following JSON Schema resource: https://readium.org/webpub-manifest/schema/publication.schema.json

webpub-manifest's People

Contributors

asm0dey, danielweck, dwhodges2, franklefebvre, gkostin1966, hadriengardeur, jaypanoz, jccr, llemeurfr, qnga, vbessonov

webpub-manifest's Issues

TOC entries with no `href` just `title`

When looking at data from the EPUB context, note that the EPUB spec allows headings in the navigation document to be "link-less".

For example:

<ol><li>
    <span>Appendix</span>
    <ol>
      <li>
        <a href="appendix.xhtml#a.1-birds">A.1 Birds</a>
      </li>
      <li>
        <a href="appendix.xhtml#a.2-turtles">A.2 Turtles</a>
      </li>
    </ol>
  </li>
</ol>

How would someone represent this data as an RWPM ToC compact sub-collection, given that its items are Link Objects that require an href?

My only idea right now is to set the href value to an empty string.

The resulting collection would look like this:

"toc": [
  {
    "href": "",
    "title": "Appendix",
    "children": [
      {
        "href": "appendix.xhtml#a.1-birds",
        "title": "A.1 Birds"
      },
      {
        "href": "appendix.xhtml#a.2-turtles",
        "title": "A.2 Turtles"
      }
    ]
  }
]

Readium WebPub manifest instance links to JSON-LD context, what about JSON-Schema?

Should the Readium2 streamer implementations produce an additional link to reference the JSON-Schema?
For example:
https://github.com/readium/webpub-manifest/blob/master/README.md#example
...links via @context to the JSON-LD URI:
https://readium.org/webpub-manifest/context.jsonld
...but we could also use the recommended type and rel for JSON-Schema draft-v7:
https://json-schema.org/draft-07/json-schema-release-notes.html#linking-instances-and-schemas

"self" when packaging a webpub

If delivering a publication as a webpub package (as described here: https://github.com/readium/webpub-manifest#8-package), what should the "self" link be set to? "A manifest must contain at least one link using the self relationship where href is an absolute URI to the canonical location of the manifest", so it cannot be left out (and I'm fine with that). But if the package is served from, say, an ephemeral URL (for security reasons) and then sits on the user's device, what is its self href?

JSON Schema: extensions/epub/properties.schema.json "spread" missing "portrait"?

"spread": {
  "description": "Indicates the condition to be met for the linked resource to be rendered within a synthetic spread",
  "type": "string",
  "enum": [
    "auto",
    "both",
    "none",
    "landscape"
  ]
},

archive.org audio-book webpub manifest, incorrect bitrate JSON type

I'm not sure where to file this issue, @HadrienGardeur could you please advise? The reason I am filing this here is because I am in the process of extending the r2-xxx-js unit tests to include the archive.org OPDS1 feed + (audio) webpubs (in addition to the OPDS2 Feedbooks feed), and this is how I discovered the bitrate JSON error (I am currently using a workaround to eliminate false negatives).

bitrate is a number, not a string:

"bitrate": {
  "description": "Bitrate of the linked resource in kbps",
  "type": "number",
  "exclusiveMinimum": 0
},

Incorrect:

https://api.archivelab.org/books/4thyearanticipations_1812_librivox/opds_audio_manifest

"readingOrder": [
  {
    "bitrate": "113",
    "href": "http://archive.org/download/4thyearanticipations_1812_librivox/4thyearanticipations_00_wells.mp3",
    "title": "00 - Preface",
    "duration": 678,
    "type": "audio/mpeg"
  }
]
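A consumer-side workaround for feeds like this one could coerce string bitrates to numbers before validation. The sketch below is only one possible approach and is not necessarily the workaround mentioned above; the name normalize_bitrate and the decision to drop unparseable values are assumptions.

```python
def normalize_bitrate(link: dict) -> dict:
    """Return a copy of link whose bitrate, if given as a string, is numeric."""
    fixed = dict(link)
    bitrate = fixed.get("bitrate")
    if isinstance(bitrate, str):
        try:
            fixed["bitrate"] = float(bitrate)
        except ValueError:
            # Assumption: an unparseable bitrate is dropped rather than kept.
            del fixed["bitrate"]
    return fixed
```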

Redistributing the RWPM information

Currently the RWPM info is distributed as

  • a main page which gives core info (example, introduction, sub-collections, metadata, links, ToC, cover, extensibility (links to extensions), discovery from its resources, packaging in EPUB, media type)
  • some registries in the root directory (properties of a linked object, relationships, roles)
  • a "contexts" section which contains the definition of metadata properties in the default context.
  • an "extensions" section which contains a mix of profiles (Audiobooks, EPUB, DiViNa) and shared modules (presentation hints).
  • a "schema" section.

My main concern is about the "extensions" section. As we'd like to create new shared modules for transitions, encryption and so forth, we cannot continue mixing that with profiles like Audiobooks and DiViNa.

First proposal is therefore to split "extensions" into 2 sections; "profiles" and "modules".
In "profiles" we'll find EPUB, Audiobooks, DiViNa. PDF, Video could be added later. Profiles may have "levels" (DiViNa has more than 3).
In "modules" we'll find Presentation Hints, Transitions, Encryption ... Some sections of the core page may also become separate modules (ToC ?)

Second proposal is to change the label "Default context" in the core page, metadata section, to something more readable (implementers don't care about JSON-LD). Will there be several JSON-LD contexts? I doubt it.

Third proposal is to better highlight profiles in the core page: they are currently lost in the center of the page.

TOC may contain entries with text but no link

Metadata links (EPUB parsing)

How should we parse <link rel="cc:license" href="http://creativecommons.org/licenses/by-sa/3.0/"/> (for example).

What is the JSON Schema equivalent?
https://github.com/readium/webpub-manifest/blob/master/schema/metadata.schema.json

Complete OPF example (note the <link rel="cc:license" refines="#cover" ... />):
https://github.com/IDPF/epub3-samples/blob/master/30/wasteland/EPUB/wasteland.opf

<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="uid" xml:lang="en-US" prefix="cc: http://creativecommons.org/ns#">
    <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:identifier id="uid">code.google.com.epub-samples.wasteland-basic</dc:identifier>
        <dc:title>The Waste Land</dc:title>
        <dc:creator>T.S. Eliot</dc:creator>
        <dc:language>en-US</dc:language>
        <dc:date>2011-09-01</dc:date>
        <meta property="dcterms:modified">2012-01-18T12:47:00Z</meta>
        <!-- rights expressions for the work as a whole -->
        <dc:rights>This work is shared with the public using the Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) license.</dc:rights>        
        <link rel="cc:license" href="http://creativecommons.org/licenses/by-sa/3.0/"/>
        <meta property="cc:attributionURL">http://code.google.com/p/epub-samples/</meta>
        <!-- rights expression for the cover image -->       
        <link rel="cc:license" refines="#cover" href="http://creativecommons.org/licenses/by-sa/3.0/" />
        <link rel="cc:attributionURL" refines="#cover" href="http://en.wikipedia.org/wiki/Simon_Fieldhouse" />        
        <!-- cover meta element included for 2.0 reading system compatibility: -->
        <meta name="cover" content="cover"/>
    </metadata> 
    <manifest>
        <item id="t1" href="wasteland-content.xhtml" media-type="application/xhtml+xml" />
        <item id="nav" href="wasteland-nav.xhtml" properties="nav" media-type="application/xhtml+xml" />
        <item id="cover" href="wasteland-cover.jpg" media-type="image/jpeg" properties="cover-image" />
        <item id="css" href="wasteland.css" media-type="text/css" />
        <item id="css-night" href="wasteland-night.css" media-type="text/css" />
        <!-- ncx included for 2.0 reading system compatibility: -->
        <item id="ncx" href="wasteland.ncx" media-type="application/x-dtbncx+xml" />
    </manifest>
    <spine toc="ncx">
        <itemref idref="t1" />        
    </spine>    
</package>

Related issue: readium/r2-shared-js#16

Add a ToC section

A ToC (Table of Contents) section has been added in the Readium-2 software, but there is no mention of it in this specification.

Its source is an EPUB2 NCX file or an EPUB3 Navigation document.

Text direction for literals

In our current version of the manifest, we don't have the ability to indicate the direction of literals.

There have been many discussions in the past about this within W3C, but it seems that things are finally moving in the right direction: w3c/pub-manifest#75

If JSON-LD 1.1 introduces @direction:

  • this will become immediately usable for us in metadata
  • we'll need to explore how @direction behaves in language maps, which we use extensively for indicating language variants of a given literal
  • the use of @value and @direction instead of a literal could affect how metadata are parsed and processed, so we'll need to be extra careful about that

Since we're already using JSON-LD 1.1, we won't need to bump our version.

EPUB equivalent of "letterer"

The webpub schema has a "letterer", "inker", and "penciler" field in the metadata, seen here: https://github.com/readium/webpub-manifest/blob/master/schema/metadata.schema.json#L96 . What are the equivalent contributor role MARC codes? I do not see them implemented in the two most up-to-date streamers:
https://github.com/readium/r2-streamer-swift/blob/develop/Sources/parser/EPUB/MetadataParser.swift#L260
https://github.com/readium/r2-streamer-kotlin/blob/develop/r2-streamer/src/main/java/org/readium/r2/streamer/parser/epub/MetadataParser.kt#L132

I ask because I am writing an application that is generating an EPUB3 based on WebPub metadata, and have a "letterer"

consider adding multilanguage support to sort_as field

Currently, the author field supports multiple languages, but the sort_as subfield does not:

"author": [{
  "name": {
    "ru": "Эдвин Эбботт Эбботт",
    "en": "Edwin Abbott Abbott"
  },
  "sort_as": "Abbott, Edwin Abbott"
}],

I can see uses for adding pre-calculated sorting-formatted names in multiple languages to book manifests.

EPUB parsing problem with language maps for title strings

Concrete example:
https://github.com/IDPF/epub3-samples/blob/0b3e80553f509fbdfe591008cd7d1e804b24db54/30/regime-anticancer-arabic/EPUB/package.opf#L8-L11

Because ar (Arabic) is already used by the meta refines, the default dc:title expressed in French cannot also take the ar key in the language map (JSON object), as per the dc:language.

The current buggy r2-shared-js implementation generates (note the missing French dc:title):

        "title": {
            "ar": "السرطان من  للوقاية الصحيح الغذائي  النظام"
        },

I have fixed this bug by generating a fictitious language code placeholder / object key _:

        "title": {
            "_": "Le Vrai Régime anti-cancer",
            "ar": "السرطان من  للوقاية الصحيح الغذائي  النظام"
        },
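The placeholder approach can be sketched as follows. This is an illustration of the idea only, not the actual r2-shared-js code; the function name build_title_map and its parameters are hypothetical.

```python
def build_title_map(default_title: str, publication_language: str,
                    localized: dict[str, str]) -> dict[str, str]:
    """Merge an untagged default title into a language map of localized titles."""
    title_map = dict(localized)
    # Use the publication language as the key unless a localized title
    # already occupies it; in that case fall back to the "_" placeholder.
    key = publication_language if publication_language not in title_map else "_"
    title_map[key] = default_title
    return title_map
```

Note that a "_" key would not match the BCP 47 pattern of the JSON Schema quoted further down in this issue, so a manifest produced this way may fail validation.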

Any other suggestions?

@JayPanoz is this included in the parser doc? (sorry, I cannot find the link anymore)

JSON Schema:

"title": {
"anyOf": [
{
"type": "string"
},
{
"description": "The language in a language map must be a valid BCP 47 tag.",
"type": "object",
"patternProperties": {
"^((?<grandfathered>(en-GB-oed|i-ami|i-bnn|i-default|i-enochian|i-hak|i-klingon|i-lux|i-mingo|i-navajo|i-pwn|i-tao|i-tay|i-tsu|sgn-BE-FR|sgn-BE-NL|sgn-CH-DE)|(art-lojban|cel-gaulish|no-bok|no-nyn|zh-guoyu|zh-hakka|zh-min|zh-min-nan|zh-xiang))|((?<language>([A-Za-z]{2,3}(-(?<extlang>[A-Za-z]{3}(-[A-Za-z]{3}){0,2}))?)|[A-Za-z]{4}|[A-Za-z]{5,8})(-(?<script>[A-Za-z]{4}))?(-(?<region>[A-Za-z]{2}|[0-9]{3}))?(-(?<variant>[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*(-(?<extension>[0-9A-WY-Za-wy-z](-[A-Za-z0-9]{2,8})+))*(-(?<privateUse>x(-[A-Za-z0-9]{1,8})+))?)|(?<privateUse2>x(-[A-Za-z0-9]{1,8})+))$": {
"type": "string"
}
},
"additionalProperties": false,
"minProperties": 1
}
]
},
"subtitle": {
"anyOf": [
{
"type": "string"
},
{
"description": "The language in a language map must be a valid BCP 47 tag.",
"type": "object",
"patternProperties": {
"^((?<grandfathered>(en-GB-oed|i-ami|i-bnn|i-default|i-enochian|i-hak|i-klingon|i-lux|i-mingo|i-navajo|i-pwn|i-tao|i-tay|i-tsu|sgn-BE-FR|sgn-BE-NL|sgn-CH-DE)|(art-lojban|cel-gaulish|no-bok|no-nyn|zh-guoyu|zh-hakka|zh-min|zh-min-nan|zh-xiang))|((?<language>([A-Za-z]{2,3}(-(?<extlang>[A-Za-z]{3}(-[A-Za-z]{3}){0,2}))?)|[A-Za-z]{4}|[A-Za-z]{5,8})(-(?<script>[A-Za-z]{4}))?(-(?<region>[A-Za-z]{2}|[0-9]{3}))?(-(?<variant>[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*(-(?<extension>[0-9A-WY-Za-wy-z](-[A-Za-z0-9]{2,8})+))*(-(?<privateUse>x(-[A-Za-z0-9]{1,8})+))?)|(?<privateUse2>x(-[A-Za-z0-9]{1,8})+))$": {
"type": "string"
}
},
"additionalProperties": false,
"minProperties": 1
}
]
},

Host JSON-LD context documents on readium.org

The Readium Web Publication Manifest is based on JSON-LD and currently references two external documents, both hosted on the readium.org website.
These URIs currently return a 404.

Could we add a few routes and static documents to readium.org?

cc @rkwright @whmccoy

Should a Web Publication be styleable?

I'm spawning this issue from this conversation about non-linear resources.

Currently, only reflowable EPUB supports overriding CSS styles in Readium. We assumed that Web Publications can't be styled appropriately, since layouts are much more diverse on the web. However, maybe there's a need for book-like publications with WebPub:

  • RWPM is much more lightweight than EPUB to author.
  • Web Publications can be hosted remotely.

There's also some interest:

I am converting some HTML files to hopefully be read by Readium. It would be easier to read and write a single JSON file rather than the several XML files that are required for a valid epub
https://readium.slack.com/archives/C0AN4MN1J/p1585608890006400

Pagination could already be handled with the paginated overflow in the Presentation Hints. However, we don't have any way to authorize CSS styling in the RWPM.

A few questions remain:

  • Should CSS styling be implicitly disabled? This makes sense for packaged websites.
  • How does this impact or enhance accessibility?
  • Which settings granularity should we support?
    • Should the author enable each individual setting (font size, colors, text alignment, etc.)?
    • Or should we define broad rendering-type enums to define what kind of customization makes sense (webpage, book, comic, etc.)?

Divina compliance - "transitions"

In the divina extension document, under level 1 compliance, "Support for transitions" is a requirement.

What are "transitions"? The term isn't hyperlinked, Ctrl + F led to no results, and it is the only occurrence in the repository according to GitHub's search.

DiViNa and accessibility

I'll scope this to the DiViNa format but it may be generalized.

For making DiViNa publications accessible, I'm thinking about the following use cases:

a/ the comic is a turbomedia. Each image can be accompanied by a text or audio. This text can then be read in synthesized voice by the reading app, audio can be played directly. The audio starts when the user reaches the image and stops when the user moves to another image.
b/ the comic is a webtoon. The audio (synthesized or not) starts and stops at certain visual points in the continuous visual narrative.
c/ the comic is a traditional board with guided navigation (each box and even each bubble can be isolated by a rectangular shape, placed in sequence). Text or audio is associated with each rectangular box defined by the guided navigation. The audio starts when the user reaches the box and stops when the user moves to another box.

Do you foresee other useful use cases?

Notes:
We've defined guided navigation as a collection of {href, title}, out of the reading order.

In parallel, we'll reuse in webpub the sync-narration defined by the W3C, which associates a recursive narration json object made of {text, audio}, where text isolates a segment of html and audio a segment of audio in any resource of the publication.

None of these structures is currently able to fulfill these needs.

EPUB Module: Landmarks

How should someone try to understand Landmarks in the model?

(refresher: http://kb.daisy.org/publishing/docs/navigation/landmarks.html)

Two ideas:

1. Should we map epub:type enum values into the definition of Links in the landmarks (and friends) subcollections?

It could be that we use rel for epub:type values. Here's my proposed solution:

  "landmarks": [
    {
      "rel": "cover",
      "href": "cover.xhtml",
      "title": "Cover"
    },
    {
      "rel": "titlepage",
      "href": "copyright.xhtml#title",
      "title": "Title Page"
    },
    {
      "rel": "toc",
      "href": "tox.xhtml#TOC",
      "title": "Table Of Contents"
    },
    {
      "rel": "bodymatter",
      "href": "chapter1.xhtml",
      "title": "Start Of Content"
    }
  ]

Note that we already have cover as a rel: https://readium.org/webpub-manifest/relationships.html#:~:text=cover

2. Should we just point to the landmarks nav element in the model, and the processing happens outside of the model?

What would that look like? I believe this is how I understood your alternative, @HadrienGardeur.

Use Cases

The DAISY Knowledgebase article highlights this use case:

Not only does the landmarks nav simplify access to major sections of the publication, without having to navigate the entire table of contents, but it also facilitates user agent behaviors. A device that gives the option to automatically open to the first page of the body, for example, or provides quick links to the index or glossary can make use of the extra semantics in the landmarks nav to this end.

My use case is that I want a model (implemented in code) to read this information and expose it as a Publication model instance. My users are consumers of this, let's say via an API, which could be using this information to understand the major sections of a book.

Publish a note on accessibility

As part of the review of our work, we need to publish a note on accessibility that will cover how various elements from schema.org can be used for this purpose.

This note should also cover:

  • recommendations for title (and potentially description) in the readingOrder and guided (for DiViNa)
  • links to accessibility reports
  • mapping from EPUB 3.x to RWPM

Description for Link Objects

Can we set a description (not a title) for each audio track?

Currently, we can set a description for the audiobook in the metadata element, but not for individual tracks.

Readium WebPub manifest JSON-LD and JSON-Schema, URI/URL stability, versioning / revisioning

In the "Readium desktop" app, a backend database is used to store WebPubManifest and OPDS models, so @clebeaupin has been wondering how to detect potentially breaking changes (such as spine vs. readingOrder) using URI comparisons to differentiate model revisions (no need for semantic versioning overkill, unless we envision many updates to the model definitions).

Right now, how "finalized" is the JSON-LD context? I imagine it is pretty stable now, but I am wondering about "multilingual strings" etc. (in reference to the design details that are moving targets in W3C Web Publications)
Same question about JSON-Schema.

Related issue: #10

Improve JSON Schema

Our first draft for our JSON Schema is missing the following things:

  • Validate the presence of at least one self link
  • Validate that templated is set to true when a URI template is used in a link object
  • Validate the presence of type for readingOrder and resources items
  • Validation for languages (in language maps or in language)
  • Validate extensibility based on collection model
  • Validation for properties
  • Validation for subject
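Until the schema covers these cases, the first three checks can be approximated in code. The following is only a sketch: `check_manifest` is a hypothetical helper, and detecting a URI template by the presence of `{` in `href` is a simplifying assumption.

```python
def check_manifest(manifest: dict) -> list:
    """Return a list of human-readable problems found in a manifest."""
    errors = []

    def rels(link: dict) -> list:
        # Assumption: "rel" is either one string or a list of strings.
        value = link.get("rel", [])
        return [value] if isinstance(value, str) else list(value)

    # 1. At least one link with rel "self".
    if not any("self" in rels(link) for link in manifest.get("links", [])):
        errors.append("links: no link with rel 'self'")

    for name in ("links", "readingOrder", "resources"):
        for index, link in enumerate(manifest.get(name, [])):
            href = link.get("href", "")
            # 2. A templated href must be flagged with "templated": true.
            if "{" in href and link.get("templated") is not True:
                errors.append(f"{name}[{index}]: URI template without templated=true")
            # 3. readingOrder and resources items must declare a type.
            if name != "links" and "type" not in link:
                errors.append(f"{name}[{index}]: missing type")
    return errors
```

Calling `check_manifest` on a parsed manifest returns an empty list when none of these three problems is present.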

Remove roles and MARC relators from contributor

This is a placeholder while I'm working on a full proposal.

We're considering deprecating the role element of the contributor element in favor of:

  • using other schema.org elements instead
  • using our extensibility (URIs)

Sorting keys are language dependent

Currently, RWPM supports sortAs in subjects, titles and contributors independently of their localized names. But a sorting key is in fact language-dependent and should be supported as such.

I think title, for example, should be used as follows:

"title": {
  "en": "Around the World in Eighty Days",
  "fr": {
    "name": "Le Tour du monde en quatre-vingts jours",
    "sortAs": "Tour du monde en quatre-vingts jours"
  }
}

Or

"title": {
  "name": "Le Tour du monde en quatre-vingts jours",
  "sortAs": "Tour du monde en quatre-vingts jours, Le"
}

Or

"title": {
  "name": "Around the World in Eighty Days"
}

In the Kotlin app, everything is ready for that. We have a LocalizedString object that contains Translation objects, each of which may contain a sorting key besides the canonical string.
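As an illustration of the proposal, the three shapes above (plus a plain string) can be normalized into one structure. This is a sketch in Python rather than the Kotlin implementation mentioned above; `normalize_title` and the use of "und" for an unspecified language are assumptions.

```python
from typing import Dict, Optional, Tuple

Translation = Tuple[str, Optional[str]]  # (name, sortAs)
UNDEFINED = "und"  # assumed key for a title without a language


def _translation(value) -> Translation:
    """Turn a string or a {"name", "sortAs"} object into a Translation."""
    if isinstance(value, str):
        return (value, None)
    return (value["name"], value.get("sortAs"))


def normalize_title(title) -> Dict[str, Translation]:
    """Map every proposed title shape to {language: (name, sortAs)}."""
    # A plain string or an object with "name" carries no language.
    # This treats "name" as a reserved key, never as a language tag.
    if isinstance(title, str) or "name" in title:
        return {UNDEFINED: _translation(title)}
    # Otherwise the object is a language map whose values are either
    # strings or {"name", "sortAs"} objects.
    return {lang: _translation(value) for lang, value in title.items()}
```

With this shape, each language keeps its own optional sorting key.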

[Divina] Have Link Object be the default format (rather than a URI) each time a file is expected

At this stage, the transitions module lists 2 Transition Object keys whose values can contain URIs: file (for a video transition) and sequence (for a sequence transition) - file is a URI while sequence is an array of URIs.

To be consistent with the rest of the specifications, and since providing additional information like alternate resources or a fit value could be useful, I'd recommend changing the expected format to Link Object for file, and array of Link Objects for sequence.

(Note that the alternate array is itself an array of Link Objects)
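A parser could accept both the current and the recommended format during a transition period. The sketch below is one possible approach; `to_link` and `upgrade_transition` are hypothetical names, and wrapping a bare URI as `{"href": ...}` is an assumption about the minimal Link Object.

```python
def to_link(value):
    """Wrap a bare URI string into a minimal Link Object."""
    if isinstance(value, str):
        return {"href": value}
    return value  # already a Link Object


def upgrade_transition(transition: dict) -> dict:
    """Return a copy of a Transition Object that uses Link Objects."""
    upgraded = dict(transition)
    if "file" in upgraded:
        upgraded["file"] = to_link(upgraded["file"])
    if "sequence" in upgraded:
        upgraded["sequence"] = [to_link(item) for item in upgraded["sequence"]]
    return upgraded
```

Values that are already Link Objects pass through unchanged, so the function can be applied to either format.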

JSON Schema - please clarify usage of publication subcollection

Hello @HadrienGardeur could you please provide an example of how the recursive additionalProperties / subcollection.schema.json gets "expanded" into a publication object? I am having a hard time translating this part of the schema into workable (TypeScript) code, especially with the recursively-introduced metadata and links data fields, and the array-of-objects/single-object/array-of-link "union type" design approach which creates several possible structural variants in the data model. Many thanks!

"additionalProperties": {
  "$ref": "subcollection.schema.json"
},

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://readium.org/webpub-manifest/schema/subcollection.schema.json",
  "title": "Subcollection",
  "anyOf": [
    {
      "type": "object",
      "properties": {
        "metadata": {
          "type": "object"
        },
        "links": {
          "type": "array",
          "items": {
            "$ref": "link.schema.json"
          }
        },
        "additionalProperties": {
          "$ref": "subcollection.schema.json"
        }
      },
      "required": [
        "metadata",
        "links"
      ]
    },
    {
      "type": "array",
      "items": {
        "anyOf": [
          {
            "$ref": "link.schema.json"
          },
          {
            "properties": {
              "metadata": {
                "type": "object"
              },
              "links": {
                "type": "array",
                "items": {
                  "$ref": "link.schema.json"
                }
              }
            },
            "additionalProperties": {
              "$ref": "subcollection.schema.json"
            },
            "required": [
              "metadata",
              "links"
            ]
          }
        ]
      }
    }
  ]
}
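One way to tame the structural variants is to normalize every subcollection value into a list of collections before building the model. The sketch below is in Python rather than TypeScript and is not taken from any Readium implementation; `parse_subcollection` is a hypothetical name, and an array item is treated as a collection only when it has both required keys from the schema above.

```python
def parse_subcollection(value) -> list:
    """Normalize any subcollection variant into a list of collections.

    Each collection is a dict with "metadata", "links" and
    "subcollections" (itself a dict of role -> list of collections).
    """

    def is_collection(item) -> bool:
        # Both keys are required by the schema for a full collection.
        return isinstance(item, dict) and "metadata" in item and "links" in item

    def parse_object(obj: dict) -> dict:
        return {
            "metadata": obj["metadata"],
            "links": obj["links"],
            # Every other key is a nested subcollection (recursion).
            "subcollections": {
                key: parse_subcollection(child)
                for key, child in obj.items()
                if key not in ("metadata", "links")
            },
        }

    # Variant 1: a single object.
    if isinstance(value, dict):
        return [parse_object(value)]

    # Variants 2 and 3: an array of Link Objects and/or full collections.
    links = [item for item in value if not is_collection(item)]
    collections = [parse_object(item) for item in value if is_collection(item)]
    if links:
        # Bare links are grouped into one collection with empty metadata.
        collections.insert(0, {"metadata": {}, "links": links, "subcollections": {}})
    return collections
```

After normalization, the rest of the code only has to deal with one shape, at the cost of losing the distinction between a single object and a one-item array.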

JSON Schema - metadata.subject?

metadata.subject[].(name|code|scheme|sortAs)

Currently parsed from EPUB (OPF XML):

metadata.subject.name = OPF /package/metadata/subject/text()
metadata.subject.code = OPF /package/metadata/subject/@term
metadata.subject.scheme = OPF /package/metadata/subject/@authority
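The mapping above can be sketched with the Python standard library. This is an illustration, not the parser referred to in this issue: `parse_subjects` is a hypothetical name, the paths are assumed to refer to the usual OPF and Dublin Core namespaces, and the sample document is made up.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces for /package/metadata and its subject elements.
NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <metadata>
    <dc:subject term="FIC000000" authority="BISAC">Fiction</dc:subject>
    <dc:subject>Whaling</dc:subject>
  </metadata>
</package>"""


def parse_subjects(opf_xml: str) -> list:
    """Build metadata.subject entries from an OPF document."""
    root = ET.fromstring(opf_xml)  # root is the <package> element
    subjects = []
    for element in root.findall("opf:metadata/dc:subject", NS):
        name = (element.text or "").strip()
        if not name:
            continue  # skip empty subject elements
        subject = {"name": name}
        if element.get("term"):
            subject["code"] = element.get("term")
        if element.get("authority"):
            subject["scheme"] = element.get("authority")
        subjects.append(subject)
    return subjects


print(parse_subjects(SAMPLE_OPF))
```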

Adding information about the profile in the manifest

Currently, RWPM doesn't contain any information about its profile (e.g. Audiobook, DiViNa...), but we need this information for example to figure out which Navigator to use.

At the moment, we're using a different heuristic for each profile:

  • Audiobook: has http://schema.org/Audiobook for its @type property, or contains only resources with an audio type in its reading order.
  • DiViNa: contains only resources with a bitmap type in its reading order.
  • LCPDF: contains only resources with a PDF type in its reading order.

This can be a source of errors, and it would be useful to have the profile indicated in the manifest itself instead. Any thoughts on what it could look like? A conformsTo property, for example.
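For reference, the heuristics listed above can be written roughly as follows. This is a sketch, not the code used in any Readium toolkit: `guess_profile`, the returned labels and the set of bitmap media types are assumptions.

```python
from typing import Optional

# Assumed list of bitmap media types; the real list may differ.
BITMAP_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def guess_profile(manifest: dict) -> Optional[str]:
    """Guess the profile of a manifest, or return None if unknown."""
    metadata = manifest.get("metadata", {})
    if metadata.get("@type") == "http://schema.org/Audiobook":
        return "audiobook"

    types = [link.get("type", "") for link in manifest.get("readingOrder", [])]
    if not types:
        return None  # an empty reading order matches nothing
    if all(t.startswith("audio/") for t in types):
        return "audiobook"
    if all(t in BITMAP_TYPES for t in types):
        return "divina"
    if all(t == "application/pdf" for t in types):
        return "lcpdf"
    return None
```

A single resource with a missing or unexpected type is enough to make the guess fail, which illustrates why an explicit declaration in the manifest would be more robust.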

Handling encrypted content

In EPUB, encrypted content is declared in encryption.xml with the following information (as of EPUB 3.1):

  • algorithm used to encrypt the resource
  • how the resource is stored in the ZIP container (Store or deflate)
  • original size of the resource

In addition to this information, other relevant details are missing:

  • identifier for a specific DRM scheme
  • profile for a specific DRM (very useful for LCP)

In this new manifest, using the Properties Object could be a good fit to provide that info.
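To give an idea of what this could look like, the sketch below builds a Link Object that carries the five pieces of information listed above. Everything here is hypothetical: the `encrypted_link` helper, the `encrypted` key and the names of its members are placeholders for discussion, not an agreed design.

```python
from typing import Optional


def encrypted_link(
    href: str,
    media_type: str,
    algorithm: str,
    compression: str,
    original_length: int,
    scheme: Optional[str] = None,
    profile: Optional[str] = None,
) -> dict:
    """Build a Link Object with encryption info in its Properties Object."""
    encrypted = {
        "algorithm": algorithm,             # algorithm used to encrypt the resource
        "compression": compression,         # how the resource is stored in the ZIP
        "originalLength": original_length,  # size before compression and encryption
    }
    # The DRM identifiers are optional: not every scheme defines a profile.
    if scheme is not None:
        encrypted["scheme"] = scheme
    if profile is not None:
        encrypted["profile"] = profile
    return {"href": href, "type": media_type, "properties": {"encrypted": encrypted}}


link = encrypted_link(
    "c001.html",
    "text/html",
    algorithm="http://www.w3.org/2001/04/xmlenc#aes256-cbc",
    compression="deflate",
    original_length=13810,  # made-up value
)
```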

[Divina] A more intuitive order for listing presentation hints

I recommend changing the order in which presentation hints are, well, presented... since this could facilitate understanding, in my opinion:

  • continuous (-> high level, linked to story type)
  • fit (-> technically the first operation applied to a given resource: "try to make it fit in the viewport according to its stated fit")
  • overflow (-> "once the resource fits in the viewport, if parts of it lie outside the viewport, decide whether those parts should be made accessible - through scroll or pagination - or not")
  • clipped (-> "you know what, actually clip (or not) all resource parts outside the viewport, whatever the overflow")
  • spread (-> "allow a double-page reading mode")
  • orientation (-> "block reading in only one orientation or not")

Going one step further, I recommend that the properties of a Link Object that are relevant to the divina case be presented in an order consistent with the former:

  • fit (overrides the story fit)
  • clipped (overrides the story clipped)
  • page (whether the resource is a left, center or right page in a double page)
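The override relationship between the two lists can be sketched as a simple lookup. The function name is hypothetical, and the sketch assumes that story-level hints live under `metadata.presentation` and resource-level overrides under the `properties` of the Link Object.

```python
def effective_hint(name: str, link: dict, metadata: dict):
    """Return the hint that applies to a resource, or None if unset."""
    properties = link.get("properties", {})
    if name in properties:
        return properties[name]  # the Link Object overrides the story
    return metadata.get("presentation", {}).get(name)


story = {"presentation": {"fit": "contain", "clipped": False}}
page = {"href": "page1.jpg", "properties": {"fit": "cover"}}

print(effective_hint("fit", page, story))
print(effective_hint("clipped", page, story))
```

Here the resource-level `fit` wins, while `clipped` falls back to the story-level value.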

JSON Schema - metadata.rendition?

metadata.rendition.(layout|orientation|overflow)

Currently parsed from EPUB (OPF XML):

metadata.rendition.layout = OPF /package/metadata/meta/ @property ("rendition:layout") + text() ("pre-paginated" becomes "fixed", but otherwise "reflowable" remains)

metadata.rendition.orientation = OPF /package/metadata/meta/ @property ("rendition:orientation") + text()

metadata.rendition.overflow = OPF /package/metadata/meta/ @property ("rendition:flow") + text()
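The mapping above, including the "pre-paginated" to "fixed" conversion, can be sketched as follows. `parse_rendition` is a hypothetical name and the sketch assumes that the meta elements are in the usual OPF namespace.

```python
import xml.etree.ElementTree as ET

NS = {"opf": "http://www.idpf.org/2007/opf"}  # assumed OPF namespace

# OPF meta property -> key under metadata.rendition
PROPERTY_TO_KEY = {
    "rendition:layout": "layout",
    "rendition:orientation": "orientation",
    "rendition:flow": "overflow",
}


def parse_rendition(opf_xml: str) -> dict:
    """Build metadata.rendition from the meta elements of an OPF document."""
    root = ET.fromstring(opf_xml)  # root is the <package> element
    rendition = {}
    for meta in root.findall("opf:metadata/opf:meta", NS):
        key = PROPERTY_TO_KEY.get(meta.get("property", ""))
        value = (meta.text or "").strip()
        if key is None or not value:
            continue  # unrelated or empty meta element
        if key == "layout" and value == "pre-paginated":
            value = "fixed"  # "reflowable" is kept as-is
        rendition[key] = value
    return rendition
```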

Default language for, say, metadata.title?

Given I have a publication with a title in metadata like this:

{
  "metadata": {
    "title": {
      "fr": "Vingt mille lieues sous les mers",
      "en": "Twenty Thousand Leagues Under the Sea",
      "ja": "海底二万里"
    }
  }
}

What would the default language be if all I want is any string, without a localization preference? Would it be the first in the "list", i.e. the value of "fr"?

If so, the order of the keys might be a problem.
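To frame the discussion, here is one possible fallback strategy. Nothing in it is defined by the manifest format: `pick_title`, the order of the fallbacks and the use of "und" are all assumptions.

```python
def pick_title(metadata: dict, preferred=()):
    """Pick one title string, trying the most specific choice first."""
    title = metadata.get("title")
    if title is None or isinstance(title, str):
        return title  # nothing to choose from

    # 1. Caller preferences, 2. the publication language(s), 3. "und".
    candidates = list(preferred)
    language = metadata.get("language")
    if isinstance(language, str):
        candidates.append(language)
    elif language:
        candidates.extend(language)
    candidates.append("und")

    for lang in candidates:
        if lang in title:
            return title[lang]

    # Last resort: the first key. Python's json module keeps document
    # order, but JSON objects are unordered, so other parsers may differ.
    return next(iter(title.values()), None)
```

Relying on the publication's own language before falling back to key order avoids most of the ordering problem, as long as the manifest declares a language.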

Direction of members

Our current model lacks a way to express the direction of our metadata members.

Ideally we would need the ability to:

  • set a default direction, that applies to all metadata members
  • set a direction per member

It's worth noting that this has a pretty big impact on our model:

  • we're using JSON-LD and there's no explicit support for this
  • RDF has no support for direction either
  • we won't be able to use a language map anymore if we add support for direction per member
