podcast-namespace's Issues

max string lengths

I'm putting the DB schema into Podcastindex right now to support these tags, and it's clear we need to define string lengths in the tag definitions for those that may be free-form. The ones like URLs and email addresses are fine. The others, like the <podcast:location> locale string, need some guidance. I will start adding some suggested max lengths. If anyone else sees a need somewhere, please do that too.

<podcast:host> disambiguation

<podcast:host>[A person's name]</podcast:host>

In the UK, there's a very famous radio presenter (who does a podcast) called Chris Evans.
In the US, there's a very famous actor (who does a podcast) called Chris Evans.

These are not the same person.

The schema entry for a person is complicated, but allows for additional information like images, links to Wikipedia or Twitter (which are both unique IDs of a sort), and other elements.

Can/should a podcast:host be a collection of different attributes about a person? It seems that just using a name might be a little less good here. URLs and images, perhaps?

Secondly - this could well be a spammy entry, and we might need to consider how we stop every podcast claiming that they have Joe Rogan as a guest every week.

<podcast:transcript> rel attribute's necessity is unclear

The rel="captions" attribute on <podcast:transcript> seems to be redundant. From the spec:

If the rel="captions" attribute is present, the linked file is considered to be a closed captions file, regardless of what the mime type is.

This seems to imply rel="captions" is shorthand for type="application/srt", though it only introduces ambiguity. I would strongly suggest removing it unless a clear need for it presents itself.

<podcast:funding>

I would like to suggest the inclusion of a funding tag that could be added multiple times.
It should be added under channel.

I suggest the following format:

<podcast:funding platform="[platform slug]" url="[url for the show at the platform]" name="[platform name]">[podcast handle at the platform]</podcast:funding>

example: <podcast:funding platform="patreon" url="https://patreon.com/somecast" name="Patreon">somecast</podcast:funding>

There is some redundancy, but it is intentional, the platform name and url are present so that clients can show information and navigate to it without knowing anything about the platform. The platform slug and podcast handle at the platform would allow for a better integration in case of existing APIs.

Additionally, this tag could support funding with cryptocurrencies using the same format. Example: <podcast:funding platform="wallet.XBT" text="Bitcoin">1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2</podcast:funding>

Reference: https://github.com/socialrss/socialrss/blob/master/DOCUMENTATION.md

proposal: <podcast:images> for multiple image sets

I don't like current tags at all. They feel wrong. Clearly there is a need for alternate image sizing. But, I'm thinking we make it operate more like "alternateEnclosure" and bring in the concept of a srcset like html images:

<podcast:alternateImage 
  srcset="https://podcastindex.org/images/pci_avatar.jpg 1500w,
          https://podcastindex.org/images/pci_avatar.jpg 600w,
          https://podcastindex.org/images/pci_avatar.jpg 300w,
          https://podcastindex.org/images/pci_avatar.jpg 150w" />

That reduces to a single tag and is very compact. Would this be workable?
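To get a feel for how workable this is on the consuming side, here is a minimal Python sketch of parsing such a srcset value and picking a size. The function names are made up for illustration, and it only handles width descriptors ("600w"); it also assumes the image URLs contain no commas, which a real parser would need to handle.

```python
from typing import List, Optional, Tuple


def parse_srcset(srcset: str) -> List[Tuple[str, int]]:
    """Split an HTML-style srcset string into (url, width_in_pixels) pairs."""
    candidates = []
    for entry in srcset.split(","):
        parts = entry.split()
        # Expect exactly "<url> <number>w"; skip anything else rather than guess.
        if len(parts) != 2 or not parts[1].endswith("w") or not parts[1][:-1].isdigit():
            continue
        candidates.append((parts[0], int(parts[1][:-1])))
    return candidates


def pick_image(srcset: str, wanted_width: int) -> Optional[str]:
    """Return the smallest image at least wanted_width wide, else the largest one."""
    candidates = sorted(parse_srcset(srcset), key=lambda c: c[1])
    if not candidates:
        return None
    for url, width in candidates:
        if width >= wanted_width:
            return url
    return candidates[-1][0]
```

An app that needs a 500px tile would call pick_image(srcset, 500) and get the 600w entry from a list like the one above.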

<podcast:location> Web scale IDs

I like the use of a country code, but would propose to firm that up to a standard.

Some podcasts are "from" a big city; others from a suburb in that city; others from an actual place. A lat/lon doesn't tell us anything about the place being specified, nor its bounding area.

My suggestion

<podcast:location osm_id="[place ID]">[CountryCode|Locality]</podcast:location> - The ISO 3166-1 alpha-2 country code; a pipe as separator; then a human-readable place name as preferred by the publisher. The (mandatory) attribute osm_id is the OpenStreetMap ID for that place, using OpenStreetMap's API.

The use of OpenStreetMap's ID here - OSM is a well-recognised, open and free database - would allow you to programmatically get lat/lons if required; show place names in other languages, grab a rough bounding box to aid in searches, and understand the place type. (Use the osm_id and not the place_id).

Here is the result of a programmatic call for osm_id 182706, which includes:

[{"osm_id":182706,
"boundingbox":["39.8086936","40.157272","-83.2101797","-82.7713119"],
"lat":"39.9622601","lon":"-83.0007065",
"display_name":"Columbus, Franklin County, Ohio, United States of America",
"class":"boundary","type":"administrative","importance":0.6294399106820546,
"address":{"city":"Columbus","county":"Franklin County",
   "state":"Ohio","country":"United States of America","country_code":"us"}}]

Valid examples

<podcast:location osm_id="182706">[US|Columbus, Ohio]</podcast:location>
<podcast:location osm_id="182706">[US|Columbus, OH]</podcast:location>
(These are both the same place)

<podcast:location osm_id="1850813">[US|Columbus]</podcast:location>
(This is actually Columbus in Georgia)

<podcast:location osm_id="65606">[GB|London]</podcast:location>
<podcast:location osm_id="7646215719">[GB|Westminster, London]</podcast:location>
<podcast:location osm_id="1567699">[GB|Houses of Parliament, London]</podcast:location>
The above might all describe the same podcast, but there's significant benefit in using the last one. You might know it as the Houses of Parliament, but the correct name, as OSM will tell you, is the "Palace of Westminster". The bounding box for each of these gets progressively smaller, so a podcast for "London" is very different from one just for "Westminster".

<podcast:location osm_id="175905">[US|New York]</podcast:location>
<podcast:location osm_id="8398124">[US|Manhattan]</podcast:location>
The first one is actually New York City, rather than New York State, which can often be confused.
But what if the podcast is actually about Manhattan, and not Brooklyn?
For the NYC listing, it has "boundingbox":["40.477399","40.9161785","-74.25909","-73.7001809"] which covers all of New York City. But for Manhattan, you can get "boundingbox":["40.6839411","40.8804489","-74.0472219","-73.9061585"] which is just the Manhattan area.

<podcast:location osm_id="65606">[GB|London]</podcast:location>
An app configured in French can programmatically grab the right name in that language and return "Londres, Angleterre, Royaume-Uni"
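As a rough illustration of what a consumer could do with this format, here is a Python sketch that parses the proposed tag and compares two bounding boxes. It assumes the [CountryCode|Locality] text and osm_id attribute exactly as proposed above, and that bounding boxes arrive in the [south, north, west, east] order seen in the sample response; the function names are hypothetical, and no OSM API call is made.

```python
import re
import xml.etree.ElementTree as ET

# "[US|Columbus, Ohio]" -> country code "US", locality "Columbus, Ohio"
LOCATION_TEXT = re.compile(r"^\[([A-Z]{2})\|([^\]]+)\]$")


def parse_location(element: ET.Element) -> dict:
    """Extract osm_id, country code and locality from a location element."""
    osm_id = element.get("osm_id", "")
    match = LOCATION_TEXT.match((element.text or "").strip())
    if not osm_id.isdigit() or match is None:
        raise ValueError("expected an osm_id attribute and [CC|Locality] text")
    return {"osm_id": int(osm_id), "country": match.group(1), "locality": match.group(2)}


def bbox_within(inner, outer) -> bool:
    """True if the inner box lies entirely inside the outer box.

    Both boxes are [south, north, west, east] lists of numeric strings.
    """
    s1, n1, w1, e1 = map(float, inner)
    s2, n2, w2, e2 = map(float, outer)
    return s1 >= s2 and n1 <= n2 and w1 >= w2 and e1 <= e2
```

With the two bounding boxes quoted above, bbox_within(manhattan, nyc) is true and the reverse is false, which is the kind of check a location-based search could use.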

Data requirements

OSM is open. You can (and should) maintain your own database, but there are plenty of mirrors around the world. It does have a licence which requires credit. It would be helpful to understand what the deal is here with RSS feeds. I'm a member of the OSM Foundation, so I could go and peek a little more. Alternatives are Google Maps (proprietary), or GeoNames which still has an attribution requirement, and looks to be relatively un-maintained judging by the traffic in its forum.

Hope this shows the benefits of not just using raw text. Let's do this properly! ;)

<podcast:email> Is it really necessary?

RSS 2.0 already has the channel level elements managingEditor and webMaster.

webMaster is defined as:

Email address for person responsible for technical issues relating to channel.

It seems suitable for the current purpose described for <podcast:email>:

This is a channel-level element. An email address that can be used to verify ownership of this feed during move and import operations. This could be a public email or a virtual email address at the hosting provider that redirects to the owner's true email address.

I don't have a strong opinion on which should take priority, managingEditor or webMaster.

Source: https://www.rssboard.org/rss-specification

<podcast:location>'s purpose is not well-specified

<podcast:location> does not define when or how it should be used. It does not say, for instance, whether it should be used to describe where a podcast is recorded, the location a podcast talks about, or some other value.

In the case of an item, where the spec allows multiple location tags, it is unclear how a consumer would disambiguate multiple locations.

Additionally, no guidance is provided on the precision of the latlon value: latlon="7.0,-2.0" is perfectly valid, but represents an area that could be over 100x100 km in size. Guidance about precision should be defined, both to describe the intended location specifically and to protect privacy by avoiding unnecessarily detailed information about persons.
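To put rough numbers on the precision question, here is a small Python sketch. It assumes a consumer treats the number of decimal places as the precision of the value, and uses the common approximation of about 111.32 km per degree of latitude; both are assumptions for illustration, not anything the spec says.

```python
import math

KM_PER_DEGREE_LATITUDE = 111.32  # approximate; varies slightly with latitude


def cell_size_km(decimal_places: int, latitude: float):
    """Approximate (north-south, east-west) size in km of one coordinate step."""
    step_degrees = 10 ** -decimal_places
    north_south = step_degrees * KM_PER_DEGREE_LATITUDE
    # Lines of longitude converge towards the poles, so scale by cos(latitude).
    east_west = north_south * math.cos(math.radians(latitude))
    return north_south, east_west
```

Under these assumptions, whole degrees near latitude 7 describe a cell a little over 110 km on each side, one decimal place about 11 km, and four decimal places about 11 m, which is already precise enough to point at a single building.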

Since the use of this element is unclear, I'd strongly recommend removing <podcast:location> from the spec until a compelling need for it presents itself, whereupon additional metadata attributes can be specified for the tag, or updated to include missing context.

<podcast:contact> allows for multiple elements of the same type attribute

The spec should require that at most one <podcast:contact> exists within a <channel> for a given type attribute.

Currently, the following is valid:

<podcast:contact type="feedback" method="link">https://example.com</podcast:contact>
<podcast:contact type="feedback" method="link">https://foobar.com</podcast:contact>

There is no way to disambiguate these links, except to expose both of them.

I strongly recommend that the spec require that at most one tag be defined for each type.
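A validator for this rule would be small. Here is a Python sketch, assuming the feed declares the podcast prefix with some namespace URI; the URI below is a placeholder and not the official one, and the function name is made up.

```python
from collections import Counter
import xml.etree.ElementTree as ET

PODCAST_NS = "https://example.com/podcast-namespace"  # placeholder, not the official URI


def duplicate_contact_types(channel: ET.Element) -> list:
    """Return the contact types that appear more than once in a channel."""
    types = [
        el.get("type")
        for el in channel.findall(f"{{{PODCAST_NS}}}contact")
        if el.get("type") is not None
    ]
    counts = Counter(types)
    return sorted(t for t, n in counts.items() if n > 1)
```

An empty list means the channel satisfies the "at most one per type" rule.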

podcast:id needs URL

Service slugs historically in open source are exclusionist.

Here is some context

https://wordpress.org/support/article/embeds/

It is really hard for a video or podcast service with a cool player to get on that list.
For oEmbed to work, you have to either get a blog owner to reduce security or install a custom plugin.
Even embedly (now owned by Medium) hasn't managed to get on that list

The open-source project becomes a gatekeeper for what is acceptable use

You can have a list of common slugs, but anyone should be able to just use a new one for their service
You need to have a URL

In that way even if the gatekeepers of the common list don't include you, it can still work.

Thus podcast:id needs to follow the same format as podcast:funding & podcast:social

It could be argued that if you have the URL, then the service slug is actually redundant and adds maintenance overhead
Exploding a URL for a service name isn't rocket science.
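For what it's worth, a naive version of that "explode the URL" step looks like this in Python. It is only a sketch with a made-up function name: it takes the label just before the top-level domain, so it gives the wrong answer for multi-part suffixes such as .co.uk, where a public suffix list would be needed.

```python
from typing import Optional
from urllib.parse import urlparse


def service_from_url(url: str) -> Optional[str]:
    """Guess a service name from a URL's hostname (naive: second-to-last label)."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    if labels and labels[0] == "www":
        labels = labels[1:]
    if len(labels) < 2:
        return None
    return labels[-2]
```

For https://patreon.com/somecast this returns "patreon" without the feed having to carry a slug at all.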

podcast:funding you are going to end up with people defining 100 different options... just wait. Affiliate links to everything they can think of, links to their sales page for various products etc.

But open standards are better than exclusion

Capitalization vs. hyphenation on multiword tags.

Maybe this has already been decided, but I think we need to have a clear standard on capitalization vs. hyphenation. For example:

<podcast:newFeedUrl> vs. <podcast:newfeedurl> vs. <podcast:new-feed-url>

The RSS spec favors lower camel case (likeThis), but the "iTunes" spec favors hyphens. The "Google Play" spec doesn't have multiword tags.

I think we should stick with lower camel case.

Generic vs. specific slugs

Related to #23, I think we should adopt a standard for whether to use company names or product names.

For example, apple vs. applepodcasts (or apple-podcasts).

I see potential problems with either, so it's a matter of which problems we're willing to accept.

Take Google, for example, the PostIt Notes of products. They've had Google Listen, then nothing, then Google Play Music, and now Google Podcasts. So google-podcasts might be fine for now, but it could change again in the future. So this would seem like simply google would be the better choice.

On the other side, what if a platform branches? For example, Amazon, Amazon Music, Amazon Podcasts, Audible.

I think generally adopting a company / overall platform name will usually be more compatible and won't require rebranding. So google for Google Podcasts, apple for Apple Podcasts, amazon for Amazon Music, and so on. A specific app/platform name would be used when that is the recognized name of the company/platform, such as podcastaddict or stitcher.

And we would also need to standardize whether to use hyphens (apple-podcasts), camel case (applePodcasts), or jam it all together (applepodcasts).

podcast:alternateEnclosure

This is a great idea, and one of the reasons I've wanted to consider a rather more breaking change to RSS.

1 - Questions:

  • Can you have more than one alternateEnclosure? A use-case might be a video podcaster with a high and low bitrate video feed and an audio feed. Same feed, same show, different media files.
  • What is the default? Is it the standard enclosure? Or...?

2 - Suggestion: Could I please suggest an optional "title" attribute for UI purposes? While it may be programmatically possible to guess on "low quality" and "high quality", or to choose video vs audio, it's harder to offer accessible versions of the same file - a video with captions added, for example, or a video with audio description for people with sight difficulties?

One example:

<podcast:alternateEnclosure type="audio/x-m4a" length="1540076" bitrate="80" title="Audio - low bitrate">https://chtbl.com/track/podnews.net/audio/podnews201009.m4a</podcast:alternateEnclosure>
<podcast:alternateEnclosure type="audio/mpeg" length="2534548" bitrate="160" title="Audio - high bitrate">https://chtbl.com/track/podnews.net/audio/podnews201009.mp3</podcast:alternateEnclosure>

An example for video

<podcast:alternateEnclosure type="video/mp4" length="11540076" bitrate="512" title="Full video version">https://example.com/video.mp4</podcast:alternateEnclosure>
<podcast:alternateEnclosure type="video/mp4" length="11540076" bitrate="512" title="Video with audio description">https://example.com/video-ad.mp4</podcast:alternateEnclosure>
<podcast:alternateEnclosure type="video/mp4" length="11540076" bitrate="512" title="Video with captions">https://example.com/video-capt.mp4</podcast:alternateEnclosure>
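To illustrate what can and cannot be decided programmatically, here is a Python sketch of a player choosing among several alternateEnclosure elements by media type and bitrate. The namespace URI is a placeholder, the function name is hypothetical, and the attributes are the ones used in the examples above, including the suggested title.

```python
from typing import Optional, Tuple
import xml.etree.ElementTree as ET

PODCAST_NS = "https://example.com/podcast-namespace"  # placeholder, not the official URI


def choose_enclosure(item: ET.Element, media: str = "audio",
                     max_bitrate: Optional[float] = None) -> Optional[Tuple[float, Optional[str], str]]:
    """Return (bitrate, title, url) of the highest-bitrate match, or None."""
    best = None
    for el in item.findall(f"{{{PODCAST_NS}}}alternateEnclosure"):
        if not (el.get("type") or "").startswith(media + "/"):
            continue
        try:
            bitrate = float(el.get("bitrate", ""))
        except ValueError:
            continue  # no usable bitrate, so this entry cannot be ranked
        if max_bitrate is not None and bitrate > max_bitrate:
            continue
        if best is None or bitrate > best[0]:
            best = (bitrate, el.get("title"), (el.text or "").strip())
    return best
```

Bitrate and MIME type are enough to separate the two audio files, but the three video examples are identical on both, so only the title tells a user which one has captions or audio description.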

Chapter Markers

I'm documenting this here for discussion. The concern about the impact to the size of the RSS feed as well as the ability to extend what is accomplished with chapter markers has led to the proposal of separating the chapter markers into its own spec.

In the RSS Feed under:
<podcast:chapters url="http://podcasthost.com/episode/123/chapters.json" />

The actual Chapter JSON spec could be debated separately and could provide more room for growth and innovation independent of RSS.

xmlns document

What do you all think about moving entries into a 1.0 XMLNS document as things are agreed upon? This would allow us to begin rolling out some of these standards and include an authoritative XMLNS header. Buzzsprout will be adopting this standard for transcripts and could link to the document from all of our RSS feeds.

Note on "keep existing conventions"

For example, it would make sense to turn <podcast:explicit> into a unary element, where its existence is taken as a "yes" and its absence as a "no". But, that has never been the standard.

That's not quite true. For Apple Podcasts (although they've flip-flopped a couple of times), a show can be "explicit" (yes), "clean" (no), or unmarked (no tag). It seemed the way the community used these tags provided some good understanding of the maturity level of the content. For example, "clean" would be appropriate for children, "explicit" would be adults-only stuff (content or language), while unmarked would be PG or PG-13 level.

Along these lines, perhaps a <podcast:contentRating> tag would be better and it could follow general G, PG, PG-13, R, and X ratings. Or, you could follow the TV ratings style to specifically indicate what content is in the show/episode, like violence, language, sex, etc.

Podcast chapters JSON format should consistently match ID3v2 spec

The ID3v2 chapters spec defines a number of useful values which should be mirrored in the JSON spec:

https://id3.org/id3v2-chapters-1.0

  • endTime: Allows chapters to be non-adjacent. E.g., for ad breaks, post-roll, etc.

The chapters header also defines sub-frames which allow additional information. Besides a title, chapters can have other metadata like speakers, a description of the chapter, and more. It may be useful to allow a metadata member for chapter objects, which is defined with the following (TypeScript) definition, using template literal types:

type ProprietaryPrefix = string; // e.g., "pinecast"

type Metadata = {
  type: 'description' | `-${ProprietaryPrefix}-${string}`;
};
type Chapter = {
  startTime: number,
  title: string,
  img?: string,
  url?: string,
  metadata?: Array<Metadata>,
};
type ChaptersDocument = Array<Chapter>;
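To show what an endTime member would change for a player, here is a Python sketch of looking up the chapter for a playback position. It assumes chapters are plain dicts with startTime and an optional endTime, which is one possible reading of the proposal, and the function name is made up.

```python
from typing import Optional


def chapter_at(chapters: list, position: float) -> Optional[dict]:
    """Return the chapter covering position (in seconds), or None in a gap."""
    current = None
    for chapter in sorted(chapters, key=lambda c: c["startTime"]):
        if chapter["startTime"] <= position:
            current = chapter  # latest chapter that has already started
        else:
            break
    if current is None:
        return None
    end = current.get("endTime")
    # Without endTime a chapter runs until the next one starts; with it,
    # the stretch after endTime (an ad break, say) belongs to no chapter.
    if end is not None and position >= end:
        return None
    return current
```

With an intro ending at 30 seconds and the next chapter starting at 60, a position of 45 returns None instead of showing the intro's title over an ad.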

Rich content in Chapters

Making rich user experiences possible in podcasts

Now that we have <podcast:chapters> I would like to open up a discussion about the format of the JSON file it can point to, and how we could potentially use that to enable some super awesome stuff.

My motivation for actually starting Podfriend in the dawn of time was that I wanted to enable "rich content". Originally I thought for podcasts to be able to do it through a web UI that I made, but I think we have the chance to go much further than that.

An example
Here is an example of an experience I would like to support: https://apps.npr.org/lookatthis/posts/lovestory/ (you might have to go back to the URL after accepting cookies)

What we have today

For the sake of everyone being on the same page, this is what we have today for the element.

  • startTime (required - float) The time, expressed in seconds with float precision for fractions of a second.
  • title (required - string) The title of this chapter.
  • img (optional - string) The url of an image to use as chapter art.
  • url (optional - string) The url of a web page or supporting document that's related to the topic of this chapter.

I think the meaning behind these is fine, and still valuable. However, "img" and "url" themselves are very limited. Basically what we're expressing here is that while listening to a podcast, the most advanced thing we want to show the listener is an image, and that they can click on that image to go to a URL.

Now, I have 3 ways that I could see making "Richer content" possible.

The simple way

The first and most simple (and I might be leaning towards this) would be for players to send the current progress of the podcast to the "url". That way the HTML page itself could do all the heavy lifting, and basically anything would be possible. But this does require the player to always basically iframe/webview in the URL, and pass the progress with a postMessage to that resource.

The easier way

The other way would be to keep the existing attributes, make title optional and add an endTime as well as an htmlElement string attribute, that would simply be along the lines of:

{
startTime: 10,
endTime: 50,
htmlElement: '<video src="https://example.org/example.mp4" style="min-width: 100%;min-height:100%;z-index:0;" />'
},
{
startTime: 10,
endTime: 50,
htmlElement: '<a href="link.html" style="position: absolute;">Click here to donate!</a>'
}

Of course the players would most likely have to do some whitelisting of tags here, but in general it would be pretty powerful.
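As a rough idea of what that whitelisting could look like, here is a Python sketch using the standard library's HTML parser. The allow-list and function name are purely hypothetical. Note that checking tag names alone would not be enough in practice: attributes such as event handlers, style values and link targets would need the same treatment.

```python
from html.parser import HTMLParser

ALLOWED_TAGS = {"a", "img", "video", "p", "h1"}  # hypothetical allow-list


class _TagCollector(HTMLParser):
    """Records the name of every start tag (including self-closing ones)."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)


def disallowed_tags(html_element: str) -> set:
    """Return the tag names in html_element that are not on the allow-list."""
    collector = _TagCollector()
    collector.feed(html_element)
    collector.close()
    return collector.tags - ALLOWED_TAGS
```

A player would refuse to render any htmlElement for which this returns a non-empty set.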

The hard but most locked down way

If we don't want to enable any and all HTML tags, then the most platform-agnostic and locked-down option would be to have the htmlElement be much more defined (and we can remove the "html" part):

  • element (optional - struct) if present this is meant to be shown as "rich content" during the podcast.
    • type (optional - string) Can be video, image, text, link, form, formElement
    • id (optional - string) Unique identifier
    • parent (optional - string) Unique identifier of parent component, used for arranging content
    • style (optional - string) CSS styles
    • resource (optional - string) Remote location of the resource (eg. of the video or image, target of a link or form)

Here is an example of what I imagine

{
  "version": "1.0.0",
  "chapters":
  [
    {
      "startTime": 0,
      "title": "Intro"
    },
    {
      "startTime": 0,
      "endTime": 50,
      "element": {
        "type": "video",
        "style": { "min-width": "100%", "min-height": "100%", "z-index": 1 },
        "src": "https://myvideo.com/example.mp4"
      }
    },
    {
      "startTime": 0,
      "endTime": 50,
      "element": {
        "type": "h1",
        "content": "Welcome to the podcast!"
      }
    },
    {
      "startTime": 0,
      "endTime": 50,
      "element": {
        "type": "link",
        "href": "https://www.podcastindex.org",
        "content": "Visit our site and give a donation!"
      }
    }
  ]
}

I see pros/cons with all 3 approaches. I think the first and easiest solution is the one that would get the most adoption in the least amount of time, and would basically require no changes to the index (but would require that players took the lead), so I am personally leaning towards that.

Proposal <podcast:soundbite>

I think we should consider giving podcasters a way to identify one or more soundbites at the episode level (they already have trailers for the overall podcast). Apps could use these to generate episode previews, they could highlight them on their main browse screens, or they could build discover functionality like the old FM radio scan mode (check out the Shuffle app). I'm sure someone could also come up with some clever ways to use this to promote episodes on social sites...

<podcast:soundbite startTime="[123]" duration="[30]" />
OR
<podcast:soundbite startTime="[123]" duration="[30]">[Title of Soundbite]</podcast:soundbite>

Item (optional | multiple)

A short extract from the episode, chosen because it summarizes the episode and/or encourages further engagement.

  • startTime (required) The time where the soundbite begins
  • duration (required) How long is the soundbite (recommended between 15 and 120 seconds)
  • node value (optional) Used as free form string from the podcast creator to specify a title for the soundbite (otherwise default to episode title)
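Here is a Python sketch of how an app might read these, following the attribute list above: both attributes are required, and the node value falls back to the episode title. The namespace URI is a placeholder, the function name is made up, and times are assumed to be in seconds.

```python
import xml.etree.ElementTree as ET

PODCAST_NS = "https://example.com/podcast-namespace"  # placeholder, not the official URI


def parse_soundbites(item: ET.Element, episode_title: str) -> list:
    """Return the valid soundbites of an item as a list of dicts."""
    soundbites = []
    for el in item.findall(f"{{{PODCAST_NS}}}soundbite"):
        try:
            start = float(el.get("startTime", ""))
            duration = float(el.get("duration", ""))
        except ValueError:
            continue  # a required attribute is missing or not a number
        if start < 0 or duration <= 0:
            continue
        # An empty node value means "use the episode title".
        title = (el.text or "").strip() or episode_title
        soundbites.append({"startTime": start, "duration": duration, "title": title})
    return soundbites
```

The 15 to 120 second range is only a recommendation, so the sketch does not reject durations outside it.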

<podcast:newFeedUrl> tips

<itunes:new-feed-url> and feed-redirecting are commonly misunderstood by many podcasters (who really shouldn't have to worry about the tech). It seems like some of those misconceptions might be present here, so I want to help clarify.

It doesn't matter what you put in an RSS feed when it's being 3xx-redirected. The feed contents won't load because the header redirects.

But not all podcast-publishing tools or hosting providers allow redirects. That's where <itunes:new-feed-url> can be used as a pseudo-redirect, but it works for only a few podcast apps and is thus not nearly as effective as a 3xx redirect. It also requires that the feed be accessible and that the tag be visible, whereas a 3xx redirect can be placed on a URL with absolutely no file present.

The other way Apple Podcasts and iTunes use <itunes:new-feed-url> is to confirm a podcast's new feed. Apple even recommends using it this way.

Here's a typical scenario: Redirect /oldfeed to /newfeed. Then place <itunes:new-feed-url> on the new feed pointing at itself. This helps Apple Podcasts (and maybe some other apps) to more quickly confirm the new feed and change their catalog.
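The confirmation step in that scenario can be sketched in Python as follows. This is a guess at how an app might check it, not a description of what Apple Podcasts actually does: after following any 3xx redirects, compare the feed's <itunes:new-feed-url> with the URL the feed was finally fetched from. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

ITUNES_NS = "http://www.itunes.com/dtds/podcast-1.0.dtd"


def confirms_new_feed(feed_xml: str, fetched_url: str) -> bool:
    """True if the feed's itunes:new-feed-url points at the URL it was served from."""
    root = ET.fromstring(feed_xml)
    tag = root.find(f"channel/{{{ITUNES_NS}}}new-feed-url")
    if tag is None or not tag.text:
        return False
    return tag.text.strip() == fetched_url
```

A real implementation would probably normalize the two URLs (trailing slashes, case of the host) before comparing; the sketch uses exact string equality.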

XML Syntax Guidelines

I get that XML is ugly, but this is the language of RSS. Therefore, we should be writing the best XML possible.

Current syntax:
<podcast:funding platform="[service slug]" title="[user provided note (string)]">[url for the show at the platform]</podcast:funding>

Proposed syntax:
<podcast:funding platform="[service slug]" url="[URL to funding platform]">[user provided content to be linked]</podcast:funding>

The guiding principle here is that XML is intended for both computers and people. Content that's intended for computers should be included in the element's attributes. Content that's intended for people should be included between the element's opening and closing tags.

Empty elements must still be closed. To be as concise as possible, I propose we agree to the short-hand syntax (I know it's gross).

Empty element syntax:
<podcast:transcript url="https://podcastindex.org/ep0002/transcript.json" type="application/json" language="es" rel="captions" />

<podcast:transcript> format support is underspecified

Transcripts should prescriptively define transcript format support. HTML5, for instance, requires certain formats (e.g., WebM) to be implemented by browsers to be considered compliant. This allows both browsers and developers to produce files which work for an overwhelming majority of users. By underspecifying transcript format support, the tag does not reduce fragmentation.

I recommend requiring support for the following MIME types:

  • text/plain - Displayed as plain text
  • text/html - Displayed as rich text, with minimum support for basic formatting elements (e.g., p, div, br, b, etc.)
  • application/srt
  • text/vtt (WebVTT) - Displayed in an appropriate way; formatting may be ignored

I would strongly recommend against specifying a new JSON file format as part of this spec.

Feelings on <podcast:alternateEnclosure>?

How are we feeling about this tag? I've seen very little comment on it, but I know it's something that has been mentioned as needed for alternate bit rates and such. I see it as being used very infrequently. But, when needed, could be a great help. Is there any downside to this, being as how it's optional?

Consider requiring HTTPS for all new tag URLs

When loading content like chapters, transcripts, etc., consider requiring HTTPS. Apple and others have indicated that they will require feeds to be served over HTTPS in the future. All modern browsers currently warn web users when they are visiting pages which are not served over HTTPS.

Requiring HTTPS provides strong privacy guarantees. Podcasting's distributed nature guarantees some amount of anonymity. However, it's conceivable that ISPs, malicious VPN providers, wifi access points, and others would sniff traffic to serve targeted ads based on content seen in podcast assets. HTTPS support is essentially 100% for all podcast hosting providers, and an overwhelming majority of podcasters today, so this constraint introduces no significant burden to any podcast producer.
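Enforcing this on the consuming side is cheap. A minimal Python sketch, with a made-up function name, that a parser could apply to any URL found in the new tags:

```python
from urllib.parse import urlparse


def is_https_url(url: str) -> bool:
    """True only for absolute URLs that use the https scheme and name a host."""
    parsed = urlparse(url.strip())
    return parsed.scheme == "https" and bool(parsed.netloc)
```

A validator could reject, or a player could simply ignore, any chapters or transcript URL for which this returns False.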

<podcast:alternateEnclosure> differs from RSS spec

<podcast:alternateEnclosure> specifies the length attribute to be "Duration of media asset in seconds". Per the RSS spec, length is defined as:

length says how big it is in bytes

Consistency is important here: some podcast hosts have in the past (and currently) provided incorrect values for this attribute.

Improper conflations between "platform" and "host"

I see some potential misuses of the term "platform." For example, Apple Podcasts, Google Podcasts, Spotify, and such are platforms. But Captivate, Fireside, and such are hosting-providers.

Also, there can be easy confusion with the word "host." For example, is "podcast host" the person hosting the podcast, or the company hosting the media files? The way I distinguish these two things is using "podcast host" to refer to the person and "podcast-hosting provider" to refer to the company.

Canonical home and other links

I am currently an outsider to Podcasting looking in.
I have done a fair bit with video SEO, enclosures, oEmbed etc.

Historically there is the concept "All roads lead to Rome"
In SEO we think of "All roads lead to Home"

  1. Home

I know you have links for the author home, and the transcript, but that isn't the home page of the actual Podcast.
i.e. the super optimized page just about the Podcast with the perfect layout, subscribe links in the order of your preference, email capture etc.
You need to have something for this.

  2. Subscribe links

You should be able to specify where your Podcast resides on various services and your preferred order.
There are key issues here

  1. Platforms die... you might want to prioritise services that will still be around in 5 years e.g. Apple and Spotify
  2. Different services pay more per stream. Hinting for them to use 1 service over another can affect your income. e.g. Spotify then Apple

Service migration

If you are doing this namespace, you might also want to run a parallel initiative to ensure there is a standard for exporting and importing subscriptions between platforms.

Need function comparable to robots.txt, but to guide podcast indexers

This namespace element was motivated by a situation where a podcast approached podcastindex, noting that they had two feeds for their podcast (one for most people, another motivated by their Chinese audience), and wanted the latter to be omitted from podcast index. This request was forwarded, and handled manually. But it would be nice if a podcast RSS feed could directly annotate whether it should be indexed, without human intervention.

This problem has similarities to the robots.txt file, used to guide whether search engines should index a site in whole, part, or not at all. But since an RSS feed is complete unto itself, most likely one or so XML fields is preferred to a separate file.

The goal is to permit a podcast's RSS feed to indicate which indexers should scan it and publish results. Alternatively, it might be helpful to indicate which indexers should not scan/index it.

Each podcast indexer is probably associated with a standard domain name (spotify.com, podcastindex.org, etc.). This pattern match allow/disallow will most likely operate with respect to this domain name.

This ticket will serve as a focus point for discussion of the mechanism. When RSS element name(s) are published (or some other mechanism is accepted), it can be closed.

<podcast:funding> does not consider prior art

There are multiple examples of prior art related to <podcast:funding>. The tag in spec very closely aligns with RawVoice's <rawvoice:donate> tag. The most notable (and well-supported tag) is RadioPublic's:

https://radiopublic.com/schema/1.0/#action

RadioPublic provides <rp:cta> and <rp:action>, which allow you to craft a tag like this:

<rp:cta headline="Headline" subtitle="Subtitle">
  <rp:action class="financial" disposition="positive" href="...url..." label="Donate" />
</rp:cta>

This implementation has the following advantages:

  • Custom verb for financial contribution
  • Not financial donation-specific (class can be extended to other CTA types)
  • Context can be added beyond the scope of a single button or link (with headline and subtitle)

I'd suggest adding additional attributes or tags to allow podcatchers to create UIs that are more expressive than a single button or link.

I'd strongly recommend replacing the wording "string length" with "string character length", and increasing it beyond 128: some languages which make heavy use of diacritics may be artificially limited to short CTAs, which biases the spec towards Latin-derived languages.
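The difference between the possible meanings of "length" is easy to demonstrate. Here is a small Python sketch comparing Unicode code points, UTF-8 bytes, and code points after NFC normalization; which of these the spec should count is exactly the open question, and the function name is made up.

```python
import unicodedata


def lengths(text: str) -> dict:
    """Measure a string three ways that a 'max length' rule could mean."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "nfc_code_points": len(unicodedata.normalize("NFC", text)),
    }
```

For example, an "e" followed by a combining acute accent (U+0301) is one visible character, but counts as two code points and three UTF-8 bytes, and as one code point only after NFC normalization. A limit expressed in bytes or unnormalized code points therefore cuts accented text shorter than unaccented text.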

podcast:imageLarge and variants

These image variants are marked as:

This is assumed to point to an image that is 300px to 999px in size.

a) As a developer, I'd really like to have standard sizes, not rough hand-waving "somewhere in this range". If this is deemed unacceptable... I know we want to avoid attributes, but it would be wonderful to have size="666" in here, so that we know what size the image is before retrieving it.

b) It would be very helpful, please, to stipulate that images MUST be square.

c) I'm assuming that the image does not need to be the same for these sizes. This is a good thing if so - but it should be documented if that's acceptable.

The use case is that

[image: 32x32 icon]

...works really well at tiny sizes, while

[image: podnews_200x200]

...works better at larger sizes.

podcast:funding direct (Bitcoin Lightning) payment requests

Hi!

Would podcast:funding be able to hold an lnurl [1]? It's an encoded Bitcoin Lightning payment request, so not an HTTP link per se.

Would look like:

lightning:LNURL1DP68GURN8GHJ7UM9[..]CMYXYMNSERXFQ5FNS

and ideally the wallet app of the user would open this url. Possibly a special attribute would be needed to detect "direct payment requests" instead of trusting it's a web link. There will possibly be more platforms so I'm thinking having a bitcoin/lightning-specific tag might not be the best idea.
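
Even without a special attribute, an app could tell the two cases apart by looking at the URI scheme. A minimal sketch, assuming a hypothetical set of payment schemes (only `lightning` here) and a made-up helper name:

```python
from urllib.parse import urlsplit

WEB_SCHEMES = {"http", "https"}
# Assumption: schemes the app hands to a wallet instead of a browser.
PAYMENT_SCHEMES = {"lightning"}


def classify_funding_target(value):
    """Classify a funding value as 'web', 'payment', or 'unknown' by its scheme."""
    scheme = urlsplit(value.strip()).scheme.lower()
    if scheme in WEB_SCHEMES:
        return "web"
    if scheme in PAYMENT_SCHEMES:
        return "payment"
    return "unknown"
```

New platforms would then only need a new entry in the scheme set rather than a new tag.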

Wonder what you think. Would be cool to have this implemented in the first new namespace. I think being able to pay a podcaster directly from a bitcoin lightning wallet would be great.

Thanks for considering!

[1] https://github.com/btcontract/lnurl-rfc

is the "chapters" container necessary?

Is <podcast:chapters> necessary? It has no attributes, and just functions as a container for <podcast:chapter> nodes. Do the chapter nodes have to be "contained"?

tag for social media

I would like to suggest adding a tag for social media links for the podcast.
It should be allowed multiple times and be located within the channel tag.

It could have this format:

<podcast:social platform="[platform slug]" url="[link to social media account]" name="[name of the social media platform]">[social media handle]</podcast:social>

example:

<podcast:social platform="twitter" url="https://twitter.com/mypodcast" name="Twitter">mypodcast</podcast:social>

Reference: tag handle at https://github.com/socialrss/socialrss/blob/master/DOCUMENTATION.md
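
For what it's worth, reading such a tag would be straightforward for apps. A sketch using Python's standard library; the namespace URI is a placeholder I made up for the example, and the second `<podcast:social>` entry is invented to show the tag repeating:

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI, used only for this sketch.
PODCAST_NS = "https://example.com/podcast-namespace"

FEED = f"""<rss version="2.0" xmlns:podcast="{PODCAST_NS}">
  <channel>
    <title>My Podcast</title>
    <podcast:social platform="twitter" url="https://twitter.com/mypodcast" name="Twitter">mypodcast</podcast:social>
    <podcast:social platform="mastodon" url="https://example.social/@mypodcast" name="Mastodon">@mypodcast</podcast:social>
  </channel>
</rss>"""


def social_links(feed_xml):
    """Collect every channel-level <podcast:social> element as a dict."""
    channel = ET.fromstring(feed_xml).find("channel")
    return [
        {
            "platform": el.get("platform"),
            "url": el.get("url"),
            "name": el.get("name"),
            "handle": (el.text or "").strip(),
        }
        for el in channel.findall(f"{{{PODCAST_NS}}}social")
    ]


for link in social_links(FEED):
    print(link["name"], link["handle"], link["url"])
```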

More audio/video metadata in feed?

Like chapters and related to this conversation, I think we need to consider bringing some information usually only in the enclosure file into the feed.

Primarily, I'm thinking about duration.

Unfortunately, <enclosure length="…"> is not reliable. It's the size of the file in bytes, and you can't determine length from size (get your mind out of the gutter!).

For example, 64 kbps and 128 kbps audio will have significantly different length values even if the audio is the same duration.

Also, two 64 kbps files could have different byte sizes depending on how big the ID3 header information is (significantly affected by the embedded image).
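
To put rough numbers on this, here is a back-of-the-envelope calculation. It assumes constant-bitrate audio and treats the tag size as a plain number of bytes; the function name is just for illustration.

```python
def audio_bytes(duration_seconds, bitrate_kbps, tag_bytes=0):
    """Approximate size of a constant-bitrate file: audio payload plus tag data."""
    return duration_seconds * bitrate_kbps * 1000 // 8 + tag_bytes


one_hour = 3600
print(audio_bytes(one_hour, 64))                     # 28800000
print(audio_bytes(one_hour, 128))                    # 57600000
print(audio_bytes(one_hour, 64, tag_bytes=500_000))  # 29300000
```

The same one-hour episode gives three different byte counts, so a player cannot work backwards from the enclosure size to the duration without also knowing the bitrate and the tag size.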

In addition to duration, maybe there would be the need to clearly indicate whether it's audio or video.

So maybe we need something like <podcast:episodeMedia> or <podcast:mediaMeta> with sub-elements for duration.

Or merge these needs with the expanded enclosure model so each enclosure can have more metadata for it.

Captions and dynamic content

Just a question (because I've not been following it very closely)...

Captions - a transcript with a timecode... how does that work with dynamic content insertion, like ads or other content?

If a podcast starts with a 30" shouty ad for a VPN that has been dynamically inserted for listeners in the US, what happens when I listen in Australia and don't get that ad? Are all the captions thirty seconds "out"?

(I wonder how we might fix that, if so. Do the captions go through the ad server along with the audio?)
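
One possible direction, sketched very loosely: if captions were timed against the ad-free audio and whoever stitches the ads knew where each insertion went, the cues could be shifted to match. The cue and insertion structures below are hypothetical, and the sketch assumes an ad never lands in the middle of a cue.

```python
def shift_cues(cues, insertions):
    """Shift caption cues to account for dynamically inserted audio.

    cues: list of (start, end, text), in seconds on the ad-free timeline.
    insertions: list of (position, duration), positions on the same timeline.
    """
    shifted = []
    for start, end, text in cues:
        # Every insertion at or before the cue pushes the cue later.
        offset = sum(duration for position, duration in insertions if position <= start)
        shifted.append((start + offset, end + offset, text))
    return shifted


cues = [(0.0, 2.5, "Welcome"), (2.5, 5.0, "to the show")]
print(shift_cues(cues, [(0.0, 30.0)]))  # US listener with a 30-second pre-roll
print(shift_cues(cues, []))             # listener who gets no ad
```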

<podcast:alternateEnclosure> should not be allowed at the channel level without more work

The spec allows for <podcast:alternateEnclosure> to be present in the <channel>. While this is a fun idea, it has a number of challenging problems that deserve more attention before formally allowing it.

For one, the value isn't an "alternate" to anything. It's just an enclosure. Semantically, the tag makes no sense.

Some RSS extensions already support the channel-level case with other tags, which provide more prescriptive guidance on their use:

  • RadioPublic has the <rp:gateway-episodes> and <rp:greatest-hits> elements, which reference the GUID of an <item> https://radiopublic.com/schema/1.0/#gateway
  • Apple's spec allows for an episode type to be defined on an <item>, which allows a full <item> to describe trailers and bonus content. A more expressive variant of this would allow the show's content to be visible in a backwards-compatible way to all podcast apps, and make that content mobile between hosts that do not support this spec.

Consider adding payment method support on <podcast:funding>

It should be possible to note which payment method types are supported for a given method of funding. Users are strongly motivated to use payment methods which are well-supported in their area. PayPal availability, for instance, may drive South American users to make contributions. Credit Card support may drive US and UK users to make contributions. Some countries may have very high non-card payment method adoption (iDEAL, SEPA, FPX, WeChat Pay, etc.).

Transcripts

  1. I would suggest that podcast:transcript and podcast:captions are really the same thing provided in different formats. The way we have approached it with Buzzsprout makes use of the XML type. I know this may go against your Goal #2, but it really does accurately capture what is being represented and avoids creating a new tag when people want to make use of another format for transcripts e.g. JSON, WebVTT.

  2. Transcript language seems redundant with the language of the podcast which may be better captured with podcast:language.

<podcast:chapters> should have a mandatory media type

Allowing any chapter format provides no prescriptive guidance to podcast apps for required support. If a podcast hosting service serves chapters with a text/plain file, it's not useful to podcast apps (unless they manually implement this support). Strongly defining support guarantees that implementors of the tag (for both hosts and apps) all speak the same language.

I propose defining a media type:

application/audio-chapters+json

This media type should represent the chapters format specified in the jsonChapters.md file. Unlike application/json, which represents any JSON format, this specifically requires support for the format defined in this spec. This is consistent with the media type for RSS, application/rss+xml.
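
On the app side, the check then becomes a one-liner. A sketch, assuming the proposed media type above is adopted as written (it is only a proposal in this issue) and that any parameters after the type are ignored:

```python
# Proposed in this issue; not a registered media type.
CHAPTERS_MEDIA_TYPE = "application/audio-chapters+json"


def is_chapters_type(content_type):
    """Return True if `content_type` names the proposed chapters format."""
    essence = content_type.split(";", 1)[0].strip().lower()
    return essence == CHAPTERS_MEDIA_TYPE
```

An app can reject `text/plain` or a bare `application/json` up front instead of downloading the file and guessing at its structure.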

I strongly recommend against allowing any media type, or an ambiguous JSON media type.

Consider removing method attribute from <podcast:contact>

<podcast:contact> defines a method attribute, which overloads the behavior of the tag to support two different types of values: emails and URIs. However, support for email addresses is already well-supported with the mailto: scheme. The specification can be simplified by removing the method attribute and requiring app support for the mailto: scheme, which should trigger the consumer's email client.

Additionally, the specification should limit the URI schemes supported by the tag: data: and file:// are likely security risks or could lead to unintended abuse of the spec.
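
A sketch of what scheme-based handling could look like in an app. The allowlist (`mailto`, `http`, `https`) and the returned action names are my assumptions, not part of the spec:

```python
from urllib.parse import urlsplit

# Assumption: the only schemes an app would act on for a contact value.
ALLOWED_CONTACT_SCHEMES = {"mailto", "http", "https"}


def contact_action(uri):
    """Return 'email', 'browser', or None (rejected) for a contact URI."""
    scheme = urlsplit(uri.strip()).scheme.lower()
    if scheme not in ALLOWED_CONTACT_SCHEMES:
        return None
    return "email" if scheme == "mailto" else "browser"
```

Here `mailto:host@example.com` opens the email client, an `https:` link opens the browser, and `data:` or `file://` values are dropped without needing a separate method attribute.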

platform attribute on <podcast:funding> should be removed

The platform attribute on <podcast:funding> should be removed. Podcatchers can use this attribute to hide links to support podcasts on platforms which the developer does not agree with.

Consider the (hypothetical) case where Spotify were to use this attribute to ignore <podcast:funding> tags which set platform="patreon" to drive podcasters to use Anchor's donation platform instead.

Service slugs do not meaningfully improve the user experience and expose information which could be abused by developers.

podcast:contentRating -- overlaps with Explicit?

(copied from podcastindex.social)
podcast:contentRating .. I have thoughts about this. We already have <(itunes|googleplay):explicit> to signal that listening might require a mature/adult audience. The suggested rating grades are also very local to the USA, while listening is global. Most European countries have age levels, but for movies ... I've never heard of anything for podcasts or radio.

<podcast:transcript> different formats / support for captions

A transcript might be in HTML and have formatting applied to it. That is best handled as a link to an external website.

A transcript might be in raw text, without formatting. That is best handled internally within a podcast app.

A caption is also a transcript - but with additional timing information within it. My suggestion is that this might live in this as well.

As a suggestion therefore (and I don't know how to deal with more than one):

<podcast:transcript type="text/html">[url to a website]</podcast:transcript>
<podcast:transcript type="text/plain">[url to a text file]</podcast:transcript>
<podcast:transcript type="text/srt">[url to a SRT captions file]</podcast:transcript>

This would enable this tag to be used for captions as well as transcripts, and would add flexibility. Of course, a text/plain file can be viewed in a web browser as well, though a text/srt file (or a text/ts file) may be a little harder to view that way.
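
On dealing with more than one: an app could simply pick the first type it knows how to handle. A sketch, where the preference order and the function name are my assumptions and the input is a list of (type, url) pairs already read from the feed:

```python
def pick_transcript(transcripts, preferred=("text/srt", "text/plain", "text/html")):
    """Return the (type, url) pair the app likes best, or None if none match.

    transcripts: list of (media_type, url) pairs from <podcast:transcript> tags.
    preferred: media types in the app's own order of preference (an assumption).
    """
    by_type = {media_type.lower(): url for media_type, url in transcripts}
    for media_type in preferred:
        if media_type in by_type:
            return media_type, by_type[media_type]
    return None
```

An app that renders captions would put text/srt first, while one that only links out could prefer text/html.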

As an addition - I wrote an article about the differences between transcripts and closed captions on Medium (that's a friend link and will go past any paywall). They are different, but I think it can work here to combine them into one tag, given they're different filetypes by necessity.

Should the tag <podcast:locked> have a value?

Since this tag has boolean behavior, it could be empty, with its meaning taken simply from whether or not it exists in the channel. It would not be necessary to specify a yes or no value.

I can understand though that with the current format it is more human readable.

Do we need <podcast:newFeedUrl>?

There is already the commonly used <atom:link href="[Feed url]" rel="self" type="application/rss+xml" />

podcast:previousUrl is very good to have, my only concern is that it may not be verifiable and people could use it in an unreliable and/or harmful way - i.e. I list in my feed as a previous url the current or previous url of another show (ok, if that show is verified, this would be easy to detect and resolve, but still worth mentioning for unverified shows).
