Greetings!
I think adding http://schema.org/VideoObject markup to the VideoPress output would be a great addition, so that the content is better indexed by Google, et al. I would be happy to do a Pull Request, as I believe I can do the bulk of the coding myself (with a few notable exceptions). But I wanted to post first to give an opportunity for feedback in case there are important things I should take into consideration. I have some free time this week, so I'd love to go ahead and knock out what I can soon. :)
Below is an example of my proposed changes. This example is for the html5_dynamic player, but the same ideas should extend to the other forms of output as well (ie, html5_dynamic, flash_object). For clarity I'm omitting unimportant details and elements.
<div id="v-GUID-1" class="video-player" itemprop="video" itemscope itemtype="http://schema.org/VideoObject">
<div id="v-GUID-1-placeholder" class="videopress-placeholder" style="...">
<div class="videopress-title" style="..." dir="ltr" lang="en"><span itemprop="name" style="...">This is My Video File</span></div>
<img itemprop="thumbnail" class="videopress-poster" alt="This is My Video File" title="Watch: This is My Video File Title" src="..._scruberthumbnail_1.jpg">
<div class="play-button">...</div>
<div style="...">...videopress.png...</div>
<script type="text/javascript">...</script>
<noscript><p>JavaScript required to play <a itemprop="contentUrl" hreflang="en" type="video/mp4" href="http://videos.videopress.com/GUID/testfile_dvd.mp4">This is My Video File</a>.
Video Description: <span itemprop="description">This is the description for this video!!.</span>
</p></noscript>
<!-- Or perhaps put Description here? -->
<div class="video-description">
<span itemprop="description">This is the description for this video!!.</span>
</div>
<div class="video-transcript" itemprop="transcript">
<h5>Video Transcript</h5>
<p>Lorem ipsum dolor sit amet, <b>cyrenensi</b> reversus ait in lucem. Autem
est cum unde ascendit. Quantum est in fuerat est Apollonius ut libertatem
petitiones tulit animo suscipiens secrete fugio naves. Mea ego illum decidat quam
dolore obiectum invidunt kasd obiectum ait est cum. Opto cum suam ad nomine
Maria cum obiectum invidunt kasd habeo in lucem concitaverunt in deinde vero
diam nostra praedicabilium subsannio.</p>
</div>
<meta itemprop="duration" content="PT1M33S" />
<meta itemprop="uploadDate" content="2011-07-05T08:00:00+08:00" />
<meta itemprop="expires" content="2012-01-30T19:00:00+08:00" />
</div>
For the above I'm using Schema.org and https://support.google.com/webmasters/answer/2413309?hl=en as references (but in places where Google differs from Schema, such as thumbnailURL vs thumbnail, I've gone with Schema). Per my testing with http://www.google.com/webmasters/tools/richsnippets the above code is correctly recognized (with real code in the place of "..." of course).
Most of the above is pretty straightforward. However there are several items that do require further discussion:
1) <meta itemprop="uploadDate" content="2011-07-05T08:00:00+08:00" />: This is not currently available via https://github.com/Automattic/jetpack/blob/master/modules/videopress/class.videopress-video.php. My proposal is that we use the Attachment Upload Date which is stored in WordPress. The problem, though, is that this is stored in the associated WordPress.com account, and so is not readily accessible. uploadDate is Recommended by Google, but not Required. However, there are other items for which we have the same quandary (ie, how do we get access to the WP.com data).
2a) <span itemprop="description">: My proposal is that we use the Media Caption attribute to store this data. Though this is a little linguistically confusing in the context of it being a video file (ie, using the Caption field to store Description data), it is consistent with things WordPress is already doing (eg, if you look at the Attachment page for a video on WordPress.com, the og:description tag uses the Caption for its data). I also think this makes sense in that a Caption is usually short and without formatting (thus the absence of formatting options for the Captions field in the WP interface), and that seems consistent to me with the idea of the video Description.
2b) description is a Required field for Google, but we currently don't have any way (that I know of) of accessing the Caption field data for a given video. We do currently have access to the $post_id (https://github.com/Automattic/jetpack/blob/master/modules/videopress/class.videopress-video.php#L42), but we would need some further method of using that to access the Caption field for a given video. I have no idea how much or little might be involved in that (ie, whether it might involve significant work outside of the Jetpack plugin), and this is obviously something that I would need someone's help with.
2c) description Placement: Any thoughts on placement of the description data? My first thought (which I've done above) was to put it in the <noscript> tag, as then it is useful to those without JS. But since then I've thought that some users may wish to actually display the Description, and putting it in the <noscript> tag is not conducive to that. So I do wonder if it might be better to put it outside (and perhaps directly after) the <noscript> tag and just give it a display:none by default. Then any VideoPress users that want to display it can just target it with CSS. I believe that Google will treat it the same either way in this case (I know that in some cases Google doesn't like display:none, but in this case, given that the Schema data can also be presented in hidden <meta itemprop markup, the use of display:none should be fine, and not considered "cloaking").
3a) transcript Mapped to WP Attachment Description Field: Similar to the above discussions, I propose that we use the Description field for the WordPress Attachment to store the transcript data. Though again a little linguistically confusing, I think it ultimately makes the most sense in that a transcript is likely to be much longer than a description (stored in the WP Caption field), and also much more likely to benefit from rich text (which the WP Description field supports). I think that being able to add transcript data to the page displaying one's video would be very beneficial both for Google and for users (eg, they will have a higher likelihood of finding a video by searching, and there is also opportunity for website owners to creatively display the transcript data to users that want/need it).
3b) While transcript is not required by Google, I do think it would be extremely useful to have. However, just as with description and uploadDate, I don't yet know how to access the WP.com Description Field data for a given video/attachment via VideoPress in Jetpack. It seems like this should be possible, but I'll definitely need some help here.
3c) <div class="video-transcript" itemprop="transcript">: I went ahead and proposed the class video-transcript as it seems in keeping with the other classes that VideoPress uses. As with the description, I think it might be best to use display:none by default, and then let the user override that with their own CSS.
4) For all of these my proposal is that we'd only output data if we have it. For example, if there is no description then don't output anything for that (no point in having an empty container).
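For instance, a video with a title and poster but no description or transcript might produce something like the following, with no empty itemprop containers. This is a trimmed sketch in the same style as the example above, not actual VideoPress output.

```html
<!-- Sketch: no description or transcript available, so neither element is emitted -->
<div id="v-GUID-1" class="video-player" itemprop="video" itemscope itemtype="http://schema.org/VideoObject">
  <div class="videopress-title" style="..." dir="ltr" lang="en"><span itemprop="name" style="...">This is My Video File</span></div>
  <img itemprop="thumbnail" class="videopress-poster" alt="This is My Video File" src="..._scruberthumbnail_1.jpg">
  <noscript><p>JavaScript required to play <a itemprop="contentUrl" hreflang="en" type="video/mp4" href="http://videos.videopress.com/GUID/testfile_dvd.mp4">This is My Video File</a>.</p></noscript>
</div>
```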
5) Considerations regarding current use of Caption and Description fields: It does occur to me that some users may be making use of the Media Caption and/or Description fields for their own internal purposes, and so it would probably be important to have some kind of mass-communication with VideoPress customers about this before implementing the description and transcript parts.
6) An alternative to storing (and accessing) description and transcript data in the Media attributes would be to enable the user to enter this data into the WordPress post as part of the [shortcode]. This has pros and cons. A pro is that it avoids any issues raised in (5) above, as well as all the potential issues with accessing the Media attributes. A major con is that the description and transcript don't follow the video as it is reposted on WP.com, or if the user just decides to post their video in multiple places on their site. But perhaps those can be acceptable losses? If we were to go this route, the description could be entered as [shortcode description="This is my description"] and the transcript could be entered by making the [shortcode] enclosable, eg,
[wpvideo Lp3EoizV description="This is my description"]
<b>Transcript</b>
<p>Here is my transcript... </p>
[/wpvideo]
And so the enclosed content would all get dumped into the <div class="video-transcript" itemprop="transcript"> div, and the user can format the enclosed content however they want. Perhaps if we go this route it would actually be more intuitive not to use display:none on the transcript div, as I would think that a typical user who entered a bunch of text into the enclosed [shortcode] would expect to actually see it output.
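As a rough illustration, the enclosing [wpvideo] example above might render along these lines. The placement of the description and the elided player markup are assumptions; only the video-transcript div and the itemprop names come from the proposal.

```html
<!-- Sketch of possible output for the enclosing [wpvideo] example; player markup elided -->
<div id="v-GUID-1" class="video-player" itemprop="video" itemscope itemtype="http://schema.org/VideoObject">
  ...
  <div class="video-description">
    <span itemprop="description">This is my description</span>
  </div>
  <div class="video-transcript" itemprop="transcript">
    <b>Transcript</b>
    <p>Here is my transcript... </p>
  </div>
</div>
```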
I suppose another pro to the [shortcode] option is that it is probably more intuitive for most users to enter the description and transcript in the same place they are posting their video. Even though storing this data in the Media attributes would be more central, I could see it being confusing for the average user (who probably doesn't understand the underlying structure of things). Yet another pro is that I should be able to implement this just fine, whereas with the Media attributes route I'll definitely need help.
The more I think about it, the [shortcode] route might be the best way to go (unless going the Media attributes route is easier than I anticipate). If we did this, then perhaps we'd use the date of the blog post as the uploadDate, or maybe just leave it off entirely (as it doesn't seem super-important, in my opinion)?
Depending on which route we take, there may be some significant hurdles to implementing the portions of the above that require accessing the Media Attachment attributes. And even though one of those (ie, description) is considered Required by Google, I still think there would be some benefit to implementing what can be done (and what I should be able to do a Pull Request for just fine) while we sort out the potentially more complicated pieces.
I look forward to thoughts, feedback, suggestions, etc. from whoever handles the VideoPress part of Jetpack, as well as getting to contribute some code to this great project. :)
Sean