Monthly Archives: July 2009

First experiments with itext

My accessibility work for Mozilla is showing first results.

I have now implemented a demo for the previously proposed <itext> element. During the development process, the specification became more concrete.

I’m sure you’re keen to check out the demo.

Please note the following features of the demo:

  • It experiments with four different types of time-aligned text: subtitles, captions, chapters, and textual audio annotations.
  • It extends the video controls by a menu button for the time-aligned text tracks. This enables the user to switch between different languages for the different tracks.
  • The textual audio annotations are mapped into an aria-live activated div element, so that they are indeed read out by screen readers; this div sits behind the video, invisible to everyone else (see the sketch after this list).
  • The chapters are displayed as text on top of the video.
  • The subtitles and captions are displayed as overlays at the bottom of the video.
  • The display styles and positions are meant as default rendering mechanisms for these kinds of tracks; a Web developer who wants to place the text elsewhere on screen can override them in a stylesheet.
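
As a rough illustration of the aria-live wiring mentioned above, here is a minimal sketch only, not the actual demo code; the element id and function name are made up for this example:

<!-- hidden live region that sits behind the video -->
<div id="annotations" aria-live="polite"
     style="position: absolute; z-index: -1;"></div>

<script type="text/javascript">
// write the currently active textual audio annotation into the live
// region so that screen readers announce it
function showAnnotation(text) {
  document.getElementById("annotations").textContent = text;
}
</script>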

In order to “hear” the textual audio annotations work, you will need to install a screen reader such as JAWS or NVDA, or the Fire Vox plugin on the Mac.

As far as I am aware, this is the first demo of HTML5 video accessibility that includes support for the vision-impaired and hearing-impaired, as well as for foreign-language speakers.

There have been initial discussions about this proposal, the results of which are captured in the wiki page. I expect a lot more heated discussion will happen on the WHATWG mailing list when I post it there soon. I am well aware that most of the javascript API, and some of the HTML, will probably need to change.

Also please note that there are still some bugs left in the software; they should not inhibit the discussion at this stage. We will definitely develop a newer and better version.

I am particularly proud that I was able to make this work in the experimental builds of Opera and Chrome, as well as in Safari with XiphQT installed, and of course in Firefox 3.5.

Screenshot of first itext video player experiment

More video accessibility work

It’s already old news, but I am really excited about having started a new part-time contract with Mozilla to continue pushing the HTML5 video and audio elements towards accessibility.

My aim is two-fold: firstly to improve the HTML5 audio and video tags with textual representations, and secondly to hook up the Ogg file format with these accessibility features through an Ogg-internal text codec.

The textual representation that I am after is closely based on the itext elements I have been proposing for a while. They are meant to be a simple way to associate external subtitle/caption files with the HTML5 video and audio tags. I am initially looking at the srt and DFXP formats, because I think they mark the two extremes of the spectrum of time-aligned text formats, from simple to complex. I am preparing a specification and javascript demonstration of the itext feature and will then be looking for constructive criticism from accessibility, captioning, Web, and video experts, and anyone else who cares to provide input. My hope is to move the caption discussion forward on the WHATWG list and ultimately achieve a cross-browser standard means for associating time-aligned text with media streams.
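
To give a flavour of the idea, here is a hypothetical markup sketch; the attribute names, category values, and MIME types shown here are merely illustrative of the concept and not the final syntax, which is exactly what the upcoming specification discussion is meant to settle:

<video src="video.ogv" controls>
  <itext lang="en" type="text/srt" category="SUB" src="subtitles_en.srt"></itext>
  <itext lang="fr" type="text/srt" category="SUB" src="subtitles_fr.srt"></itext>
  <itext lang="en" type="application/ttaf+xml" category="CC" src="captions_en.dfxp"></itext>
</video>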

The Ogg-internal solution for subtitles – and more generally for time-aligned text – is then a logical next step towards solving accessibility. From the many discussions I have had on the topic of how best to associate subtitles with video I have learnt that there is a need for both: external text files with subtitles, as well as subtitles that are multiplexed with the media into a single binary file. Here, I am particularly looking at the Kate codec as a means of multiplexing srt and DFXP into Ogg.

Eventually, the idea is to have a seamless interface in the Web Browser for dealing with subtitles, captions, karaoke, timed metadata, and similar time-aligned text. The user interaction should be identical no matter whether the text comes from within a binary media file or from a secondary Web resource. Once this seamless interface exists, hooking up accessibility tools such as screen readers or braille devices to the data should in theory be simple.

Javascript libraries for <video> support

Now that Firefox 3.5 is released with native HTML5 <video> tag support, it seems that there is a new javascript library every day that provides fallback mechanisms for older browsers or those that do not support Ogg Theora.

This blog post collects the javascript libraries that I have found thus far, grouped by purpose, so you can pick the one most appropriate for you. Be aware that the list will probably already be outdated by the time I post this article, so if you could help me keep it up to date through the comments, that would be great. 🙂

Before I dig into the libraries, let me explain how fallback works with <video>.

Generally, if you’re using the HTML5 <video> element, your fallback mechanism for browsers that do not support <video> is the HTML code you write inside the <video> element. A browser that supports the <video> element will ignore this content, while all other browsers will display it:


<video src="video.ogv" controls>
Your browser does not support the HTML5 video element.
</video>

To do more than just display text, you can provide a video fallback. There are basically two options. The first is to fall back to a Flash solution:


<video src="video.ogv" controls>
<object width="320" height="240">
<param name="movie" value="video.swf">
<embed src="video.swf" width="320" height="240">
</embed>
</object>
</video>

or, if you are using Ogg Theora and don’t want to create a video in a different format, you can fall back to the Java player called Cortado:


<video src="video.ogv" controls width="320" height="240">
<applet code="com.fluendo.player.Cortado.class" archive="http://theora.org/cortado.jar" width="320" height="240">
<param name="url" value="video.ogv"/>
</applet>
</video>

Now, even if your browser supports the <video> element, it may not be able to play the video format of your choice. For example, Firefox and Opera only support Ogg Theora, while Safari/Webkit supports MPEG-4 and other codecs that the QuickTime framework supports, and Chrome supports both Ogg Theora and MPEG-4. For this situation, the <video> element has a built-in selection mechanism: you do not put a “src” attribute onto the <video> element, but rather include <source> elements inside <video>, which the browser will try one after the other until it finds one it can play:


<video controls width="320" height="240">
<source src="video.ogv" type="video/ogg" />
<source src="video.mp4" type="video/mp4" />
</video>

You can of course combine all the methods above to optimise the experience for your users, which is what has been done in this and this (Video For Everybody) example without the use of javascript. I actually like these approaches best and you may want to check them out before you consider using a javascript library.
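
For illustration, here is a sketch of such a combination: the <source> selection from above nested with the Cortado applet as the last resort. The file names are placeholders, and full solutions like Video For Everybody add further fallbacks such as Flash and poster images:

<video controls width="320" height="240">
  <source src="video.ogv" type="video/ogg" />
  <source src="video.mp4" type="video/mp4" />
  <!-- browsers without <video> support fall through to the applet -->
  <applet code="com.fluendo.player.Cortado.class" archive="http://theora.org/cortado.jar" width="320" height="240">
    <param name="url" value="video.ogv"/>
    Your browser supports neither the HTML5 video element nor Java applets.
  </applet>
</video>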

But now, let’s look at the promised list of javascript libraries.

Firstly, let’s look at some libraries that let you support more than just one codec format. These allow you to provide video in the format most preferred by the given browser/media framework/OS combination. Note that you will need to encode and provide your videos in multiple formats for these to work.

  • mv_embed: this is probably the library that has been around the longest to provide <video> fallback mechanisms. It has evolved heaps over the last few years and now supports Ogg Theora and Flash fallbacks.
  • several posts that demonstrate how to play flv files in a <video> tag.
  • html5flash: on top of Ogg Theora and MPEG-4 codec support, this also provides Flash support in the HTML5 video element through a chromeless Flash video player. It also exposes the <video> element’s javascript API to Flash content.
  • foxyvideo: provides a fallback flash player and a JavaScript library for HTML5 video controls that also includes a nearly identical ActionScript implementation.

Finally, let’s look at some libraries that focus solely on Ogg Theora support in browsers:

  • Celt’s javascript: a minimal javascript that checks for native Ogg Theora <video> support and the VLC plugin, and falls back to Cortado if nothing else works.
  • stealthisfilm’s javascript: checks for native support, VLC, liboggplay, Totem, any other Ogg Theora player, and Cortado as a fallback.
  • Wikimedia’s javascript: checks for QuickTime, VLC, native, Totem, KMPlayer, Kaffeine and Mplayer support before falling back to Cortado support.
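
To give an idea of the kind of check these libraries start with, here is a minimal sketch of native Ogg Theora detection using the <video> element’s canPlayType() function; the real libraries additionally probe for the various plugins listed above:

<script type="text/javascript">
// returns true if the browser can natively decode Ogg Theora/Vorbis
function hasNativeTheoraSupport() {
  var v = document.createElement("video");
  // canPlayType() answers "", "maybe" or "probably"
  return !!(v.canPlayType &&
            v.canPlayType('video/ogg; codecs="theora, vorbis"') !== "");
}

if (!hasNativeTheoraSupport()) {
  // hand over to a plugin or the Cortado Java applet here
}
</script>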

Open Video Conference Working Group: HTML5 and <video>

At the recent Open Video Conference, I was asked to chair a working group on HTML5 and the <video> tag. Since the conference had attracted a large number of open media software developers as well as HTML5 <video> tag developers, it was a great group of people on the panel with me: Philip Jagenstedt from Opera, Jan Gerber from Xiph, Viktor Gal from Annodex, Michael Dale from Metavid, and Eric Carlson from Apple. This meant we had three browser vendors and their <video> tag developers present, as well as two javascript library developers representing some of the largest content sites that are already using Ogg Theora/Vorbis with the <video> tag, plus myself looking into accessibility for <video>.

The biggest topic around the <video> tag is of course the question of baseline codec: which codec can and should become the required codec for anyone implementing <video> tag support. Fortunately, this discussion was held during the panel just ahead of ours. Thus, our panel was able to focus on the achievements of the HTML5 video tag and implementations of it, as well as the challenges still ahead.

Unfortunately, the panel was cut short at the conference to only 30 minutes, so we ended up mostly doing demos of HTML5 video working in different browsers and doing cool things such as working with SVG.

The challenges that we identified and that are still ahead to solve are:

  • annotation support: closed captions, subtitles, time-aligned metadata, and their DOM exposure
  • track selection: how to select between alternate audio tracks, alternate annotation tracks, based on e.g. language, or accessibility requirements; what would the content negotiation protocol look like
  • how to support live streaming
  • how to support in-browser a/v capture
  • how to support live video communication (skype-style)
  • how to support video playlists
  • how to support basic video editing functionality
  • what would a decent media server for HTML5 video look like; what capabilities would it have

Here are the slides we made for the working group.

Download PDF: Open Video Conference: HTML5 and video Panel

Video: Video of the session at archive.org