ginger's thoughts

Silvia's blog

HTML5 Video element discussions at TPAC meetings

Posted in Digital Media, open codecs, standards, video accessibility by silvia on the November 16th, 2009

Last week’s TPAC (2009 W3C Technical Plenary / Advisory Committee) meetings were my second time at a TPAC, and I found myself becoming highly involved with the progress on accessibility for the HTML5 video element. Two meetings were of particularly high relevance: the Video Accessibility workshop and Friday’s HTML5 breakout group on the video element.

HTML5 Video Accessibility Workshop

The week started on Sunday with the “HTML5 Video Accessibility workshop” at Stanford University, organised by John Foliot and Dave Singer. They brought together a substantial number of people all representing a variety of interest groups. Everyone got their chance to present their viewpoint – check out the minutes of the meeting for a complete transcript.

The people present and their discussion topics were as follows:

Accessibility Experts

  • Janina Sajka, chair of WAI Protocols and Formats: represented the vision-impaired community and expressed requirements for a deeply controllable access interface to audio-visual content, preferably in a structured manner similar to DAISY.
  • Sally Cain, RNIB, Member of W3C PF group: expressed a deep need for audio descriptions, which are often overlooked besides captions.
  • Ken Harrenstien, Google: has worked on captioning support for video.google and YouTube and shared his experiences, e.g. http://www.youtube.com/watch?v=QRS8MkLhQmM, and automated translation.
  • Victor Tsaran, Yahoo! Accessibility Manager: joined for a short time out of interest.

Practitioners

  • John Foliot, professor at Stanford Uni: showed a captioning service that he set up at Stanford University to enable lecturers to publish more accessible video – it uses humans for transcription, but automated tools to time-align, and provides a Web interface to the staff.
  • Matt May, Adobe: shared what Adobe learnt about accessibility in Flash – in particular that an instream-only approach to captions was a naive approach and that external captions are much more flexible, extensible, and can fit into current workflows.
  • Frank Olivier, Microsoft: attended to listen and learn.

Technologists

  • Pierre-Antoine Champin from Liris (France), who was not able to attend, sent a video about their research work on media accessibility using automatic and manual annotation.
  • Hironobu Takagi, IBM Labs Tokyo, general chair for W4A: demonstrated a text-based audio description system combined with a high-quality, almost human-sounding speech synthesizer.
  • Dick Bulterman, Researcher at CWI in Amsterdam, co-chair of SYMM (group at W3C doing SMIL): reported on 14 years of experience with multimedia presentations and SMIL (slides) and the need to make temporal and spatial synchronisation explicit to be able to do the complex things.
  • Joakim Söderberg, researcher at Ericsson, co-chair of media annotation group at W3C: reported on W3C media annotations group work and wanted to find out whether there are a11y related attributes missing.
  • Felix Sasaki, University of Applied Sciences in Potsdam, W3C media annotations group member: Teaching metadata.
  • Eric Carlson, Apple, Engineering HTML5 media elements in Webkit: there to watch and make sure the specification is implementable.
  • James Craig, Accessibility Software Quality Engineer, Apple: interested in universal design and in solving content selection.
  • Myself, as Mozilla’s video accessibility contractor: I discussed what requirements I had collected before going ahead and doing implementations (slides) – also showed my demos.

Standards Experts

  • Dave Singer, Apple, head of multimedia standards: interested in building up a framework that gets better accessibility over time, and presented on how media queries could be used to improve source selection (slides).
  • Michael Cooper, W3C WAI staff: needs to make sure W3C technology has accessibility baked in and existing solutions are re-used.
  • Marisa DeMeglio, DAISY Consortium developer: reported on Digital Talking Book standards, the challenges with video in DAISY 4, and understanding the possibilities with HTML5.
  • Ian Hickson, Google, HTML5 editor: gave an update on the state of accessibility in HTML5.
  • Chris Lilley, W3C Hypertext CG co-chair, CSS and SVG group, etc.: wanted to make sure that whatever solution is created for HTML will be usable in SVG, and was also keenly interested in solving the internationalisation question.
  • Philippe Le Hegaret, W3C representative for HTML, timed text, and video working groups: reported on work in Timed Text working group, including a demo of DFXP use in HTML5.
  • Charles McCathieNeville, Opera, in charge of standards: interested in i18n and use of accessibility methods across different technologies without reinventing wheels.
  • Geoff Freed, NCAM, joined online: keen to solve captions and audio description in HTML5 in a declarative way.
  • Judy Brewer, head of WAI at W3C: interested in how the options for accessible media affect the user, especially when there are multiple options. Keen to create a proper requirements document.
  • Doug Schepers, Team Contact for the SVG and WebApps Working Groups: video handling in HTML5 and SVG should be similar. Keen to re-use technology.

The workshop helped clarify some of the requirements and potential solutions. No concrete specification progress was made, but that was not the intention of the workshop. Rather, it started getting people talking and involved with the newly created HTML5 Accessibility Task Force. In the end, the creation of a requirements document for media accessibility was taken on board as one of the first tasks for the HTML5 Accessibility Task Force, which, I believe, is still looking for an editor for it.

TPAC HTML5 Video Breakout Group

The HTML Working Group at TPAC was run differently to the previous year: instead of having all topics discussed in the full group, breakout groups were organised in two parallel rooms, each focused on particular topics that still need resolution in HTML5. I proposed a breakout group on Video Accessibility and somebody else wanted to get an update on the baseline codec discussion on the HTML5 video element, so we turned it into a single Video breakout group.

I put together a list of the core issues that we had come across on Sunday and turned it into an agenda. Here are some short notes on each topic that was discussed. A full transcript is available in the minutes.

  1. Baseline codecs

    The brief discussion about baseline codecs just restated the impasse that we are currently at: there is no codec that would satisfy all the stated requirements for a baseline codec. There is work being done behind closed doors to move towards agreeable royalty-free codecs, but there is no progress to report to date.

  2. Cue ranges

    It has become apparent that there is a need to bring back the functionality of cue ranges, in particular for things such as activating slide changes, pause and display ads, and captions for live video. Rather than callbacks, this time a declarative and event-based solution is envisaged. It will possibly get introduced as part of the solution for captions.

  3. Media Element Accessibility

    The discussion here focused on the means by which to provide caption/subtitle support and briefly touched upon audio descriptions. I collected the following requirements:

    • while we prefer textual sources for captions/subtitles, and burnt-in captions or bitmap overlays (the DVD format) are not ideal, it should still be possible to display them; since most bitmaps are in gif, png or bmp format, that should not be a major issue
    • we need a declarative syntax for captions/subtitles, both for in-band and external files
    • if there are “conflicting” captions/subtitles available from in-band and external files, the in-band one would probably be displayed by default, but all available tracks should be user selectable
    • we need a default presentation, but also a JavaScript API to allow custom display
    • to deal with cross-site scripting issues for external files, CORS is probably a good solution
    • baseline codecs for captions/subtitles should be DFXP, srt and probably smpteTText – a new DFXP based SMPTE standard for timed text; it was further suggested to regard srt simply as a trivial subpart of DFXP functionality
    • the default presentation has to take into account authoring requirements, user preferences, and allow for interactive override – it seems we need to define new user preferences for video a11y aside from simply the browser language settings
    • the idea of using text and ARIA live regions to display audio descriptions was welcomed as a useful modern means to provide a11y to vision-impaired – it provides choice between screen reader and braille use and also improves searchability and has further advantages of being automatically processable
    • however, there continues to be a need to make human-created audio descriptions available, in particular for high-quality recordings (Shakespeare was used as an example)
    • further, the HTML5 video element has not yet clarified how multiple encodings, e.g. for different devices/bitrates, should be made available – this also ties into making sign language video tracks in different sign languages available
    • content selection could be done using media queries – needs further experimentation
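To make the cue range idea from topic 2 a little more concrete, here is a minimal sketch of what an event-based model could look like. It is not taken from any specification or from the discussion itself: the names CueRange and cue_events are hypothetical, and I am assuming that a cue is active from its start time (inclusive) to its end time (exclusive). Given the previous and the current playback position, it reports which cues have been entered or exited in between.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CueRange:
    """Hypothetical cue range: active for start <= t < end (seconds)."""
    id: str
    start: float
    end: float


def cue_events(cues, prev_time, cur_time):
    """Return ("exit" | "enter", cue id) pairs for a move from prev_time to cur_time.

    Only the state at the two sampled times is compared, so a cue that lies
    entirely between them is not reported. That is a simplification of
    this sketch.
    """
    events = []
    for cue in cues:
        was_active = cue.start <= prev_time < cue.end
        is_active = cue.start <= cur_time < cue.end
        if was_active and not is_active:
            events.append(("exit", cue.id))
        elif is_active and not was_active:
            events.append(("enter", cue.id))
    # Stable sort: report exits before enters, so that an old slide or
    # caption is cleared before the new one is shown.
    events.sort(key=lambda event: event[0] != "exit")
    return events


# Two consecutive slide cues; playback moves across the boundary at 10 s.
slides = [CueRange("slide-1", 0.0, 10.0), CueRange("slide-2", 10.0, 20.0)]
print(cue_events(slides, 9.5, 10.2))
```

A page could attach handlers to such enter/exit events to change slides, pause for an ad, or display a caption, instead of registering one callback per range as in the earlier cue range design.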

Unfortunately, there was not enough time to address the remaining topics on the agenda: the need for hierarchical navigation through audio/video elements, and the handling of multi-track video. I am sure we will get back to them in due time.

Well, the meetings have certainly widened my understanding of the issues that we are currently dealing with around the audio and video elements – in particular I believe that once we solve how to deal with multiple alternative representations of the original media file, we will also solve many of the accessibility issues. It may, however, happen that we create the accessibility solutions first and thus also solve the issues of alternate representations. I have a lot to think about.

7 Responses to 'HTML5 Video element discussions at TPAC meetings'



  1. on November 16th, 2009 at 10:35 am

    [...] http://blog.gingertech.net/2009/11/16/html5-video-element-discussions-at-tpac-meetings/ a few seconds ago from web [...]


  2. on November 16th, 2009 at 11:01 pm

    Thanks for the writeup, Silvia!

    I’m very interested in how the baseline codec discussion plays out. Hardware support will be quick to follow any industry wide adoption. Currently, video encoding is the most time intensive process in my software, StatEasy. I’d love to have a camera record OGG video natively. At the very least, a hardware accelerator would be helpful.

    I’m very much in favor of a JavaScript/CSS method to customize display of any captions.

    If FOMS wasn’t so far away from Pittsburgh, PA I’d be sure to attend! Looking forward to your writeup of that event in 2010!

    Mike Ressler


  3. on November 18th, 2009 at 8:01 pm

    Thanks for this nice summary.

    Just to clarify a few things :
    - my name is not Champain, but Champin
    - my video presentation was about ACAV, a project that we (LIRIS) are just starting with Eurecom (a French graduate school) and Dailymotion. It is about media accessibility using automatic and manual annotation. The speaker diarization is therefore just one part of it, and is not done by us, but by Eurecom.

    No offence taken for these mistakes, of course, but I didn’t want to give the impression that my presentation was denying the importance of our other partners :)

  4. silvia said,

    on November 18th, 2009 at 8:55 pm

    Hi Pierre,

    I apologize for the misspellings – I took them from the transcript and didn’t know better. Also I apologize for mis-representing the topic of your presentation – will correct that, too.


  5. on November 19th, 2009 at 9:30 am

    [...] Silvia Pfeiffer who has blogged on the HTML5 Video element discussions at TPAC meetings [...]


  6. on November 20th, 2009 at 8:56 pm

    Thanks for the good writeup Silvia!

    Just to complement, I understand from the minutes that follow-up work of this workshop will happen within the HTML Accessibility Task Force Work http://www.w3.org/WAI/PF/html-task-force. There is a public archived mailing list people can subscribe to: http://lists.w3.org/Archives/Public/public-html-a11y/

  7. silvia said,

    on November 20th, 2009 at 10:29 pm

    Thanks, Raphael – also, the HTML Accessibility Task Force wiki is at http://www.w3.org/WAI/PF/HTML/wiki/Main_Page .
