Category Archives: code

Use deck.js as a remote presentation tool

January 8, 2014 | Categories: code, Digital Media, html5, LCA, Open Source, WebRTC | Tags: HTML5, presentation, rtc.io, video, webrtc, websockets | Author: silvia

deck.js is one of the new HTML5-based presentation tools. It’s simple to use, in particular for your basic, every-day presentation needs. You can also create more complex slides with animations etc. if you know your HTML and CSS.

Yesterday at linux.conf.au (LCA), I gave a presentation using deck.js. But I didn’t give it from the lectern in the room in Perth where LCA is being held – instead I gave it from the comfort of my home office at the other end of the country.

I used my laptop with in-built webcam and my Chrome browser to give this presentation. Beforehand, I had uploaded the presentation to a Web server and shared the link with the organiser of my speaker track, who was on site in Perth and had set up his laptop in the same fashion as myself. His screen was projecting the Chrome tab in which my slides were loaded and he had hooked up the audio output of his laptop to the room speaker system. His camera was pointed at the audience so I could see their reaction.

I loaded a slide master URL:
http://html5videoguide.net/presentations/lca_2014_webrtc/?master
and the room loaded the URL without query string:
http://html5videoguide.net/presentations/lca_2014_webrtc/.

Then I gave my talk exactly as I would if I were in the same room. Yes, it felt exactly as though I was there, including nervousness and audience feedback.

How did we do that? WebRTC (Web Real-time Communication) to the rescue, of course!

We used one of the modules of the rtc.io project called rtc-glue to add the video conferencing functionality and the slide navigation to deck.js. It was actually really really simple!

Here are the few things we added to deck.js to make it work:

Code added to index.html to make the video connection work:

<meta name="rtc-signalhost" content="http://rtc.io/switchboard/">
<meta name="rtc-room" content="lca2014">
...
<video id="localV" rtc-capture="camera" muted></video>
<video id="peerV" rtc-peer rtc-stream="localV"></video>
...
<script src="glue.js"></script>
<script>
glue.config.iceServers = [{ url: 'stun:stun.l.google.com:19302' }];
</script>

The iceServers config is required to punch through firewalls – you may also need a TURN server. Note that you need a signalling server – in our case we used http://rtc.io/switchboard/, which runs the code from rtc-switchboard.

Added glue.js library to deck.js:
Downloaded from https://raw.github.com/rtc-io/rtc-glue/master/dist/glue.js into the source directory of deck.js.

Code added to index.html to synchronize slide navigation:

glue.events.once('connected', function(signaller) {
  if (location.search.slice(1) !== '') {
    $(document).bind('deck.change', function(evt, from, to) {
      signaller.send('/slide', {
        idx: to,
        sender: signaller.id
      });
    });
  }
  signaller.on('slide', function(data) {
    console.log('received notification to change to slide: ', data.idx);
    $.deck('go', data.idx);
  });
});

This simply registers a callback on the slide master end to send a slide position message to the room end, and a callback on the room end that initiates the slide navigation.
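To see the message flow without a browser, the same logic can be exercised against an in-memory stand-in for the signaller. This is only a sketch: createHub, createDeck and wireDeck are hypothetical helpers, not part of rtc-glue or deck.js, and the hub assumes (as the snippet above suggests) that a message sent as '/slide' arrives as a 'slide' event on the other peers.

```javascript
// Hypothetical in-memory replacement for the signalling server.
function createHub() {
  const peers = [];
  return {
    join(id) {
      const handlers = {};
      const peer = {
        id: id,
        on(name, fn) { handlers[name] = fn; },
        // Assumption: send('/slide', data) shows up as a 'slide' event elsewhere.
        send(command, data) {
          const name = command.replace(/^\//, '');
          peers.forEach(function (other) {
            if (other !== peer) other.deliver(name, data);
          });
        },
        deliver(name, data) {
          if (handlers[name]) handlers[name](data);
        }
      };
      peers.push(peer);
      return peer;
    }
  };
}

// Hypothetical stand-in for deck.js: tracks the current slide, reports changes.
function createDeck() {
  let listener = null;
  const deck = {
    current: 0,
    onChange(fn) { listener = fn; },
    go(idx) {
      const from = deck.current;
      deck.current = idx;
      if (listener) listener(from, idx);
    }
  };
  return deck;
}

// Same structure as the snippet above: only the master sends, everyone listens.
function wireDeck(signaller, search, deck) {
  if (search.slice(1) !== '') {
    deck.onChange(function (from, to) {
      signaller.send('/slide', { idx: to, sender: signaller.id });
    });
  }
  signaller.on('slide', function (data) {
    deck.go(data.idx);
  });
}

const hub = createHub();
const masterDeck = createDeck();
const roomDeck = createDeck();
wireDeck(hub.join('master'), '?master', masterDeck);
wireDeck(hub.join('room'), '', roomDeck);

masterDeck.go(3);
console.log(roomDeck.current); // 3
```

Because the room end never registers a change listener, navigating there does not echo back to the master, which is why the original code needs no loop protection.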

And that’s it!

You can find my slide deck on GitHub.

Feel free to write your own slides in this manner – I would love to have more users of this approach. It should also be fairly simple to extend this to share pointer positions, so you can actually use the mouse pointer to point to things on your slides remotely. Would love to hear your experiences!
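As a starting point for the pointer idea, here is a rough sketch of the coordinate handling only. It assumes the two ends show the slides at different sizes, so positions are exchanged as fractions of the slide area rather than pixels. The function names are made up for illustration, and actually sending the result (for example with a hypothetical '/pointer' message in the style of the '/slide' message above) is left out.

```javascript
// Convert a pointer position in pixels to fractions of the slide area.
function toRelative(x, y, width, height) {
  return { rx: x / width, ry: y / height };
}

// Convert fractions back to pixels on a screen of a different size.
function toAbsolute(rel, width, height) {
  return { x: Math.round(rel.rx * width), y: Math.round(rel.ry * height) };
}

const sentPosition = toRelative(640, 360, 1280, 720);     // master: 1280x720 window
const shownPosition = toAbsolute(sentPosition, 1920, 1080); // room: 1920x1080 projector
console.log(shownPosition); // { x: 960, y: 540 }
```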

Note that the slides are actually a talk about the rtc.io project, so if you want to find out more about these modules and what other things you can do, read the slide deck or watch the talk when it has been published by LCA.

Many thanks to Damon Oehlman for his help in getting this working.

BTW: somebody should really fix that print style sheet for deck.js – I’m only ever getting the one slide that is currently showing. 😉

Video Conferencing in HTML5: WebRTC via Socket.io

February 6, 2013 | Categories: code, Digital Media, open codecs, Open Source, standards, Videos, WebRTC | Tags: html5 media, lca2013, linux.conf.au, webrtc | Author: silvia

Six months ago I experimented with Web sockets for WebRTC and the early implementations of PeerConnection in Chrome. Last week I gave a presentation about WebRTC at Linux.conf.au, so it was time to update that codebase.

I decided to use socket.io for the signalling following the idea of Luc, which made the server code even smaller and reduced it to a mere reflector:

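// Minimal signalling server: relay every message to all other connected clients.
var app = require('http').createServer().listen(1337);
var io = require('socket.io').listen(app);

io.sockets.on('connection', function(socket) {
  socket.on('message', function(message) {
    // broadcast.emit sends to everyone except the sender
    socket.broadcast.emit('message', message);
  });
});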

Then I turned to the client code. I was surprised to see the massive changes that PeerConnection has gone through. Check out my slide deck to see the different components that are now necessary to create a PeerConnection.

I was particularly surprised to see the SDP object now fully exposed to JavaScript and thus the ability to manipulate it directly rather than through some API. This allows Web developers to manipulate the type of session that they are asking the browsers to set up. I can imagine, for example, that if they have support for a video codec in JavaScript that the browser does not provide built-in, they can add that codec to the set of choices to be offered to the peer. While this is flexible, I am concerned that it might create more problems than it solves. I guess we’ll have to wait and see.

I was also surprised by the need to use ICE, even though in my experiment I got away with an empty list of ICE servers – the ICE messages just got exchanged through the socket.io server. I am not sure whether this is a bug, but I was very happy about it because it meant I could run the whole demo on a completely separate network from the Internet.
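To make the exchange concrete, here is a rough sketch of a client-side dispatcher for messages arriving through such a reflector. It is not the code from my demo: it uses the promise-based RTCPeerConnection methods found in current browsers, and the message shapes (a type field plus sdp or candidate) are an assumption made for this example.

```javascript
// Assumed message shapes (not taken from the original demo):
//   { type: 'offer' | 'answer', sdp: '...' }  or  { type: 'candidate', candidate: {...} }
// pc is an RTCPeerConnection; send passes a reply to the signalling channel.
async function handleSignal(pc, message, send) {
  if (message.type === 'offer') {
    // The remote side wants to start a session: accept it and answer.
    await pc.setRemoteDescription({ type: 'offer', sdp: message.sdp });
    const answer = await pc.createAnswer();
    await pc.setLocalDescription(answer);
    send({ type: 'answer', sdp: answer.sdp });
  } else if (message.type === 'answer') {
    // Our own offer was accepted.
    await pc.setRemoteDescription({ type: 'answer', sdp: message.sdp });
  } else if (message.type === 'candidate') {
    // ICE candidates travel over the same channel as the SDP.
    await pc.addIceCandidate(message.candidate);
  }
}
```

Since the reflector forwards everything unchanged, all of the session logic lives in the clients, which is why the server can stay so small.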

The most exciting news since my talk is that Mozilla and Google have managed to get a PeerConnection working between Firefox and Chrome – this is the first cross-browser video conference call without a plugin! The code differences are minor.

Since the specifications of the WebRTC API and of the MediaStream API are now official Working Drafts at the W3C, I expect other browsers will follow. I am also looking forward to the possibilities of:

multi-peer video conferencing like the efforts around webrtc.io,
the media stream recording API,
and the peer-to-peer data API.

The best places to learn about the latest possibilities of WebRTC are webrtc.org and the W3C WebRTC WG. code.google.com has open source code that continues to be updated to the latest released and interoperable features in browsers.

The video of my talk is in the process of being published. There is an MP4 version on the Linux Australia mirror server, but I expect it will be published properly soon. I will update the blog post when that happens.

Video Conferencing in HTML5: WebRTC via Web Sockets

June 4, 2012 | Categories: code, Digital Media, open codecs, Open Source, standards, Videos | Tags: HTML5, open media software, video | Author: silvia

A bit over a week ago I gave a presentation at Web Directions Code 2012 in Melbourne. Maxine and John asked me to speak about something related to HTML5 video, so I went for the new shiny: WebRTC – real-time communication in the browser.

Presentation slides

I only had 20 min, so I had to make it tight. I wanted to show off video conferencing without special plugins in Google Chrome in just a few lines of code, as is the promise of WebRTC. To a large extent, I achieved this. But I made some interesting discoveries along the way. Demos are in the slide deck.

UPDATE: Opera 12 has been released with WebRTC support.

Housekeeping: if you want to replicate what I have done, you need to install a Google Chrome Web Browser 19+. Then make sure you go to chrome://flags and activate the MediaStream and PeerConnection experiment(s). Restart your browser and now you can experiment with this feature. Big warning up-front: it’s not production-ready, since there are still changes happening to the spec and there is no compatible implementation by another browser yet.

Here is a brief summary of the steps involved to set up video conferencing in your browser:

Set up a video element each for the local and the remote video stream.
Grab the local camera and stream it to the first video element.
(*) Establish a connection to another person running the same Web page.
Send the local camera stream on that peer connection.
Accept the remote camera stream into the second video element.
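The steps above can be sketched roughly as follows. Note that this is not the code from the slide deck: it uses the unprefixed, promise-based API names of current browsers rather than the experimental 2012 ones, the signalling of step 3 is left out, and the env object with its property names is an invention of this example so that the browser objects can be swapped for fakes.

```javascript
// env bundles the browser objects; step 1 (the two video elements) is
// represented by env.localVideo and env.remoteVideo.
async function startCall(env) {
  // Step 2: grab the local camera and show it in the first video element.
  const stream = await env.mediaDevices.getUserMedia({ video: true, audio: true });
  env.localVideo.srcObject = stream;

  // Step 3: create the connection (the signalling handshake is not shown here).
  const pc = env.createPeerConnection();

  // Step 4: send the local camera stream on the peer connection.
  stream.getTracks().forEach(function (track) {
    pc.addTrack(track, stream);
  });

  // Step 5: show the remote stream in the second video element once it arrives.
  pc.ontrack = function (event) {
    env.remoteVideo.srcObject = event.streams[0];
  };
  return pc;
}
```

In a page, env would hold navigator.mediaDevices, the two video elements, and a function returning a new RTCPeerConnection.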

Now, the most difficult part of all of this – believe it or not – is the signalling part that is required to build the peer connection (marked with (*)). Initially I wanted to run completely without a server and just enter the remote’s IP address to establish the connection. This is, however, not a functionality that the PeerConnection object provides [might this be something to add to the spec?].

So, you need a server known to both parties that can provide for the handshake to set up the connection. All the examples that I have seen, such as https://apprtc.appspot.com/, use a channel management server on Google’s appengine. I wanted it all working with HTML5 technology, so I decided to use a Web Socket server instead.

I implemented my Web Socket server using node.js (code of websocket server). The video conferencing demo is in the slide deck in an iframe – you can also use the stand-alone html page. Works like a treat.

While it is still using Google’s STUN server to get through NAT, the messaging for setting up the connection is running completely through the Web Socket server. The messages that get exchanged are plain SDP message packets with a session ID. There are OFFER, ANSWER, and OK packets exchanged for each streaming direction. You can see some of it in the below image:

WebRTC demo

I’m not running a public WebSocket server, so you won’t be able to see this part of the presentation working. But the local loopback video should work.

At the conference, it all went without a hitch (while the wireless played along). I believe you have to host the WebSocket server on the same machine as the Web page, otherwise it won’t work for security reasons.

A whole new world of opportunities lies out there when we get the ability to set up video conferencing on every Web page – scary and exciting at the same time!

A systematic approach to making Web Applications accessible

February 14, 2012 | Categories: accessibility, code, standards | Tags: accessibility, aria, W3C, web applications | Author: silvia

With the latest developments in HTML5 and the still fairly new ARIA (Accessible Rich Internet Applications) attributes introduced by the W3C WAI (Web Accessibility Initiative), browsers have now implemented many features that allow you to make your JavaScript-heavy Web applications accessible.

Since I began working on making a complex web application accessible just over a year ago, I discovered that there was no step-by-step guide to approaching the changes necessary for creating an accessible Web application. Therefore, many people believe that it is still hard, if not impossible, to make Web applications accessible. In fact, it can be approached systematically, as this article will describe.

This post is based on a talk that Alice Boxhall and I gave at the recent Linux.conf.au titled “Developing accessible Web apps – how hard can it be?” (slides, video), which in turn was based on a Google Developer Day talk by Rachel Shearer (slides).

These talks, and this article, introduce a process that you can follow to make your Web applications accessible: each step will take you closer to having an application that can be accessed using a keyboard alone, and by users of screenreaders and other assistive technology (AT).

The recommendations here only roughly conform to the requirements of WCAG (Web Content Accessibility Guidelines), which is the basis of legal accessibility requirements in many jurisdictions. The steps in this article may or may not be sufficient to meet a legal requirement. The article focuses on the practical outcome of ensuring users with disabilities can use your Web application.

Step-by-step Approach

The steps to follow to make your Web apps accessible are as follows:

Use native HTML tags wherever possible
Make interactive elements keyboard accessible
Provide extra markup for AT (assistive technology)

If you are a total newcomer to accessibility, I highly recommend installing a screenreader and just trying to read/navigate some Web pages. On Windows you can install the free NVDA screenreader, on Mac you can activate the pre-installed VoiceOver screenreader, on Linux you can use Orca, and if you just want a browser plugin for Chrome try installing ChromeVox.

1. Use native HTML tags

As you implement your Web application with interactive controls, try to use as many native HTML tags as possible.

HTML5 provides a rich set of elements which can be used to both add functionality and provide semantic context to your page. HTML4 already included many useful interactive controls, like <a>, <button>, <input> and <select>, and semantic landmark elements like <h1>. HTML5 adds richer <input> controls, and a more sophisticated set of semantic markup elements such as <time>, <progress>, <meter>, <nav>, <header>, <article> and <aside>. (Note: check browser support for the new tags.)

Using as much of the rich HTML5 markup as possible means that you get all of the accessibility features which have been implemented in the browser for those elements, such as keyboard support, short-cut keys and accessibility metadata, for free. For generic tags you have to implement them completely from scratch.

What exactly do you miss out on when you use a generic tag such as <div> over a specific semantic one such as <button>?

Generic tags are not focusable. That means you cannot reach them using the [TAB] key on the keyboard.
You cannot activate them with the space bar or enter key or perform any other keyboard interaction that would be regarded as typical with such a control.
Since the role that the control represents is not specified in code but is only exposed through your custom visual styling, screenreaders cannot express to their users what type of control it is, e.g. button or link.
Neither can screenreaders add the control to the list of controls on the page that are of a certain type, e.g. to navigate to all headers of a certain level on the page.
And finally you need to manually style the element in order for it to look distinctive compared to other elements on the page; using a default control will allow the browser to provide the default style for the platform, which you can still override using CSS if you want.

Example:

Compare these two buttons. The first one is implemented using a <div> tag, the second one using a <button> tag. Try using a screenreader to experience the difference.

<style>
 .custombutton {
  cursor: pointer;
  border: 1px solid #000;
  background-color: #F6F6F6;
  display: inline-block;
  padding: 2px 5px;
}
</style>
<div class="custombutton" onclick="alert('sent!')">
  Send
</div>

<button onclick="alert('sent!')">
Send
</button>

2. Make interactive elements keyboard accessible

Many sophisticated web applications have some interactive controls that just have no appropriate HTML tag equivalent. In this case, you will have had to build an interactive element with JavaScript and <div> and/or <span> tags and lots of custom styling. The good news is, it’s possible to make even these custom controls accessible, and as a side benefit you will also make your application smoother to use for power users.

The first thing you can do to test usability of your control, or your Web app, is to unplug the mouse and try to use only the [TAB] and [ENTER] keys to interact with your application.

Try the following:

Can you reach all interactive elements with [TAB]?
Can you activate interactive elements with [ENTER] (or [SPACE])?
Are the elements in the right tab order?
After interaction: is the right element in focus?
Is there a keyboard shortcut that activates the element (accesskey)?

No? Let’s fix it.

2.1. Reaching interactive elements

If you have an element on your page that cannot be reached with [TAB], put a @tabindex attribute on it.

Example:

Here we have a <span> tag that works as a link (don’t do this – it’s just a simple example). The first one cannot be reached using [TAB] but the second one has a tabindex and is thus part of the tab order of the HTML page.

(Note: since we experiment lots with the tabindex in this article, to avoid confusion, click on some text in this paragraph and then hit the [TAB] key to see where it goes next. The click will set your keyboard focus in the DOM.)

<style>
.customlink {
  text-decoration: underline;
  cursor: pointer;
}
</style>
<span class="customlink" onclick="alert('activated!')">
Click
</span>

With @tabindex added, the link becomes reachable with [TAB]:

<style>
.customlink {
  text-decoration: underline;
  cursor: pointer;
}
</style>
<span class="customlink" onclick="alert('activated!')" tabindex="0">
Click
</span>

You set @tabindex=0 to add an element into the native tab order of the page, which is the DOM order.

2.2. Activating interactive elements

Next, you typically want to be able to use the [ENTER] and [SPACE] keys to activate your custom control. To do so, you will need to implement an onkeydown event handler. Note that the keyCode for [ENTER] is 13 and for [SPACE] is 32.

Example:

Let’s add this functionality to the <span> tag from before. Try tabbing to it and hit the [ENTER] or [SPACE] key.

<span class="customlink" onclick="alert('activated!')" tabindex="0">
Click
</span>

With the key handler added:

<span class="customlink" onclick="alert('activated!')" tabindex="0"
      onkeydown="handlekey(event);">
Click
</span>
<script>
function handlekey(event) {
  // event.srcElement covers older versions of Internet Explorer
  var target = event.target || event.srcElement;
  // 13 is [ENTER], 32 is [SPACE]
  if (event.keyCode == 13 || event.keyCode == 32) {
    target.onclick();
  }
}
</script>

Note that some controls need support for keys other than [TAB] or [ENTER] to be usable from the keyboard alone. For example, a custom list box, menu or slider should respond to arrow keys.
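As a rough sketch of what such key handling might look like, the two helpers below decide whether a key should activate a control and which item in a list should receive focus next. They use the string values of the newer event.key property instead of the numeric keyCode from the example above. The function names and the wrap-around behaviour at the ends of the list are choices made for this illustration.

```javascript
// true for the keys that conventionally activate a button-like control
function isActivationKey(event) {
  return event.key === 'Enter' || event.key === ' ';
}

// index of the item that should receive focus next in a list of `count` items
function nextIndex(key, current, count) {
  switch (key) {
    case 'ArrowDown': return (current + 1) % count;         // wrap to the top
    case 'ArrowUp': return (current - 1 + count) % count;   // wrap to the bottom
    case 'Home': return 0;
    case 'End': return count - 1;
    default: return current;                                // other keys: stay put
  }
}

console.log(isActivationKey({ key: 'Enter' }));   // true
console.log(nextIndex('ArrowDown', 2, 3));        // 0
console.log(nextIndex('ArrowUp', 0, 3));          // 2
```

In a keydown handler you would call .focus() on the item at the returned index.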

2.3. Elements in the right tab order

Have you tried tabbing to all the elements on your page that you care about? If so, check if the order of tab stops seems right. The default order is given by the order in which interactive elements appear in the DOM. For example, if your page’s code has a right column that is coded before the main article, then the links in the right column will receive tab focus first before the links in the main article.

You could change this by re-ordering your DOM, but oftentimes this is not possible. Instead, give the elements that should be the first ones to receive tab focus a positive @tabindex. Tabbing starts at the smallest positive @tabindex value and continues in ascending order. If multiple elements share the same @tabindex value, these controls receive tab focus in DOM order. After that, interactive elements and those with @tabindex=0 will receive tab focus in DOM order.

Example:

The one thing that always annoys me the most is if the tab order in forms that I am supposed to fill in is illogical. Here is an example where the first and last name are separated by the address because they are in a table. We could fix it by moving to a <div> based layout, but let’s use @tabindex to demonstrate the change.

<table class="customtabs">
  <tr>
    <td>Firstname:
      <input type="text" id="firstname">
    </td>
    <td>Address:
      <input type="text" id="address">
    </td>
  </tr>
  <tr>
    <td>Lastname: 
      <input type="text" id="lastname">
    </td>
    <td>City:
      <input type="text" id="city">
    </td>
  </tr>
</table>

With @tabindex values added, the fields receive focus in the order Firstname, Lastname, Address, City:

<table class="customtabs">
  <tr>
    <td>Firstname:
      <input type="text" id="firstname" tabindex="10">
    </td>
    <td>Address:
      <input type="text" id="address" tabindex="30">
    </td>
  </tr>
  <tr>
    <td>Lastname:
      <input type="text" id="lastname" tabindex="20">
    </td>
    <td>City:
      <input type="text" id="city" tabindex="40">
    </td>
  </tr>
</table>

Be very careful with using non-zero tabindex values. Since they change the tab order on the page, you may get side effects that you might not have intended, such as having to give other elements on the page a non-zero tabindex value to avoid skipping too many other elements as I would need to do here.
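The ordering rule from this section can be written down as a small function, which also makes the side effect visible. This is a simplified model, not how a browser implements it: it assumes every listed element is focusable, and the sidebarlink entry is a made-up element that sits before the form in the DOM.

```javascript
// elements: [{ id, tabindex }] in DOM order; all are assumed to be focusable.
function tabOrder(elements) {
  const indexed = elements.map(function (el, domIndex) {
    return { id: el.id, tabindex: el.tabindex, domIndex: domIndex };
  });
  // Positive values come first, in ascending order; ties keep DOM order.
  const positive = indexed
    .filter(function (el) { return el.tabindex > 0; })
    .sort(function (a, b) {
      return a.tabindex - b.tabindex || a.domIndex - b.domIndex;
    });
  // Then everything with tabindex 0 in DOM order; negative values are skipped.
  const natural = indexed.filter(function (el) { return el.tabindex === 0; });
  return positive.concat(natural).map(function (el) { return el.id; });
}

console.log(tabOrder([
  { id: 'sidebarlink', tabindex: 0 },
  { id: 'firstname', tabindex: 10 },
  { id: 'address', tabindex: 30 },
  { id: 'lastname', tabindex: 20 },
  { id: 'city', tabindex: 40 }
]));
// [ 'firstname', 'lastname', 'address', 'city', 'sidebarlink' ]
```

The link that used to be reached first now comes after the whole form, which is exactly the kind of unintended jump to watch out for.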

2.4. Focus on the right element

Some of the controls that you create may be rather complex and open elements on the page that were previously hidden. This is particularly the case for drop-downs, pop-ups, and menus in general. Oftentimes the hidden element is not defined in the DOM right after the interactive control, such that a [TAB] will not put your keyboard focus on the next element that you are interacting with.

The solution is to manage your keyboard focus from JavaScript using the .focus() method.

Example:

Here is a menu that is declared ahead of the menu button. If you tab onto the button and hit enter, the menu is revealed. But your tab focus is still on the menu button, so your next [TAB] will take you somewhere else. We fix it by setting the focus on the first menu item after opening the menu.

<div id="custommenu" style="display:none;">
  <button id="item1" onclick="displayMenu('none');">Menu item1</button>
  <button id="item2" onclick="displayMenu('none');">Menu item2</button>
</div>
<button onclick="displayMenu('block');">Menu</button>
<script>
function displayMenu(value) {
 document.getElementById("custommenu").style.display=value;
}
</script>

The fixed version also moves the focus onto the first menu item:

<div id="custommenu" style="display:none;">
  <button id="item1" onclick="displayMenu('none');">Menu item1</button>
  <button id="item2" onclick="displayMenu('none');">Menu item2</button>
</div>
<button onclick="displayMenu('block');">Menu</button>
<script>
function displayMenu(value) {
 document.getElementById("custommenu").style.display=value;
 document.getElementById("item1").focus();
}
</script>

You will notice that there are still some things you can improve on here. For example, after you close the menu again with one of the menu items, the focus does not move back onto the menu button.

Also, after opening the menu, you may prefer not to move the focus onto the first menu item but rather just onto the menu <div>. You can do so by giving that div a @tabindex and then calling .focus() on it. If you do not want to make the div part of the normal tabbing order, just give it a @tabindex=-1 value. This will allow your div to receive focus from script, but be exempt from accidental tabbing onto (though usually you just want to use @tabindex=0).
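One possible way to also hand the focus back to the menu button when the menu closes is sketched below. The createMenuController helper is hypothetical; it takes the elements as arguments so that it does not depend on particular ids.

```javascript
// menu, button and firstItem are DOM elements (or anything with the same shape).
function createMenuController(menu, button, firstItem) {
  return {
    open: function () {
      menu.style.display = 'block';
      firstItem.focus();          // move focus into the menu
    },
    close: function () {
      menu.style.display = 'none';
      button.focus();             // hand focus back to the control that opened it
    }
  };
}
```

In the example above you would pass the custommenu div, the menu button (which would need an id or some other way to look it up) and the item1 button, then call open() from the menu button and close() from the menu items.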

Bonus: If you want to help keyboard users even more, you can also put outlines on the element that is currently in focus using CSS’s outline property. If you want to avoid the outlines for mouse users, you can dynamically add a class that removes the outline in mouseover events but leaves it for :focus.

2.5. Provide sensible keyboard shortcuts

At this stage your application is actually keyboard accessible. Congratulations!

However, it’s still not very efficient. Like power users, screenreader users love keyboard shortcuts: can you imagine if you were forced to tab through an entire page, or navigate back to a menu tree at the top of the page, to reach each control you were interested in? And, obviously, anything which makes navigating the app via the keyboard more efficient for screenreader users will benefit all power users as well, like the ubiquitous keyboard shortcuts for cut, copy and paste.

HTML4 introduced so-called accesskeys for this. In HTML5 @accesskey is now allowed on all elements.

The @accesskey attribute takes the value of a keyboard key (e.g. @accesskey="x") and is activated through platform- and browser-specific activation keys. For example, on the Mac it’s generally the [Ctrl] key, in IE it’s the [Alt] key, in Firefox on Windows [Shift]-[Alt], and in Opera on Windows [Shift]-[ESC]. You press the activation key and the accesskey together, which either activates or focuses the element with the @accesskey attribute.

Example:

<button id="accessbutton" onclick="alert('sent!')" accesskey="e">
Send
</button>
<script>
  var button = document.getElementById('accessbutton');
  if (button.accessKeyLabel) {
    button.innerHTML += ' (' + button.accessKeyLabel + ')';
  }
</script>

Now, the idea behind this is clever, but the execution is pretty poor. Firstly, the different activation keys between different platforms and browsers make it really hard for people to get used to the accesskeys. Secondly, the key combinations can conflict with browser and screenreader shortcut keys: a conflict with the browser makes the browser shortcut unusable, while a conflict with the screenreader effectively disables the accesskey.

In the end it is up to the Web application developer whether to use the accesskey attribute or whether to implement explicit shortcut keys for the application through key event handlers on the window object. In either case, make sure to provide a help list for your shortcut keys.
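If you go down the route of explicit shortcut keys, a small registry can keep the handlers and the help list in one place. The following is a minimal sketch under my own naming (createShortcuts, register, handle and help are not an existing API), and the 'Alt+Shift+S' combination is only an example.

```javascript
function createShortcuts() {
  const entries = new Map();

  // Build a label such as 'Alt+Shift+S' from a keyboard event.
  function comboOf(event) {
    return (event.ctrlKey ? 'Ctrl+' : '') +
           (event.altKey ? 'Alt+' : '') +
           (event.shiftKey ? 'Shift+' : '') +
           event.key.toUpperCase();
  }

  return {
    register: function (combo, description, action) {
      entries.set(combo, { description: description, action: action });
    },
    // Returns true if the event matched a registered shortcut.
    handle: function (event) {
      const entry = entries.get(comboOf(event));
      if (!entry) return false;
      entry.action();
      return true;
    },
    // One line per shortcut, ready to be shown in a help dialog.
    help: function () {
      return Array.from(entries, function (pair) {
        return pair[0] + ': ' + pair[1].description;
      });
    }
  };
}

const shortcuts = createShortcuts();
let sent = false;
shortcuts.register('Alt+Shift+S', 'Send message', function () { sent = true; });
console.log(shortcuts.handle({ key: 'S', altKey: true, shiftKey: true })); // true
console.log(shortcuts.help()); // [ 'Alt+Shift+S: Send message' ]
```

In a page you would call shortcuts.handle(event) from a keydown listener on the window object and call event.preventDefault() when it returns true.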

Also note that a page with a really good hierarchical heading layout and use of ARIA landmarks can help to eliminate the need for accesskeys to jump around the page, since there are typically default navigations available in screen readers to jump directly to headings, hyperlinks, and ARIA landmarks.

3. Provide markup for AT

Having made the application keyboard accessible also has advantages for screenreaders, since they can now reach the controls individually and activate them. So, next we will use a screenreader and close our eyes to find out where we only provide visual cues to understand the necessary interaction.

Here are some of the issues to consider:

Role may need to get identified
States may need to be kept track of
Properties may need to be made explicit
Labels may need to be provided for elements

This is where the W3C’s ARIA (Accessible Rich Internet Applications) standard comes in. ARIA attributes provide semantic information to screen readers and other AT that is otherwise conveyed only visually.

Note that using ARIA does not automatically implement the standard widget behavior – you’ll still need to add focus management, keyboard navigation, and change aria attribute values in script.

3.1. ARIA roles

After implementing a custom interactive widget, you need to add a @role attribute to indicate what type of control it is, e.g. that it is playing the role of a standard tag such as a button.

Example:

This menu button is implemented as a <div>, but with a role of “button” it is announced as a button by a screenreader.

<div tabindex="0" role="button">Menu</div>

ARIA roles also describe composite controls that do not have a native HTML equivalent.

Example:

This menu with menu items is implemented as a set of <div> tags, but with the roles “menu” and “menuitem”.

<div role="menu">
  <div tabindex="0" role="menuitem">Cut</div>
  <div tabindex="0" role="menuitem">Copy</div>
  <div tabindex="0" role="menuitem">Paste</div>
</div>

3.2. ARIA states

Some interactive controls represent different states, e.g. a checkbox can be checked or unchecked, or a menu can be expanded or collapsed.

Example:

The following menu has states on the menu items, which are here not just used to give an aural indication through the screenreader, but also a visual one through CSS.

<style>
/* \2713 is the check mark character; the selector matches the checked menu item */
[role=menuitem][aria-checked=true]:before {
   content: "\2713 ";
}
</style>
<div role="menu">
  <div tabindex="0" role="menuitem" aria-checked="true">Left</div>
  <div tabindex="0" role="menuitem" aria-checked="false">Center</div>
  <div tabindex="0" role="menuitem" aria-checked="false">Right</div>
</div>
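The state has to be kept up to date from script when the user picks a different item. A minimal sketch of that update, with a function name of my own choosing, could look like this:

```javascript
// Mark `selected` as checked and every other item as unchecked.
function checkItem(items, selected) {
  items.forEach(function (item) {
    item.setAttribute('aria-checked', item === selected ? 'true' : 'false');
  });
}
```

Called from a click or key handler with the list of menu items and the item that was activated, this keeps both the screenreader announcement and the CSS check mark in step, since both are driven by the same attribute.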

3.3. ARIA properties

Some of the functionality of interactive controls cannot be captured by the role attribute alone. We have ARIA properties to add features that the screenreader needs to announce, such as aria-label, aria-haspopup, aria-activedescendant, or aria-live.

Example:

The following drop-down menu uses aria-haspopup to tell the screenreader that there is a popup hidden behind the menu button together with an ARIA state of aria-expanded to track whether it’s open or closed.

<div class="custombutton" id="button" tabindex="0" role="button"
   aria-expanded="false" aria-haspopup="true">
    <span>Justify</span>
</div>
<div role="menu"  class="menu" id="menu" style="display: none;">
  <div tabindex="0" role="menuitem" class="menuitem" aria-checked="true">
    Left
  </div>
  <div tabindex="0" role="menuitem" class="menuitem" aria-checked="false">
    Center
  </div>
  <div tabindex="0" role="menuitem" class="menuitem" aria-checked="false">
    Right
  </div>
</div>
[CSS and JavaScript for example omitted]
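For illustration, the part of such a script that keeps aria-expanded in sync with the visibility of the menu might look roughly like the following. This is a sketch written for this purpose, not the omitted code of the example; toggleMenu is a hypothetical name.

```javascript
// Flip the expanded state of the button and show or hide the menu to match.
function toggleMenu(button, menu) {
  const expanded = button.getAttribute('aria-expanded') === 'true';
  button.setAttribute('aria-expanded', String(!expanded));
  menu.style.display = expanded ? 'none' : 'block';
  return !expanded; // the new state
}
```

You would call it with the elements that have the ids button and menu, from both the click handler and the key handler of the button.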

3.4. Labelling

The main issue that people know about accessibility seems to be that they have to put alt text onto images. This is only one means to provide labels to screenreaders for page content. Labels are short informative pieces of text that provide a name to a control.

There are actually several ways of providing labels for controls:

on img elements use @alt
on input elements use the label element
use @aria-labelledby if there is another element that contains the label
use @title if you also want a label to be used as a tooltip
otherwise use @aria-label

I’ll provide examples for the first two use cases – the other use cases are simple to deduce.

Example:

The following two images show the rough concept for providing alt text for images: images that provide information should be transcribed, images that are just decorative should receive an empty @alt attribute.

[Image: shocked lolcat titled 'HTML cannot do that!']

Image by Noah Sussman

<img src="texture.jpg" alt="">
<img src="lolcat.jpg"
alt="shocked lolcat titled 'HTML cannot do that!'">
<img src="texture.jpg" alt="">

When marking up decorative images with an empty @alt attribute, the image is actually completely removed from the accessibility tree and does not confuse the blind user. This is a desired effect, so do remember to mark up all your images with @alt attributes, even those that don’t contain anything of interest to AT.

Example:

In the example form above in Section 2.3, when tabbing directly on the input elements, the screen reader will only say “edit text” without announcing what meaning that text has. That’s not very useful. So let’s introduce a label element for the input elements. We’ll also add checkboxes with a label.

Doctor title:

Firstname:
Lastname:
Address:
City:
Remember me:

<label>Doctor title:</label>
  <input type="checkbox" id="doctor"/>
<label>Firstname:</label>
  <input type="text" id="firstname2"/>

<label for="lastname2">Lastname:</label>
  <input type="text" id="lastname2"/>

<label>Address:
  <input type="text" id="address2">
</label>
<label for="city2">City:
  <input type="text" id="city2">
</label>
<label for="remember">Remember me:</label>
  <input type="checkbox" id="remember">

In this example we use several different approaches to show what a difference it makes to use the <label> element to mark up input boxes.

The first two fields just have a <label> element next to an <input> element. When using a screenreader you will not notice a difference between this and not using the <label> element because there is no connection between the <label> and the <input> element.

In the third field we use the @for attribute to create that link. Now the input field isn’t just announced as “edit text”, but rather as “Lastname edit text”, which is much more useful. Also, the screenreader can now skip the labels and go straight to the input element.

In the fourth and fifth field we actually encapsulate the <input> element inside the <label> element, thus avoiding the need for a @for attribute, though it doesn’t hurt to explicitly add it.

Finally we look at the checkbox. By including a referenced <label> element with the checkbox, we change the screenreader’s announcement from just “checkbox not checked” to “Remember me checkbox not checked”. Also notice that the click target now includes the label, making the checkbox more usable not only for screenreader users, but also for mouse users.
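To make the difference between the six fields concrete, here is a simplified model of the two linking mechanisms discussed above (@for and nesting). It is a sketch under assumptions: browsers implement a much fuller accessible-name computation, and the `labelFor` function and the `htmlFor`/`contains` data layout are made up for this illustration.

```javascript
// Sketch only: a simplified model of how a <label> gets linked to an <input>.
// Each label is { text, htmlFor, contains }, where htmlFor is the @for value
// (or null) and contains lists the ids of inputs nested inside the label.
function labelFor(inputId, labels) {
  const match = labels.find(
    (label) => label.htmlFor === inputId || label.contains.includes(inputId)
  );
  return match ? match.text : null; // null means: no label is announced
}

// The labels from the example form, written out as plain data.
const formLabels = [
  { text: "Doctor title:", htmlFor: null, contains: [] },
  { text: "Firstname:", htmlFor: null, contains: [] },
  { text: "Lastname:", htmlFor: "lastname2", contains: [] },
  { text: "Address:", htmlFor: null, contains: ["address2"] },
  { text: "City:", htmlFor: "city2", contains: ["city2"] },
  { text: "Remember me:", htmlFor: "remember", contains: [] },
];

const inputIds = ["doctor", "firstname2", "lastname2", "address2", "city2", "remember"];
for (const id of inputIds) {
  console.log(id, "->", labelFor(id, formLabels));
}
```

The first two inputs come back without a label, which mirrors why the screenreader only says “edit text” or “checkbox” for them.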

4. Conclusions

This article introduced a process that you can follow to make your Web applications accessible. As you do that, you will notice that there are other things that you may need to do in order to give the best experience to a power user on a keyboard, a blind user using a screenreader, or a vision-impaired user using a screen magnifier. But once you’ve made a start, you will notice that it’s not all black magic and a lot can be achieved with just a little markup.

You will find more markup in the WAI ARIA specification and many more resources at Mozilla’s ARIA portal. Now go and change the world!

Many thanks to Alice Boxhall and Dominic Mazzoni for their proof-reading and suggested changes that really helped improve the article!

My first released WordPress plugin

May 25, 2010 | Categories: code, Digital Media, vquence | Tags: dotsub, external videos, vimeo, vquence, wordpress plugin, YouTube | Author: silvia

A screenshot of the gallery that the external video plugin creates

I’m pretty proud of this, which is why I’m dedicating a short blog post to it: today, John and I released my first WordPress plugin as open source to the WordPress plugins site.

It’s got the boring name “External Videos” and builds a bridge between your WordPress instance and videos of channels on a video hosting site – currently supported are YouTube, Vimeo, and DotSub.

It does this by using a brand-new feature to be introduced in WordPress 3: custom post types.

Check out the screenshots on the plugins page to see more – I’m unfortunately not yet running this Website with WordPress 3, so am not yet using this plugin’s features.

In the admin interface of WordPress, you enter the video channels that you want to pull videos from. Then it goes and pulls the videos with their metadata from these sites and creates video posts for them. That pulling is done once a day to update with new posts. The videos can be looked at in the admin interface under a separate video post section. They can be linked to WordPress posts and pages where the video may be discussed in context.

The video posts can be exposed on the WordPress site through a gallery, which is created by a shortcode that can be added to any WordPress page. The gallery of thumbnails clicks through to an overlay with each video and its metadata as well as a link to the related WordPress post.

You can also add a widget to the sidebar of the WordPress site with links to the most recent videos.

There are many more features that I want to develop for this plugin. I’d of course like to move it to HTML5 video instead of Adobe Flash. But for now I am happy with it.

I’d like to say thank you to John Ferlito, who helped with some of the coding, to Jeff Waugh for suggesting that it would best be developed using the new post types feature, and to Senator Kate Lundy and Pia Waugh at her office, who funded a part of the development. I am hoping they will find it useful to give their awesome collection of videos better exposure.

NOTE: you can now post your issues with this plugin to the WordPress forum at http://wordpress.org/tags/external-videos

W3C Media Annotations API standard

April 10, 2010 | Categories: code, Digital Media, open codecs, Open Source, standards | Tags: API, audio, HTML5, media annotations, media elements, media fragments, meta data, metadata, Ogg, skeleton, specification, video, vorbiscomment, W3C | Author: silvia

Recently, I was asked to review the W3C Media Annotations specifications as they are about to go into Last Call (a state that comes before the request for implementations at the W3C).

The W3C Media Annotations group has defined a set of metadata that they believe is representative and common for media resources. The ontology consists of the following fields:

ma:identifier: a URI or string to identify a resource
ma:title: a string providing the title of the resource
ma:language: a language code describing the language used in the resource
ma:locator: the URI at which the resource can be accessed
ma:contributor: a URI or string identifying the contributor and the nature of the contribution
ma:creator: a URI or string identifying an author
ma:createDate: a date of creation or publication of the resource
ma:location: a string or geo code identifying where the resource has been shot/recorded
ma:description: a string describing the content of the resource
ma:keyword: a word or word combination providing a topic, keyword or tag representing the resource
ma:genre: a string providing the genre of the resource
ma:rating: rating value, including the rating scale
ma:relation: a URI and string identifying a related resource and the relationship
ma:collection: a URI or string providing the name of a collection to which the resource belongs
ma:copyright: a URI or string with the copyright statement.
ma:license: a string or URI with the usage license
ma:publisher: a string or URI with the publisher of the resource
ma:targetAudience: a URI and classification string providing the issuer of the classification and the classification value
ma:fragments: a list of string and URI values that identify media fragments and their type
ma:namedFragments: a list of string and URI values that provide names to media fragments
ma:frameSize: a width – height pair in pixels
ma:compression: a string providing the compression algorithm
ma:duration: a float to provide the resource duration in seconds
ma:format: a string providing the MIME type of the resource
ma:samplingrate: a float with the audio sampling rate
ma:framerate: a float with the video frame rate
ma:bitrate: a float providing the average bit rate in kbps
ma:numTracks: an int of the number of tracks

Note that some of these fields are not single values, but simple constructs of multiple values. Thus, they are actually more complex than the name-value pairs that are typically used in, e.g., HTML meta headers or Dublin Core. I regard this as an issue for implementations.

The fields were chosen as typical metadata being available about media resources. The media fragments fields are a bit dubious in this respect, but could be useful in future.

The metadata is determined either from within the resource itself or from a metadata collection about the resource. As such, the document maps several existing metadata and media resource formats to this interface, amongst them:

As they didn’t have a mapping table for Ogg content, I offered the following:

MAWG	Relation	Ogg properties	How to do the mapping	Datatype
Descriptive Properties (Core Set)
Identification
ma:identifier	exact	Name	Name field in skeleton header (new)	String
ma:title	exact	Title	TITLE field in vorbiscomment header	String
	exact	Title	Title field in skeleton header (new)	String
	related	Album	ALBUM title in vorbiscomment header	String
ma:language	exact	Language	Language field in skeleton header (new)	language code
ma:locator	exact		file URI from system	URI
Creation
ma:contributor	exact	Artist, Performer	ARTIST and PERFORMER vorbiscomment headers	Strings
ma:creator	related	Organization	ORGANIZATION field in vorbiscomment header
ma:createDate	exact	Date	DATE field in vorbiscomment header	ISO date format
ma:location	exact	Location	LOCATION field in vorbiscomment header	String
Content description
ma:description	exact	Description	DESCRIPTION field in vorbiscomment header	String
ma:keyword	N/A
ma:genre	exact	Genre	GENRE field in vorbiscomment header	String
ma:rating	N/A
Relational
ma:relation	related	Version, Tracknumber	VERSION (version of a title), TRACKNUMBER (CD track) fields in vorbiscomment header	Strings
ma:collection	related	Album	ALBUM field of vorbiscomment header	String
Rights
ma:copyright	exact	Copyright	COPYRIGHT field of vorbiscomment header	String
ma:license	exact	License	LICENSE field of vorbiscomment header	String
Distribution
ma:publisher	related	Organization	ORGANIZATION field of vorbiscomment header	String
ma:targetAudience	more specific	Role	Role field of Skeleton header (new)	String
Fragments
ma:fragments	N/A
ma:namedFragments	N/A
Technical Properties
ma:frameSize	exact		extract from binary header of video track	int, int (width x height)
ma:compression	exact	Content-type	Content-type field of Skeleton header	MIME type
ma:duration	exact		calculate as duration = last_sample_time – first_sample_time of OggIndex header of skeleton	Float (or rather: rational – rational)
ma:format	exact	Content-type	Content-type field of Skeleton header	MIME type
ma:samplingrate	exact		calculate as granulerate = granulerate_numerator / granulerate_denominator of Skeleton header	Rational (or rather int / int)
ma:framerate	exact		calculate as granulerate = granulerate_numerator / granulerate_denominator of Skeleton header	Rational (or rather int / int)
ma:bitrate	exact		calculate as bitrate = length_of_segment / duration from OggIndex headers of skeleton	Float
ma:numTracks	exact	Tracknumber	TRACKNUMBER field of vorbiscomment header (track number on album)	Int

You will notice that the table mentions 4 fields in skeleton with a “new” marker – they are actually proposed fields in skeleton – a bit of coding will be necessary to introduce them into software. The space for these fields already exists in message header fields, so it won’t require a change of the skeleton format.
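The calculated rows in the Technical Properties part of the table can be sketched in a few lines. This is only an illustration under assumptions: the property names on the `track` object are invented and do not come from the Skeleton or OggIndex specifications, the sample values are made up, and the conversion from bytes to kilobits is my assumption so that the result comes out in kbps.

```javascript
// Sketch under assumptions: the field names below are illustrative only.
function technicalProperties(track) {
  // duration = last_sample_time - first_sample_time (both in seconds here)
  const duration = track.lastSampleTime - track.firstSampleTime;
  // granulerate = granulerate_numerator / granulerate_denominator
  // (this is the frame rate for video and the sampling rate for audio)
  const granulerate = track.granulerateNumerator / track.granulerateDenominator;
  // bitrate = length_of_segment / duration; bytes are converted to kilobits
  // here (an assumption) so that the value is in kbps
  const bitrateKbps = (track.segmentLengthBytes * 8) / 1000 / duration;
  return { duration, granulerate, bitrateKbps };
}

// Made-up sample values for a one-minute video track.
const exampleTrack = {
  firstSampleTime: 0,
  lastSampleTime: 60,
  granulerateNumerator: 30000,
  granulerateDenominator: 1001,
  segmentLengthBytes: 3000000,
};
console.log(technicalProperties(exampleTrack));
```

Keeping the granulerate as a numerator/denominator pair until the last moment avoids rounding problems with rates such as 30000/1001, which is why the table lists the datatype as a rational.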

In the second specification of the Media Annotations WG, the group offers a standard API to access (i.e. read) the defined fields. They also intend to create an API to write the fields, but I doubt that will be easy because of the large number of file types they intend to support.

There is basically a single function that allows the extraction of metadata:
MAObject[] getProperty(in DOMString propertyName, in optional DOMString sourceFormat, in optional DOMString subtype, in optional DOMString language, in optional DOMString fragment );

I proposed it may be possible to include this into HTML5 as follows:
interface HTMLMediaElement : HTMLElement {
  ...
  getter MAObject getProperty(in DOMString propertyName, in optional unsigned long trackIndex);
  ...
}

This would either extract the property for a particular track in a media resource or for the complete resource if no track index is given. The only problem I see is that the returned object is different depending on the requested property – the MAObject is only a parent class for the returned object types. I am not sure it is therefore possible to specify this easily in HTML5.
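To illustrate the behaviour described here, the following mock mimics such a `getProperty` call. It is purely hypothetical: no browser implements this method, the `MockMediaElement` class is invented for the sketch, and the metadata values are sample data.

```javascript
// Hypothetical mock, for illustration only: neither this class nor its data
// is part of HTML5 or of the Media Annotations specification.
class MockMediaElement {
  constructor(resourceMetadata, trackMetadata) {
    this.resourceMetadata = resourceMetadata; // properties of the whole resource
    this.trackMetadata = trackMetadata;       // one object per track
  }

  // Returns the property for one track, or for the complete resource if no
  // track index is given. Returns null when the property is unknown.
  getProperty(propertyName, trackIndex) {
    const source =
      trackIndex === undefined
        ? this.resourceMetadata
        : this.trackMetadata[trackIndex];
    if (!source || !(propertyName in source)) {
      return null;
    }
    return source[propertyName];
  }
}

// Sample data, invented for this sketch.
const media = new MockMediaElement(
  { title: "Elephants Dream", duration: 653.75 },
  [
    { frameSize: { width: 640, height: 360 }, language: "en" },
    { samplingrate: 48000, language: "en" },
  ]
);

console.log(media.getProperty("title"));        // a plain string
console.log(media.getProperty("frameSize", 0)); // a width/height pair
```

The two calls return differently shaped values, which is exactly the typing difficulty mentioned above: a single return type in the interface definition cannot describe both.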

Overall I thought the specification was a nice piece of work. I am not sure I agree with all the chosen fields, but that is always an issue with metadata. The most important fields are there and that’s what matters.

Accessibility support in Ogg and liboggplay

February 19, 2010 | Categories: code, Digital Media, FOMS, LCA, open codecs, Open Source, standards, video accessibility | Tags: accessibility, HTML5, html5 media, Ogg, Ogg Theora, Ogg Theora/Vorbis, Ogg video, open codecs, open media software, Open Source, video accessibility | Author: silvia

At the recent FOMS/LCA in Wellington, New Zealand, we talked a lot about how Ogg could support accessibility. Technically, this means support for multiple text tracks (subtitles/captions), multiple audio tracks (audio descriptions parallel to main audio track), and multiple video tracks (sign language video parallel to main video track).

Creating multitrack Ogg files
The creation of multitrack Ogg files is already possible using one of the muxing applications, e.g. oggz-merge. For example, I have my own little collection of multitrack Ogg files at http://annodex.net/~silvia/itext/elephants_dream/multitrack/. But then you are stranded with files that no player will play back.

Multitrack Ogg in Players
As Ogg is now being used in multiple Web browsers in the new HTML5 media formats, there are particular requirements for accessibility support for the hard-of-hearing and vision-impaired. Either multitrack Ogg needs to become more of a common case, or the association of external media files that provide synchronised accessibility data (captions, audio descriptions, sign language) to the main media file needs to become a standard in HTML5.

As it turns out, both these approaches are being considered and worked on in the W3C. Accessibility data that are audio or video tracks will in the near future have to come out of the media resource itself, but captions and other text tracks will also be available from external associated elements.

The availability of internal accessibility tracks in Ogg is a new use case – something Ogg has been ready to do, but has not gone into common usage. MPEG files on the other hand have for a long time been used with internal accessibility tracks and thus frameworks and players are in place to decode such tracks and do something sensible with them. This is not so much the case for Ogg.

For example, a current VLC build installed on Windows will display captions, because Ogg Kate support is activated. A current VLC build on any other platform, however, has Ogg Kate support deactivated in the build, so captions won’t display. This will hopefully change soon, but we have to look also beyond players and into media frameworks – in particular those that are being used by the browser vendors to provide Ogg support.

Multitrack Ogg in Browsers
Hopefully gstreamer (which is what Opera uses for Ogg support) and ffmpeg (which is what Chrome uses for Ogg support) will expose all available tracks to the browser so they can expose them to the user for turning on and off. Incidentally, a multitrack media JavaScript API is in development in the W3C HTML5 Accessibility Task Force for allowing such control.

The current version of Firefox uses liboggplay for Ogg support, but liboggplay’s multitrack support has been sketchy thus far. So, Viktor Gal – the liboggplay maintainer – and I sat down at FOMS/LCA to discuss this and Viktor developed some patches to make the demo player in the liboggplay package, the glut-player, support the accessibility use cases.

I applied Viktor’s patch to my local copy of liboggplay and I am very excited to show you the screencast of glut-player playing back a video file with an audio description track and an English caption track all in sync:

elephants_dream_with_audiodescriptions_and_captions

Further developments
There are still important open questions. For example, how will a player know that an audio description track is to be played together with the main audio track, while a dub track (e.g. a German dub for an English video) is to be played as an alternative? Such metadata for the tracks is something that Ogg is still missing, but that Ogg can be extended with fairly easily through the use of the Skeleton track. It is something the Xiph community is now working on.

Summary
This is great progress towards accessibility support in Ogg and therefore in Web browsers. And there is more to come soon.

How to display seeked position for HTML5 video

February 18, 2010 | Categories: code, Digital Media, standards | Tags: audio element, HTML5, html5 media, HTML5 video, html5 video tag, video element, W3C | Author: silvia

Recently, I was asked for some help on coding with an HTML5 video element and its events. In particular the question was: how do I display the time position that somebody seeked to in a video?

Here is a code snippet that shows how to use the seeked event:

<video onseeked="writeVideoTime(this.currentTime);" src="video.ogv" controls></video>
<p>position:</p><div id="videotime"></div>
<script type="text/javascript">
  // get video element
  var video = document.getElementsByTagName("video")[0];
  function writeVideoTime(t) {
    document.getElementById("videotime").innerHTML=t;
  }
</script>

Other events that can be used in a similar way are:

loadstart: UA requests the media data from the server
progress: UA is fetching media data from the server
suspend: UA is on purpose idling on the server connection mid-fetching
abort: UA aborts fetching media data from the server
error: UA aborts fetching media because of a network error
emptied: UA runs out of network buffered media data (I think)
stalled: UA is waiting for media data from the server
play: playback has begun after play() method returns
pause: playback has been paused after pause() method returns
loadedmetadata: UA has received all its setup information for the media resource, duration and dimensions and is ready to play
loadeddata: UA can render the media data at the current playback position for the first time
waiting: playback has stopped because the next frame is not available yet
playing: playback has started
canplay: playback can resume, but at risk of buffer underrun
canplaythrough: playback can resume without estimated risk of buffer underrun
seeking: seeking attribute changed to true (may be too short to catch)
seeked: seeking attribute changed to false
timeupdate: current playback position changed enough to report on it
ended: playback stopped at media resource end; ended attribute is true
ratechange: defaultPlaybackRate or playbackRate attribute has just changed
durationchange: duration attribute has changed
volumechange: volume attribute or the muted attribute has changed

Please refer to the actual event list in the specification for more details and more accurate information on the events.
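If you want to watch several of these events at once, you can register one handler for a whole list of event names. The following is a minimal sketch: `watchEvents` is a helper name I made up, and the demonstration uses a plain `EventTarget` as a stand-in for the video element so that it runs outside a Web page too.

```javascript
// Sketch: register one reporting function for several media events.
// Works with any EventTarget, so in a browser you can pass a <video> element.
function watchEvents(target, eventNames, report) {
  for (const name of eventNames) {
    target.addEventListener(name, (event) => report(event.type));
  }
}

// Stand-alone demonstration with a plain EventTarget in place of a <video>.
const fakeVideo = new EventTarget();
const seenEvents = [];
watchEvents(fakeVideo, ["seeking", "seeked", "ended"], (type) => seenEvents.push(type));
fakeVideo.dispatchEvent(new Event("seeking"));
fakeVideo.dispatchEvent(new Event("seeked"));
console.log(seenEvents);

// Browser usage (assumes a <video> element exists on the page):
//   var video = document.getElementsByTagName("video")[0];
//   watchEvents(video, ["play", "pause", "seeked"], function (type) {
//     console.log(type, video.currentTime);
//   });
```

Logging the events this way is also a quick means to find out in which order a particular browser fires them.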

Audio Track Accessibility for HTML5

February 12, 2010 | Categories: code, Digital Media, FOMS, open codecs, Open Source, standards, video accessibility | Tags: accessibility, audio description, audio element, Firefox, html5 media, HTML5 video, multitrack audio, multitrack video, video element, W3C | Author: silvia

I have talked a lot about synchronising multiple tracks of audio and video content recently. The reason was mainly that I foresee a need for more than two parallel audio and video tracks, such as audio descriptions for the vision-impaired or dub tracks for internationalisation, as well as sign language tracks for the hard-of-hearing.

It is almost impossible to introduce a good scheme to deliver the right video composition to a target audience. Most people will prefer bare a/v, vision-impaired users would probably prefer only audio plus audio descriptions (but will probably take the video), and hard-of-hearing users will prefer video plus captions and possibly a sign language track. While it is possible to dynamically create files that contain such tracks on a server and then deliver the right composition, implementation of such a server method has not been very successful in recent years and it would likely take many years to roll out such new infrastructure.

So, the only other option we have is to synchronise completely separate media resources as they are selected by the audience.

It is this need that this HTML5 accessibility demo is about: Check out the demo of multiple media resource synchronisation.

I created an Ogg video with only a video track (10m53s750). Then I created an audio track that is the original English audio track (10m53s696). Then I used a Spanish dub track that I found through BlenderNation as an alternative audio track (10m58s337). Lastly, I created an audio description track in the original language (10m53s706). This creates a video track with three optional audio tracks.

I took away all native controls from these elements when using the HTML5 audio and video tag and ran my own stop/play and seeking approaches, which handled all media elements in one go.
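A shared control of this kind could look roughly as follows. This is a sketch and not the code of the demo: `MediaGroup` is a hypothetical class that relies only on `play()`, `pause()` and `currentTime`, which the HTML5 audio and video elements provide. Which of the audio tracks is audible at any time is left out here.

```javascript
// Sketch only: a hypothetical controller that drives several media elements
// in one go. It uses nothing but play(), pause() and currentTime.
class MediaGroup {
  constructor(elements) {
    this.elements = elements;
  }

  playAll() {
    this.elements.forEach((el) => el.play());
  }

  pauseAll() {
    this.elements.forEach((el) => el.pause());
  }

  // Move every element to the same position on the timeline (in seconds).
  seekAll(seconds) {
    this.elements.forEach((el) => {
      el.currentTime = seconds;
    });
  }
}

// Browser usage (assumes the page contains the audio and video elements):
//   var group = new MediaGroup(Array.from(document.querySelectorAll("audio, video")));
//   group.seekAll(30);
//   group.playAll();
```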

I was mostly interested in the quality of this experience. Would the different media files stay mostly in sync? They are normally decoded in different threads, so how big would the drift be?

The resulting page is the basis for such experiments with synchronisation.

The page prints the current playback position in all of the media files at a constant interval of 500ms. Note that when you pause and then play again, I am re-synching the audio tracks with the video track, but not when you just let the files play through.

I have let the files play through on my rather busy Macbook and have achieved the following interesting drift over the course of about 9 minutes:

You will see that the video was the slowest, only doing roughly 540s, while the Spanish dub did 560s in the same time.

To fix such drifts, you can always include regular re-synchronisation points into the video playback. For example, you could set a timeout on the playback to re-sync every 500ms. Within such a short time, it is almost impossible to notice a drift. Don’t re-load the video, because it will lead to visual artifacts. But do use the video’s currentTime to re-set the others. (UPDATE: Actually, it depends on your situation, which track is the best choice as the main timeline. See also comments below.)
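Such a re-synchronisation step can be sketched like this. It is an illustration and not the code of the demo: `resync` and its `toleranceSeconds` parameter are made up here, and the element ids in the usage comment are placeholders. The function only touches `currentTime`, so nothing is re-loaded.

```javascript
// Sketch: pull follower tracks back onto the main track's timeline.
// Assumes every object exposes currentTime in seconds, as HTML5 media
// elements do. toleranceSeconds is a made-up parameter for this sketch.
function resync(main, followers, toleranceSeconds = 0.05) {
  let corrected = 0;
  for (const follower of followers) {
    const drift = follower.currentTime - main.currentTime;
    if (Math.abs(drift) > toleranceSeconds) {
      follower.currentTime = main.currentTime; // seek only, do not re-load
      corrected += 1;
    }
  }
  return corrected; // number of tracks that had to be moved
}

// Browser usage, re-syncing every 500ms (element ids are hypothetical):
//   var video = document.getElementById("video");
//   var audio = [document.getElementById("english"), document.getElementById("spanish")];
//   setInterval(function () { resync(video, audio); }, 500);
```

The tolerance avoids seeking on every tick when the tracks are still close together, since each seek can itself cause a short audible glitch.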

It is a workable way of associating arbitrary numbers of media tracks with videos, in particular in situations where the creation of merged files cannot easily be included in a workflow.

Tutorial on HTML5 open video at LCA 2010

January 26, 2010 | Categories: code, Digital Media, LCA, open codecs, Open Source, standards, video accessibility | Tags: Firefox, HTML5 video, LCA, Ogg Theora, tutorial, video accessibility | Author: silvia

During last week’s LCA, Jan Gerber, Michael Dale and I gave a 3 hour tutorial on how to publish HTML5 video in an open format.

We basically taught people how to create and publish Ogg Theora video in HTML5 Web pages and how to make them work across browsers, including many of the available tools and libraries. We’re hoping that some people will have learnt enough to include modules in CMSes such as Drupal, Joomla and WordPress, which will easily support the publishing of Ogg Theora.

I have been asked to share the material that we used. It consists of:

Note that if you would like to walk through the exercises, you should install the following software beforehand:

oggz-tools
oggvideotools
apache2 or a Web server of your choice
ffmpeg2theora
firefox3.5+
firefogg plugin
firebug plugin
vlc, mplayer, totem or xine
kino or pitivi or another video editor that exports Theora, e.g. iMovie with XiphQT

You might need to look for packages for your favourite OS (e.g. Windows or Mac, Ubuntu or Debian).

The exercises include:

creating an Ogg video from an editor
transcoding a video using http://firefogg.org/
creating a poster image using OggThumb
writing a first HTML5 video Web page with Ogg Theora
publishing it on a Web Server, with correct MIME type & Duration hint
writing a second HTML5 video Web page with Ogg Theora & MP4 to cover Safari/Webkit
transcoding using ffmpeg2theora in a script
writing a third HTML5 video Web page with Cortado fallback
writing a fourth Web page using “Video for Everybody”
writing a fifth Web page using “mwEmbed”
writing a sixth Web page using firefogg for transcoding before upload
and a seventh one with a progress bar
encoding srt subtitles into an Ogg Kate track
writing an eighth Web page using cortado to display the Ogg Kate track
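To give a flavour of the second exercise, here is a small sketch that generates video markup with one source element per format, so that each browser can pick the first type it supports. It is not taken from the tutorial material: the `videoMarkup` helper and the file names are placeholders, and the values are inserted without escaping, so only pass in strings you trust.

```javascript
// Sketch: build <video> markup with one <source> per format.
// File names are placeholders; values are not escaped.
function videoMarkup(sources, poster) {
  const sourceTags = sources
    .map((s) => `  <source src="${s.src}" type="${s.type}">`)
    .join("\n");
  const posterAttr = poster ? ` poster="${poster}"` : "";
  return `<video controls${posterAttr}>\n${sourceTags}\n</video>`;
}

console.log(
  videoMarkup(
    [
      { src: "video.ogv", type: "video/ogg" },
      { src: "video.mp4", type: "video/mp4" },
    ],
    "poster.jpg"
  )
);
```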

For those that would like to see the slides here immediately, a special flash embed:

Enjoy!

ginger's thoughts

Silvia's blog
