[I wrote this more than 8 months ago, but didn't want to publish it at the time because I want us to solve the issues around video in HTML5 and not fight each other. But I've made some changes and I'm now ready to have it published.]
There’s a clash of ecosystems happening on the WHATWG mailing list around the need to specify a baseline codec for a future <video> tag in HTML.
The clash is mostly between the open community, which wants Ogg Theora as a recommended baseline codec, and big vendors (Apple & Nokia), which want that recommendation taken out. They claim that such a recommendation has no place in an HTML standard, which should specify tags but not recommend external file formats. From one perspective, I agree – some things are better left to the software engineers to decide and left open to the market. However, in this particular instance, I think it would be a big mistake not to specify a baseline video codec. In fact, it would in my mind make the whole move to a new HTML5 standard an irrelevant exercise.
Let’s look at history and play a mind game on the consequences of such a decision.
Around the turn of the century we had a wonderfully diverse situation: RealMedia, QuickTime and WindowsMedia were all video formats that people expected to find on the Internet and to use for streaming video. It most certainly made business sense to the companies involved! However, it made no business sense to Web developers and media content producers. They had to set up a transcoding and streaming infrastructure for all three formats in parallel if they wanted to reach all their potential clientele. I have actually seen this happen here in Australia at the ABC, which has a mandate to serve all Australians and therefore had to provide video in every format its audience might use. I remember the pain that was written across the faces of the infrastructure people.
Fast forward a few years and the ABC can now breathe a sigh of relief: by supporting Adobe Flash, they can do away with all this expensive and support-intensive infrastructure and support just one codec.
Another story from the past to keep in mind is that of PNG and GIF (see http://www.libpng.org/pub/png/pnghist.html), where the collection of royalties on the GIF codec prompted the creation of the open and free PNG format, which became a W3C recommendation in 1996 (see http://www.w3.org/Press/PNG-PR.en.html). Tim Berners-Lee states there: “We are seeing more of our Members adopt the format and are helping make it the industry standard.”
With these in mind, let’s try and project into the future.
Assuming we do not provide a baseline codec in the spec, each browser will adopt support for the codec that “makes business sense”, i.e. Microsoft will support WindowsMedia and Apple will support QuickTime, while the rest will look for a “cheaper” codec, which could be e.g. MPEG-1 or Ogg Theora. Or stated differently: we will end up with the same situation that we had around 2001 with streaming codecs, except that Web developers and content owners still have the choice of Flash through the object/embed tag. Who will we confuse? The consumers who want to create their own content and publish it online. They will want a free and interoperable option. Since that’s not to be had, they will choose what makes most sense on their OS platform – i.e. QuickTime on Macs (comes for “free”), WindowsMedia on Windows, and Ogg Theora on Linux. Yes, this makes business sense to some of us. It will certainly make Adobe happy because – as before – Flash will come out as the winner.
Assuming we do provide a baseline codec in the spec, a very similar situation will actually arise at first: the browsers will initially support different codecs, since Ogg Theora is just a recommendation, which will probably not be implemented in the Apple or MS Web browsers. However, Web developers and content owners will now have guidance, through the recommendation in the standard, on which format they should be providing. And they will request support for the recommended baseline format from the vendors. So there may actually be a chance that the confusing mess of codec formats gets sorted out after a while. This is the chance we have to make things easier for Web developers and online businesses – and this is why a baseline codec is imperative.
What we now need is to address the issues that Apple, Nokia and MS have with Ogg Theora. These are mostly around submarine patents. My suggestion is that the W3C pay an independent patent attorney to perform a patent search on Ogg Theora to address the risks perceived by the big vendors. If the patent search is as comprehensive as possible, we may reach a situation where the big vendors no longer perceive the risk. However, there is also a risk that Theora is found to infringe specific patents. I guess we will then either have to correct the codebase or just put all our development efforts into Dirac. In any case – all the FUD that is currently being sent both ways can then be addressed more easily with some decent data behind it.
This is an awesome post about what HTML5 may look like given the current specification – the middle section is on video.
We are in the middle of a big technological change for the dear old World Wide Web. And it will have a massive impact on how we are using video on the Web.
The W3C is wondering how to go even beyond that onto a road that will make video a first-class citizen on the Web. Next week, a W3C Video Workshop will be held on that exact topic.
Funnily enough, when we described the aim of the Annodex project at CSIRO in the year 2000, we used those exact words: how to make video a first-class citizen on the Web. At that time, people thought we were crazy. Now that YouTube is a commonly accepted phenomenon, we can actually see the limitations of existing video technology on the Web: we still cannot interact as naturally with video as we do with Web pages – we still cannot search well for video – and we still cannot mash up video as easily as we do with HTML pages, e.g. through RSS feeds.
I will be travelling to the US next week to share our experiences with Annodex with the Web world and give my input on what the future of video on the Web should look like. To that end, I have submitted two position papers to the workshop – one on Temporal URIs and one on our experiences with Annodex and CMML. Check out the other cool talks on the agenda or even the full list of position papers that got submitted!
Also, I have just been asked whether I would like to be part of the “Future of Video and Next Steps” Panel on the second day of the workshop – a panel that has been very well selected to represent online and traditional video technology, content interests, and consumer interests. I am looking forward to a very lively discussion and a great overall workshop that may be the first step towards a better video web.
Video on the Web is still only at the beginning of its evolution – comparable to the evolution that film and movie theatres have gone through over the last hundred years. It’s awesome to be working on the next technology revolution and to see that the best is yet to come!