Google’s challenges of freeing VP8

Since On2 Technology’s stockholders have approved the merger with Google, there are now first requests to Google to open up VP8.

I am sure Google is thinking about it. But … what does “it” mean?

Freeing VP8
Simply open sourcing it and making it available under a free license doesn’t help. That just provides open source code for a codec whose relevant patents are held by a commercial entity, and any other entity using it would still need to be afraid of using that technology, even if its use is free.

So, Google has to make the patents that relate to VP8 available under an irrevocable, royalty-free license for the VP8 open source base, but also for any independent implementations of VP8. This at least guarantees to any commercial entity that Google will not pursue them over VP8 related patents.

Now, this doesn’t mean that there are no submarine or unknown patents that VP8 infringes on. So, Google needs to also undertake an intensive patent search on VP8 to be able to at least convince themselves that their technology is not infringing on anyone else’s. For others to gain that confidence, Google would then further have to indemnify anyone who is making use of VP8 for any potential patent infringement.

I believe – from what I have seen in the discussions at the W3C – it would only be that last step that will make companies such as Apple have the confidence to adopt a “free” codec.

An alternative to providing indemnification is the standardisation of VP8 through an accepted video standardisation body. That would probably need to be ISO/MPEG or SMPTE, because that’s where other video standards have emerged and there are a sufficient number of video codec patent holders involved that a royalty-free publication of the standard will hold a sufficient number of patent holders “under control”. However, such a standardisation process takes a long time. For HTML5, it may be too late.

Technology Challenges
Also, let’s not forget that VP8 is just a video codec. A video codec alone does not encode a video. There is a need for an audio codec and an encapsulation format. In the interest of staying all open, Google would need to pick Vorbis as the audio codec to go with VP8. Then there would be the need to put Vorbis and VP8 in a container together – this could be Ogg or MPEG or QuickTime’s MOOV. So, apart from all the legal challenges, there are also technology challenges that need to be mastered.

It’s not simple to introduce a “free codec” and it will take time!

Google and Theora
There is actually something that Google should do before they start on the path of making VP8 available “for free”: They should formulate a new license agreement with Xiph (and the world) over VP3 and Theora. Right now, the existing license that was provided by On2 Technologies to Theora (link is to an early version of On2’s open source license of VP3) was only for the codebase of VP3 and any modifications of it, but doesn’t in an obvious way apply to independent re-implementations of VP3/Theora. The new agreement between Google and Xiph should be about the patents and not about the source code. (UPDATE: The actual agreement with Xiph apparently also covers re-implementations – see comments below.)

That would put Theora in a better position to be universally acceptable as a baseline codec for HTML5. It would allow, e.g., Apple to make their own implementation of Theora – which is probably what they would want for iPods and iPhones. Since Firefox, Chrome, and Opera already support Ogg Theora in their browsers using the On2-licensed codebase, they must have decided that the risk of submarine patents is low. So, presumably, Apple can come to the same conclusion.

Free codecs roadmap
I see this as the easiest path towards getting a universally acceptable free codec. Over time then, as VP8 develops into a free codec, it could become the successor of Theora on a path to higher quality video. And later still, when the Internet will handle large resolution video, we can move on to the BBC’s Dirac/VC2 codec. It’s where the future is. The present is more likely here and now in Theora.


ADDITION:
Please note the comments from Monty from Xiph and from Dan, ex-On2, about the intent that VP3 was to be completely put into the hands of the community. Also, Monty notes that in order to implement VP3, you do not actually need any On2 patents. So, there is probably not a need for Google to refresh that commitment. Though it might be good to reconfirm that commitment.


ADDITION 10th April 2010:
Today, it was announced that Google put their weight behind the Theorarm implementation by helping to make it BSD and thus enabling it to be merged with Theora trunk. They also confirm on their blog post that Theora is “really, honestly, genuinely, 100% free”. Even though this is not a legal statement, it is good that Google has confirmed this.

53 thoughts on “Google’s challenges of freeing VP8”

  1. Hey Silvia,

    Let me fill in a bit more as someone who was there originally…

    At the time the IP grant was made to Xiph (and it is not the same as the VP3 source download license On2 was using beforehand; only the Xiph grant is relevant here), the intent was pretty clear that On2 was granting ‘rights to VP3’ not just the source implementation. They were however very concerned that we continue to maintain a fully compatible source base for a contractually specified period as they had responsibilities to customers. The grant and the maintenance contract were done together.

    As time wore on, folks noticed that the primary doc in the license grant had the ambiguity you mention above (it repeatedly uses the boilerplate ‘On2 VP3 Software’ to refer to everything instead of differentiating more explicitly between the software and the nebulous format IP). For a few years it had been on the radar to get ‘fixed’. The lawyers always thought there were bigger priorities (so did I) because everyone involved was basically agreed it was no big deal, we were still friendly with On2, and we were going to get it fixed eventually.

    Then Google unexpectedly bought On2, so On2 wasn’t allowed to talk to us anymore. Half of the ‘enh, no big deal’ context is silent for now, so it seems from the outside that the ambiguity in one document is potentially troublesome. It would indeed be a nice and appreciated gesture from Google to clear it up once and for all, but it’s still no big deal. Apple still isn’t going to like us. MPEG-LA is still gonna be FUDding us to death. In short, we burn legal resources on addressing a non-issue in the Court of Public Opinion. Maybe we have to. But I don’t think any lawyers would be all that impressed.

  2. Hmm, actually, going back over the actual text it is considerably clearer than I’m remembering. I remember there were wording changes right before signing because we were obviously very worried about making sure there was no possibility of yanking back rights if On2 were taken over/sold…

    I’ll see if Tom Rosedale (our main lawyer) thinks this is an easy question on monday morning.

    1. Thanks for the clarification, Monty. It would be great to get this clarified. If the license is more inclusive of the VP3 technology rather than just the codebase, that would be awesome.

  3. I’ll ask. It might still be ‘not worth replying to’. We’ll see.

    And I feel a bit silly now too because I realize I’m just helping distract from a more important point. The patents On2 had on VP3 really had nothing to do with the format. They covered specific implementation optimizations. Unlike h264, it’s easy to implement VP3 while avoiding the On2 patents if it really does come down to that.

  4. It’s worth publishing that particular point since that fact is not widely known. The issue though is: people will believe that point more if On2/Google explicitly state it. Something like: None of the On2 patents cover anything about the basic structure of VP3 or Theora – they are only for optimisations that can be achieved also in different ways.

  5. hi — Dan Miller (ex-On2) here. Just to say, when we did the deal with Xiph, our intent was most definitely to give away the IP completely, and allow anyone anywhere to use both the code and the patents to implement Theora (nee Vp3). Frankly I think there’s some bad mojo FUD flying around to imply all these terrible ways people will be burned. Reminds me of “Reefer Madness”.

    Isn’t it likely that in the 10 years since Vp3 was first released — and 6 years since APPLE LICENSED VP3 THEMSELVES for Quicktime — that if someone was going to enforce some patents, they would have done so by now? I realize that’s not a legal argument — but common sense needs to have some value here as well. Laws and lawyers were invented by people to facilitate our ability to interact. Allowing that process to dictate the interaction without any attempt to understand the intent of the parties involved is a real devolution of the whole point of having a society of laws.

    Just my humble 2c

    1. Thanks a lot, Dan, for your view. It is great to get these discussions into the public! I will add a note in the article pointing to your and Monty’s note.

    1. @Mike It may well be that Dirac goes nowhere. But from the quality that I have seen Dirac being able to encode for large screen sizes in comparison to e.g. H.264, it certainly kicks ass. That tells me that the future of codecs will include Wavelets. We already have an open wavelet codec in Dirac, which is not currently applicable to the Web, because it doesn’t work well for small resolution video. But in future we will all expect large resolution video on the Web, so naturally I am postulating there is a place for Dirac. Anything about the future is, of course, uncertain.

  6. It will be interesting to see where Dirac does go. As far as I know, the BBC doesn’t actually use it for their own content.

    As an aside, Dirac is not a codec, it’s a compression format and specification for video. Schroedinger and dirac-research are the two main codecs.

    -c

  7. ..as for the container, if Google’s interest is in a truly free format (and it might not be) then MOV, MP4 and even ISO base media file format are not necessarily the way forward as they are patent encumbered (the ISO process for MP4 requires Apple and Matsushita to be “willing to negotiate licenses under reasonable and non-discriminatory terms,” but that’s not the same as “irrevocably free”).

    Something like OGG would be the ideal companion for VP8 (especially as it is free and streams Theora already).

    Of course, perhaps Google doesn’t plan to open VP8 at all. Maybe they’ll just keep it to themselves and roll it out to YouTube to avoid H.264 royalties post 2016. Awww.. YouTube doesn’t play in your browser? It does in Chrome..

  8. Is there an on-line petition anywhere to get google to do the things mentioned in this blog, with opening the codec with xiph, and using ogg and vorbis/flac?

    1. @Michael B. Not that I know of. Please note that Google haven’t even indicated that they will free any of the technology that they acquired through On2. All of this is pure speculation.

  9. I think that this is asking too much. Google can’t be expected to indemnify everyone in the world against any submarine patent that anyone else might hold, even if they go after an independent implementation and not after the code that Google releases. That’s asking them to open their corporate treasury to every ambitious patent lawyer in the world. Since Google isn’t going to do that, the FUD merchants can then point to that blog post and say it’s just too risky to use something other than the “safe” H.264.

    There’s a mistaken belief that alternative codecs are uniquely vulnerable to submarine patents, while if you go with H.264 you’re safe, because you’ve licensed the patents. You aren’t safe; some other troll might have a patent that’s not in the pool, and the licensing body does not indemnify you. This has happened with MP3, for example.

    If Google licenses the patents with sufficiently generous terms to allow for FLOSS implementations, and certifies that they know of no patents that an implementation would infringe, that would be sufficient, and would make it as safe to deploy VP8 as to pay for licenses and deploy H.264.

    1. @Joe Please note that I was not suggesting Google indemnify everyone. In fact, I deliberately provided the alternative of standardising the codec. I brought up the indemnification since it was what others have said would be necessary for them to pick up Theora or any other open codec to avoid extra submarine patent exposure. I do not think any company in their right mind will ever indemnify the world for any of their technology.

  10. By all that is holy and some that is not, *DON’T* use Ogg as container format. It is the single most flawed container in existence. There are plenty of alternatives, all of which are better.

    I have never heard about the MOV/MP4 container format being patent encumbered. Please provide us with some proof for your claim.

    1. @Jochen If I was Google and was in the process of publishing a codec freely, I would check On2’s patent search and repeat my own to give myself some confidence before exposing myself and before encouraging others to use my technology.

      As for MPEG-LA: remember that many of the patent holders in the H.264 space are actually members of MPEG, so in theory there is sufficient knowledge of the patent space in MPEG that it equates to an extensive patent search. They do put out a call for patents before standardising the technology. Any proprietary codec would have a hard time competing with that kind of patent knowledge.

      When Microsoft published their proprietary codec as VC-1, it soon turned out that there were a whole swag of patents related to VC-1 and a VC-1 Patent Portfolio License (“License”) had to be created. If Google really wanted to create the free and open codec, they would need to be pretty sure that this doesn’t happen to them, too.

    1. @Louise I am not aware of a container format from On2. Their codecs have traditionally lived in other containers, e.g. AVI, FLV, ASF, MOV (QuickTime) and MP4 (QuickTime-like). FLV is a container format developed by Macromedia/Adobe.

  11. @Kelly: We can debate whether or not some offbeat container has worse characteristics than Ogg, but among the common ones, Ogg is the ugly kid in town. Among its drawbacks are

    – not general purpose, it needs to be hacked/extended for each new codec it is supposed to hold,

    – there is no codec-independent way to signal keyframes,

    – timestamps are a complete mess, and, again, codec-dependent; for a very insightful writeup see http://hardwarebug.org/2008/11/17/ogg-timestamps-explored/

    – overhead is huge, can be 300% or more the overhead of MP4, for the infamous Big Buck Bunny YouTube vs. Theora vs. H.264 comparison at http://people.xiph.org/~greg/video/ytcompare/comparison.html the numbers are:

    17307153 bbb_theora_486kbit.ogv
    15009926 bbb_theora_486kbit.theora
    2107404 bbb_theora_486kbit.vorbis
    ——–
    189823

    17753616 bbb_youtube_h264_499kbit.mp4
    13898515 bbb_youtube_h264_499kbit.h264
    3796188 bbb_youtube_h264_499kbit.aac
    ——–
    58913

    I hope you believe me now..

  12. @Louise: FLV and MP4 are general-purpose container formats that can contain audio, video, subtitles and metadata in a variety of flavors.

  13. DonDiego, you troll this every chance you get. I’m getting tired of addressing it in one place, having the rebuttal entirely ignored, and then having you plaster it somewhere else, anywhere else that’s visible.

    Ogg is different from your favorite container. We know. It does not need to be extended for every new codec. It’s a transport layer that does not include metadata (that’s for Skeleton). Mp4 and Nut make metadata part of a giant monolithic design. Whoop-de-do. The overhead depends on how it’s being used (for the high bitrate BBB above, it’s using a tiny page size tuned to low bitrate vids, an aspect of the encoder that produced it, not Ogg itself). Etc, etc.

    Doing something different than the way you and your group would do it is not ‘horribly flawed’ it is just… different.

    We’re not dropping Ogg and breaking tens of millions of decoders to use mp4 or Nut just because a few folks are angry that their pet format came too late or because your country doesn’t have software patents. Where I live, patents exist. You’re free to do anything that you want with the codecs, of course. Go ahead and put them in MOV or Nut! As you loudly proclaim, you’re in a country that doesn’t have software patents, so you don’t have to care.

    Or, “for the love of all that is holy”, get over it. Last I checked you weren’t willing to use Theora either… so why exactly are you here…? Obvious troll is obvious.

  14. Since I have to rebut this *again* lest it grow legs:

    For the record, if I was redesigning the Ogg container today, I’d consider changing two things:

    1) The specific packet length encoding tops out at an overhead efficiency of .5%. If you accept an efficiency hit on small packet sizes, you can improve large packet size efficiency. This is one of the things Diego is ranting about. We actually had an informal meeting about this at FOMS in 2008. We decided that breaking every Ogg decoder ever shipped was not worth a theoretical improvement of .3% (depending on usage).

    2) Ogg page checksums are whole-page and mandatory. Today I’d consider making them switchable, where they can either cover the whole page or just the page header. It would optionally reduce the computational overhead for streams where error detection is internal to the codec packet format, or for streams where the user encoding does not care about error detection. Again– not worth breaking the entire install base.

    At FOMS we decided that if we were starting from scratch, the first was a good idea and we were split on the checksums. But we’re not starting from scratch, and compatibility/interop is paramount.
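To make the packet length point concrete, here is a rough sketch of how Ogg's lacing-based length encoding turns into overhead. It assumes the framing described in the Ogg specification (a 27-byte fixed page header and one lacing byte per started 255 bytes of packet data, with at most 255 lacing values per page) and an idealised muxer that always fills pages completely. The function names are made up for illustration, and real muxers will do somewhat worse than this estimate.

```python
PAGE_HEADER_BYTES = 27        # fixed part of an Ogg page header
MAX_SEGMENTS_PER_PAGE = 255   # size limit of a page's segment (lacing) table


def lacing_values(packet_len):
    """Return the lacing bytes that encode one packet's length.

    A packet is cut into 255-byte segments; the final lacing value is
    always < 255 (and is 0 when the length is a multiple of 255).
    """
    full, rest = divmod(packet_len, 255)
    return [255] * full + [rest]


def approx_overhead(packet_len, packets):
    """Estimate container overhead as a fraction of the payload.

    Assumption: pages are packed with as many segments as they can hold,
    which is the best case for the format.
    """
    segments = len(lacing_values(packet_len)) * packets
    pages = -(-segments // MAX_SEGMENTS_PER_PAGE)  # ceiling division
    overhead_bytes = segments + pages * PAGE_HEADER_BYTES
    return overhead_bytes / (packet_len * packets)


if __name__ == "__main__":
    for size in (50, 500, 10000):
        print(size, f"{100 * approx_overhead(size, 1000):.2f}%")
```

Under these assumptions, large packets settle a little above 1/255 of the payload, since every 255 bytes cost one lacing byte, while small packets pay proportionally more. That is the trade-off described in point 1.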

    The third big thing Diego (and the mplayer community in general) hate is the intentional, conscious decision to allow a codec to define how to parse granule positions for that codec’s stream. Granpos parsing thus requires a call into the codec.

    The practical consequence: When an Ogg file contains a stream for which a user doesn’t have the codec installed…. they can’t decode the stream! *gasp* Wait… how is that different from any other system?

    What’s different is that the demuxer also can’t parse the timestamps on those pages that wouldn’t be decodable anyway. Also, see above, parsing a timestamp requires a call to the installed codec. The mplayer mux layer can’t cope with this design, and they won’t change the player. We’re supposed to change our format instead.
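A small sketch may help show what "a call into the codec" means in practice. It is deliberately simplified: the Vorbis mapping treats the granule position as a sample count, and the Theora mapping splits it into a keyframe number and a frame offset using a shift value taken from the stream headers (version-specific off-by-one details are ignored). The registry, function names and parameters are hypothetical, not any real demuxer's API.

```python
from fractions import Fraction


def vorbis_time(granulepos, sample_rate):
    """Vorbis: the granule position counts PCM samples."""
    return Fraction(granulepos, sample_rate)


def theora_time(granulepos, shift, fps):
    """Theora (simplified): high bits hold the last keyframe's frame
    number, low `shift` bits hold the frames since that keyframe."""
    keyframe = granulepos >> shift
    offset = granulepos & ((1 << shift) - 1)
    return Fraction(keyframe + offset) / fps


# Hypothetical registry: the demuxer has to ask the codec how to read
# a granule position, because the container itself does not define it.
CODEC_TIME = {"vorbis": vorbis_time, "theora": theora_time}


def page_time(codec, granulepos, **params):
    """Convert a page's granule position to seconds via the codec mapping."""
    try:
        to_time = CODEC_TIME[codec]
    except KeyError:
        raise LookupError(f"no codec mapping for {codec!r}, cannot parse timestamp")
    return to_time(granulepos, **params)


if __name__ == "__main__":
    print(page_time("vorbis", 44100, sample_rate=44100))           # 1 second
    print(page_time("theora", (10 << 6) | 5, shift=6, fps=25))     # frame 15 at 25 fps
```

Without an entry in the registry the timestamp simply cannot be interpreted, which is the design decision being argued over here.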

    Fourth cited difference is that Ogg is transport only and stream metadata is in Skeleton (or some other layer sitting inside the low level Ogg transport stream) rather than part of a monolithic stream transport design. Practical difference? None really. Except that their mux/demux design can’t handle it, and they’re not interested in changing that either.

    I hope this clarifies the years of sustained anti-Ogg vitriol from the Mplayer and spin-off communities. Could Ogg be improved? Sure! Is that a reason to burn everything and start over? DonDiego seems to think so.

  15. @DonDiego

    Your assertion that FLV supports a variety of flavors is not quite true (depends on your definition of “variety” – having “two” could be considered “variety”).

    According to the spec (“http://www.adobe.com/devnet/flv/pdf/video_file_format_spec_v10.pdf”), FLV only supports the following Audio formats:
    PCM
    MP3
    Nellymoser
    G.711
    AAC
    Speex

    Likewise, only a few video formats are supported, namely:
    VP6
    H.263
    H.264

    Most importantly, it does not support free video and audio formats such as Theora and Vorbis.

    -c

  16. @Chris Smart: Technically, there’s nothing preventing FLV from supporting a much larger set of audio and video codecs. However, it’s generally only useful to encode codecs that the Adobe Flash Player natively supports since that’s the primary use case for FLV. Adding support for another codec is generally just a matter of deciding on a new unique ID for that codec.

    Deciding on a new unique ID for a codec is usually all that’s necessary for adding support for a new codec to a general-purpose container format. It’s why AVI is still such a catch-all format– just think of a new unique ID (32-bit FourCC) for your experimental codec.
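As an illustration of how little is involved, a FourCC is just four ASCII characters read as one 32-bit value. The sketch below assumes the little-endian byte order used by AVI/RIFF. The helper names are invented for this example, and "XVID" is used only as a familiar tag.

```python
def fourcc(tag):
    """Pack a four-character ASCII tag into a 32-bit integer (little-endian)."""
    raw = tag.encode("ascii")
    if len(raw) != 4:
        raise ValueError("a FourCC must be exactly four ASCII characters")
    return int.from_bytes(raw, "little")


def fourcc_to_str(value):
    """Unpack a 32-bit FourCC value back into its four-character tag."""
    return value.to_bytes(4, "little").decode("ascii")


if __name__ == "__main__":
    code = fourcc("XVID")
    print(hex(code), fourcc_to_str(code))
```

A demuxer for such a container only needs to look the value up in a table of known codecs; it never has to understand the codec's bitstream.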

    The beef we have with Ogg is — as Monty eloquently describes in his comment — that Ogg increases the coupling between container and codec layers. This adds complexity that most multimedia systems don’t have to deal with.

  17. @Monty

    Very interesting read!!

    It is scary how a container that is supposed to free us from the proprietary containers, can be so bad.

    I found this blog from a x264 developer
    http://x264dev.multimedia.cx/?p=292

    which had this to say about ogg:

    [quote]
    MKV is not the best designed container format out there (it

  18. > It is scary how a container that is supposed to free us from the
    > proprietary containers, can be so bad.

    It isn’t. It is very different from what one set of especially pretentious wonks expects, and they’ve been wanking about it for coming up on a decade. None of this makes an ounce of difference to users, and somehow other software groups don’t seem to have any trouble with Ogg. For such a fatally flawed system, it seems to work pretty well in practice 😛

    Suggestions like ‘They should have just used MKV’ don’t make sense. Ogg predates MKV by many years, and interleave is a fairly recent feature in MKV.

    The format designed by the mplayer folks is named Nut. Despite many differences in the details, the system it resembles most closely… is Ogg. Subjective evaluation of course, but I always considered the resemblance uncanny.

    Last of all, suppose just out of old fashioned spite and frustration, Xiph says ‘No more Ogg for the container! We use Nut now!’ That… pretty much ends FOSS’s practical chances of having any relevance in web video or really any net multimedia for the foreseeable future. …all to get that .3% and a design change under the blankets that no user could ever possibly care about. Sign me up!

  19. @DonDiego To be fair, in your file size example, you should provide the correct sums:

    == quote
    17307153 bbb_theora_486kbit.ogv
    15009926 bbb_theora_486kbit.theora
    2107404 bbb_theora_486kbit.vorbis

  20. @silvia: I *am* providing the correct sums! You are misreading my table. Let me reformat the table slightly and pad with zeroes for readability:

    17307153 bbb_theora_486kbit.ogv (the complete file)
    – 15009926 bbb_theora_486kbit.theora (the video track)
    – 02107404 bbb_theora_486kbit.vorbis (the audio track)
    ========
    00189823 (container overhead)

    17753616 bbb_youtube_h264_499kbit.mp4 (the complete file)
    – 13898515 bbb_youtube_h264_499kbit.h264 (the video track)
    – 03796188 bbb_youtube_h264_499kbit.aac (the audio track)
    ========
    00058913 (container overhead)

    So in this application, Ogg has more than 300% the overhead of MP4. Ogg is known to produce large overhead, but I did not expect this order of magnitude. Now I believe Monty that it’s possible to reduce this, but the purpose of Greg’s comparison was to test this particular configuration without additional tweaks. Otherwise the H.264 and AAC encoding settings could be tweaked further as well…
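For anyone who wants to check the subtraction, it can be reproduced in a few lines. The byte counts are the ones from the table above; the dictionary layout and function names are just for illustration.

```python
# Byte counts copied from the comparison table above.
SIZES = {
    "ogg": {"file": 17307153, "video": 15009926, "audio": 2107404},
    "mp4": {"file": 17753616, "video": 13898515, "audio": 3796188},
}


def container_overhead(entry):
    """Bytes left over once the raw video and audio tracks are subtracted."""
    return entry["file"] - entry["video"] - entry["audio"]


def overhead_percent(entry):
    """Container overhead as a percentage of the complete file."""
    return 100.0 * container_overhead(entry) / entry["file"]


if __name__ == "__main__":
    for name, entry in SIZES.items():
        print(name, container_overhead(entry), f"{overhead_percent(entry):.2f}%")
    ratio = container_overhead(SIZES["ogg"]) / container_overhead(SIZES["mp4"])
    print(f"Ogg overhead is {ratio:.2f}x the MP4 overhead")
```

The ratio works out to a bit over 3.2, which is where the "more than 300%" comes from, while as a share of the whole file the two overheads are roughly 1.1% and 0.3%.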

    I wonder what you tested when you say that in your experience Ogg files come out smaller than MPEG files. The term “MPEG files” is about as broad as it gets in the multimedia world. Yes, the MPEG-TS container has very high overhead, but it is designed for streaming over lossy satellite links. This special purpose warrants the overhead tradeoff.

  21. Silvia: DonDiego was illustrating a broken-out subtraction. His numbers are correct, as is his claim; Ogg is introducing more overhead (1%). That’s almost certainly reducible, but I’ve not looked at the page structure in Vorbose to be sure of that claim. .5%-.7% is the intended working range. It climbs if the muxer is splitting too many packets or the packets are just too small (not the case here).

    >So in this application, Ogg has more than 300% the overhead of MP4.
    >Ogg is known to produce large overhead, but I did not expect this
    >order of magnitude.

    Yes, Ogg is using more overhead. Let’s assume that a better muxer gets me .7% overhead (yeah, even our own muxer is overly straightforward and doesn’t try anything fancy; it hasn’t been updated since 1998 or so. “Have to extend the container for every new codec” jeesh…)

    So this is really a screaming fight over the difference between .7% and .3%?

    I don’t debate for a second that Nut’s packet length encoding is better, and that’s the lion’s share of the difference assuming the file is muxed properly. And if/when (long term view, ‘when’ is almost certainly correct) Ogg needs to be refreshed in some way that has to break spec anyway, the Nut packet encoding will be one of the first things added because at that point it’s a ‘why not?’. But until then there’s no sensible way to defend the havoc a container change would wreak and all for reducing a .7% bitstream overhead down to .3%. It would be optimising something completely insignificant at great practical cost.

  22. @Monty: You are giving me far too much credit! “for the love of all that is holy and some that is not, don’t do that” is a quote from Mans in reply to somebody proposing to add ‘#define _GNU_SOURCE’ to FFmpeg. I have been looking for an opportunity to steal that phrase and take credit as being funny for a long time. SCNR ;-p

    Speaking of memorable quotes, I cannot help but point at the following classic from your pen after trying and failing to get patches into MPlayer:
    http://lists.mplayerhq.hu/pipermail/mplayer-dev-eng/2007-November/054865.html

    ======
    Fine. I give up.

    There are plenty of things about the design worth arguing about… but
    you guys are more worried about the color of the mudflaps on this
    dumptruck. You’re rejecting considered decisions out of hand with
    vague, irrelevant dogma. I’ve seen two legitimate bugs brought up so
    far in a mountain of “WAAAH! WE DON’T LIKE YOUR INDEEEEENT.”

    I have the only mplayer/mencoder on earth that can always dump WMV2
    and WMV3 without losing sync. I just needed it for me. I thought it
    would be nice to submit it for all your users who have been asking for
    this for years. But it just ceased being worth it.

    Patch retracted. You can all go fuck yourselves. Life is too short
    for this asshattery.

    Monty
    =====

    We remember you fondly. I and many others didn’t know what asshat meant before, but now it found a place in everybody’s active vocabulary. I’m not being ironic BTW, sometimes nothing warms the heart more than a good flame and few have generated more laughter than yours 🙂

    The ironic thing is that your fame brought you attention and the attention brought detailed reviews, which made patch acceptance harder.

    I also failed getting patches into Tremor. You rejected them for silly reasons, but, admittedly, I did not have the energy to flame it through…

    For the record: I have no vested interest in NUT. Some of the comments above could be read to suggest that Ogg would be a good base when starting from a clean slate. This is wrong, Ogg is the weakest part of the Xiph stack. You know that, but there are people all around the internet proclaiming otherwise. This does not help your case, on the contrary, so I try to inject facts into the discussion. Admittedly, sometimes I do it with a little bit of flair of my own 😉

    Cheers, Diego

  23. @DonDiego

    a) I was indeed bucking up against rampant asshattery.

    b) Not sure how any of that is even slightly relevant to this thread.

    You’re bringing it up in some sort of attempt to shame or embarrass because you’ve lost on facts? For the record, I meant it when I said it then, and I don’t feel any differently now. And asshat is indeed a fabulous word.

    [FTR, you’ve had two patches rejected and several more accepted if the twelve hits from Xiph.Org Trac are a complete set.]

    Monty

  24. This blog is not for personal attacks, but only for discussing technical issues. Unfortunately, the discussion on these comments is developing in a way that I cannot support any longer. I have therefore decided to close comments.

    Thank you everyone for your contributions.

Comments are closed.