The LCEVC Coding Standard and 8K Television

Introduction

First, for background, here is an overview of the most common video compression standard in use today, H264: https://www.maketecheasier.com/how-video-compression-works/

H265 improved on this but, for various reasons, was a bit of a flop. It is in use and will continue to grow, but royalty issues are a problem. To try and address these, three new video standards have been approved by MPEG: https://ottverse.com/vvc-evc-lcevc-mpeg-video-codecs/

A consortium of companies has also endorsed a royalty-free codec called AV1. Here though, I will be talking mostly about the EVC baseline used at a resolution of 512p. Differences in codec efficiency exist at that resolution, which is around SD resolution, but they are not big: only about 20% between worst and best (not including H264). These days, SD is not used much; people are moving to HD (720p) or Full HD (1080p). The trend continues, with 4K now becoming common on streaming services like Netflix, and the push is on for 8K. At these higher resolutions, codecs differ greatly in both efficiency and complexity. The bitrate problem for 4K over the internet has basically been solved, but 8K, up until now, has been a problem. It is hoped the new codecs will help to tame it to something like 20 Mbps. 4K needs about 15 Mbps, although Netflix recommends 25 Mbps to be on the safe side. However, with new technology like shot-based encoding, Netflix has made it even more efficient: https://hackaday.com/2020/09/16/dec…laining-optimized-shot-based-encoding-for-4k/

LCEVC and baseline EVC

Here I will be talking about LCEVC and baseline EVC. Baseline EVC at 512p is close to the efficiency of the most efficient codec, VVC, but has very low complexity and no royalties. The choice is mostly arbitrary; any codec will do without making much difference to efficiency.

The insight behind LCEVC is that traditional video codecs do not compress the high-frequency part of video well. To fix this, a company called V-Nova came up with LCEVC. You can find a lot of information at the following link: https://www.lcevc.org/

For an overview, see: https://www.lcevc.org/how-lcevc-works/

The video is downscaled by 4 (two successive downscales by 2) and encoded using a standard codec (EVC baseline in this writeup, but any codec will do). That is transmitted as the base layer. Transmitted alongside it are the differences needed to recover the original. To restore the original, the decoder upscales the base by 2, then adds in the difference between the original downscaled by 2 and that upscaled version. The corrected picture is upscaled by 2 again, and a second set of differences is added to get the original. The scheme is flexible: you can skip the second downscale if you wish, or add extra downscaling steps (although I do not believe V-Nova has done a three-downscale version yet; IMHO, it would be advantageous with 8K, as explained later). V-Nova developed some ‘tricky’ techniques to make it more efficient, such as M-Prediction and using the previous frame to guess the difference. You can get the detail from the patent: https://patents.google.com/patent/WO2020188273A1/en

I have read it. It takes a while, but it is the only way to find out exactly what is going on.
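
To make the layering concrete, here is a minimal Python sketch of the two-downscale scheme described above. It is only an illustration under assumptions of my own: `downscale2`, `upscale2`, `encode` and `decode` are hypothetical helper names, 2x2 averaging and pixel repetition stand in for the real scaling filters, and simple rounding stands in for the lossy base codec. It leaves out the transform, quantisation and entropy-coding steps a real encoder applies, so it shows the structure of the idea, not the standard itself.

```python
import numpy as np

def downscale2(img):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upscale2(img):
    """Double each dimension by repeating every pixel into a 2x2 block."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def encode(frame):
    """Return the base picture plus one set of differences per upscale."""
    half = downscale2(frame)       # original downscaled by 2
    quarter = downscale2(half)     # original downscaled by 4
    base = np.round(quarter)       # assumption: rounding mimics a lossy base codec
    residual1 = half - upscale2(base)
    corrected_half = upscale2(base) + residual1
    residual2 = frame - upscale2(corrected_half)
    return base, residual1, residual2

def decode(base, residual1, residual2):
    """Upscale, correct, upscale again, correct again."""
    half = upscale2(base) + residual1
    return upscale2(half) + residual2

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(8, 8)).astype(np.float64)
base, residual1, residual2 = encode(frame)
restored = decode(base, residual1, residual2)
print(base.shape, np.allclose(restored, frame))
```

The reconstruction is exact here only because the differences are sent unquantised; in practice they would be quantised and compressed, trading a little accuracy for bitrate.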

Since the encoding is done at a lower resolution, it is computationally quick. Slower presets that give better compression can be used while still encoding quicker overall. Because the ‘upscaling’ is done using 2x2 or 4x4 blocks, the upscale can use many concurrent threads. Generally speaking, it adds little to the decoding time (with one exception I will mention later).

Even using just one downscale (I would expect better results with 2 or 3 downscales), performance has been impressive: https://www.lcevc.org/wp-content/up…rmance-of-LCEVC-Meeting-MPEG-134-May-2021.pdf

My Opinions

With 8K content, one should use three downscales. The encoder can then run the base codec at 512p. At that resolution, the difference in performance between codecs is minimal compared to the enhancement data, especially with the newer codecs. For example, royalty-free baseline EVC is only about 15% worse than the most efficient but complex VVC, with all its royalty woes. At the cost of increased processing, one can use task-aware upscaling (TAU) and downscaling, giving smaller corrections after upscaling and greater efficiency: https://cv.snu.ac.kr/research/taid/
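
As a rough check on the numbers, the short sketch below works out the layer sizes for three downscales. It assumes 8K means 7680x4320 (my assumption, the usual consumer figure) and that each downscale halves both dimensions; `pyramid` is a hypothetical helper name.

```python
def pyramid(width, height, downscales):
    """Return the (width, height) of each layer, full resolution first."""
    layers = [(width, height)]
    for _ in range(downscales):
        width, height = width // 2, height // 2
        layers.append((width, height))
    return layers

layers = pyramid(7680, 4320, 3)
for w, h in layers:
    print(f"{w}x{h}: {w * h / 1e6:.2f} megapixels")

base_w, base_h = layers[-1]
ratio = (7680 * 4320) // (base_w * base_h)
print(f"base layer carries 1/{ratio} of the full-resolution pixels")
```

Under those assumptions the base layer comes out at 960x540, close to the 512p quoted above, with 1/64 of the pixels of the full 8K frame. That is why the choice of base codec matters so little and why the base encode is quick.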

Experiments done by Samsung with its Scalenet technology (a version of Task Aware Upscaling) show that when it is used to go from 4K to 8K, it is very, very difficult to tell the result from the original (I have heard it has a VMAF of nearly 100 and a PSNR of a bit below 40 dB). Considering it is hard to detect the difference between 8K and 8K downscaled to 4K, the third layer will not be doing much work; it will all be in the first and second layers. Since those will also use Task Aware Upscaling, they will be more efficient too. The downside is the amount of processing power required, but processing power is still getting cheaper, so it will likely happen. Still, even without the assistance of super-resolution, reductions in bitrate of 40% to 50% are entirely possible. See: https://8kassociation.com/lcevc-licensing-offers-different-model-to-kickstart-8k-market/

A VMAF of 93 and above is generally considered to be, for all practical purposes, indistinguishable from the original, and using H265 with LCEVC, this was achieved at about 20 Mbps. 8K transmission is entirely possible now at bitrates nearly everyone has available. With per-scene encoding, Netflix has made streaming even more efficient: https://hackaday.com/2020/09/16/dec…laining-optimized-shot-based-encoding-for-4k/

Expect even 20 Mbps to be reduced significantly. Companies are working on integrating LCEVC with Content Adaptive Encoding (CAE), which lowers the bitrate even further: https://blog.beamr.com/2019/09/11/cabr-content-adaptive-rate-control/

Harmonic managed to get 8K at about 25 Mbps using just CAE, with no LCEVC. It is working on combining CAE with LCEVC at the moment. Combining all these ‘tricks’ means that in the future, distributing 8K content will not be an issue at all. Even now, it is not really an issue; we just need the content to make it worthwhile and some new infrastructure.

4 replies
  1. bhobba says:
    There has been an interesting new idea in AI upscaling that would dovetail nicely with LCEVC:
    https://arxiv.org/pdf/2012.00650.pdf

    I think it could eventually become part of the standard, leading to even greater efficiency gains and bitrate reductions. It is early days yet, and it will likely be even better with further development, e.g. the idea of adaptive base resolution would be easy to implement in LCEVC.

    Thanks
    Bill

  2. bhobba says:
    WTF! Isn’t that exactly what Compuserve did with the GIF format? The whole point of patent pools is to stop corporations from doing that by having the cumulative rights worked out in advance. Lawyers make communism look good every day.

    I am sure it is a common tactic. The point is new codecs want to avoid it – at least try anyway.

    Thanks
    Bill

  3. bhobba says:
    Yes, but that is what people were doing in the 1990s. The text is written as if this was a new idea for H266.

    Many compression ideas have been around for a long time, many for longer than the 20 years for which royalties apply. The simple compression method I gave is just one of them. That is the idea behind the EVC baseline I mentioned. It uses nothing but ideas whose patents/royalties have expired. The interesting thing about EVC (baseline) is that it encodes very fast and has compression efficiency nearly as good as HEVC, which is loaded with royalty problems. In fact, perceptually, it could be as good as HEVC.

    HEVC brought about the paradigm shift to considering patent/royalty issues. It used a lot of then-new ideas in the standard, which was ratified in 2013. At first, everything was fine, but the patent holders of those ideas were ‘sneaky’. They waited until HEVC was widely adopted, then, wham, hit users with royalties for their patents. I have heard in discussions (but do not hold me to it) that big companies like Samsung have teams of lawyers working on that alone. Nobody (except maybe the patent holders) wants a repeat of that debacle. That is not to say HEVC will not be in use for some time. At the moment, H264 is the dominant codec, but it is expected that over the next few years HEVC will become dominant; then it too will fade, being replaced by newer codecs. But the lesson has been learned: royalties must be simplified, and a lot of effort is going into doing just that. VVC is the natural successor to HEVC and builds on it. In doing so, it is loaded with the same royalty issues as HEVC. The codec developers are aware of this and will do all they can to ensure it is manageable, but they may not succeed. EVC handles it differently by having a baseline and a main profile. The main profile uses techniques that can be switched on or off. If the same royalty tactics that were used with HEVC are tried, the affected technique can be switched off.

    Then we have AV1:
    https://research.mozilla.org/av1-media-codecs/

    Everything in it is royalty-free. It compresses better than the EVC baseline. Its problem is compression time – it is horrid. Work is being done to reduce it, with some success. Many think it will become the dominant codec.

    Now let’s return to LCEVC. The straightforward compression method I detailed is likely royalty-free. But the tricks developed by V-Nova to make it better are not. V-Nova is very aware of the royalty issue and has released the royalties they will charge, which the industry, in general, seems to be happy with:
    https://www.streamingmedia.com/Articles/ReadArticle.aspx?ArticleID=147003

    Although I mentioned baseline EVC in my insights article, many think LCEVC is a natural fit for AV1. When operating at 1/4 resolution, differences in codec compression efficiency are not as big an issue. They exist, of course, but the information in the enhancement layer, which is the same regardless of codec, compensates to some extent. The big benefit of using LCEVC with the newer codecs is that encoding time is drastically reduced, as the base codec operates at a lower resolution. So by using AV1 at 1/4 resolution, AV1 encoding time becomes manageable. Also, AV1 is a complex codec, where a lot can go wrong and frame dropouts occur. Experiments using it with and without LCEVC have been interesting:
    https://www.lcevc.org/wp-content/uploads/AV1-with-LCEVC-pub-1.pdf

    With a royalty structure the industry is happy with, the increase in compression efficiency, reduction in encoding time and dropout reduction in complex codecs, LCEVC looks to have a bright future.

    Thanks
    Bill

  4. bhobba says:
    Some of the information above reads rather strangely. The stuff about downscaling the video and sending the high frequencies separately is a tautology: We compressed the file by separating the high frequencies and compressing them. But how?

    Here is a simple explanation. Indeed, downscaling means getting rid of the high-frequency information. One way of doing it is to average each block of 4 pixels (2x2) and send that average, which downscales the picture by 2 in each direction. To upscale it back, duplicate each pixel 4 times. But you have lost the ‘high frequency’ or ‘detail’ information. To restore it, you also transmit the difference between each original pixel and the average of its block. This is a better way of doing it, since normal compression methods are not efficient at compressing the high-frequency or ‘detail’ information. An interesting thing about video is that the ‘detail’ (i.e. high-frequency) information is sparse, i.e. mostly zeros. This allows very efficient run-length encoding to be used:
    https://en.wikipedia.org/wiki/Run-length_encoding
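
    The simple method just described can be sketched in a few lines of Python. This is an illustration only, under my own assumptions: `downscale2`, `upscale2` and `run_length_encode` are hypothetical helper names, and the tiny 4x4 image is made up so the numbers are easy to follow.

```python
import numpy as np

def downscale2(img):
    """Average each non-overlapping 2x2 block into one pixel."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upscale2(img):
    """Duplicate each pixel 4 times (a 2x2 block)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def run_length_encode(values):
    """Collapse a 1-D sequence into [value, run_length] pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

# A flat 4x4 image with a single bright pixel of 'detail'.
image = np.full((4, 4), 10.0)
image[1, 2] = 18.0

small = downscale2(image)              # what the base codec would carry
detail = image - upscale2(small)       # the 'high-frequency' difference
restored = upscale2(small) + detail    # decoder side

runs = run_length_encode(detail.ravel().tolist())
print(runs)
print(np.array_equal(restored, image))
```

    Only the 2x2 block containing the bright pixel produces non-zero differences, so the 16 values collapse into 6 runs, the last of which is a run of 8 zeros. Real video frames are far larger, and the zero runs correspondingly longer.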

    The above would work, but various tricks can be used to make it even better. One trick is to use bicubic (or other) upscaling and downscaling filters:
    https://en.wikipedia.org/wiki/Image_scaling

    The difference information would be smaller, so it is more efficient.

    A very sophisticated method would be to use AI (in this context, a Convolutional Neural Network). That would be something like the task-aware downscaling and upscaling (TAD-TAU) I mentioned before:
    https://cv.snu.ac.kr/research/taid/

    Still other tricks can be used, for example guessing the difference from nearby pixels and from the previous frame.
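
    As a hedged illustration of the previous-frame trick, the sketch below uses the previous frame's differences as the guess and sends only what the guess gets wrong. It is not the scheme in the patent; `temporal_delta` and `apply_temporal_delta` are hypothetical names and the arrays are made-up data.

```python
import numpy as np

def temporal_delta(residual, previous_residual):
    """What is left to send once the previous frame's residual is the guess."""
    return residual - previous_residual

def apply_temporal_delta(delta, previous_residual):
    """Decoder side: add the correction back onto the guess."""
    return previous_residual + delta

# Made-up detail layers for two consecutive frames of a nearly static scene.
previous = np.array([[0, 0, 3, 0],
                     [0, -2, 0, 0]])
current = np.array([[0, 0, 3, 0],
                    [0, -2, 0, 1]])

delta = temporal_delta(current, previous)
rebuilt = apply_temporal_delta(delta, previous)

print(np.count_nonzero(current), np.count_nonzero(delta))
print(np.array_equal(rebuilt, current))
```

    When little changes between frames, the correction is even sparser than the differences themselves, which suits run-length encoding well. When the scene changes a lot, the guess can make things worse, so a practical encoder would presumably choose per block whether to use it.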

    All these can be used in LCEVC; the detail can be found in the patent I linked to. The modern measure of how good a video is perceptually is called VMAF, and anything above 93 is considered imperceptibly different from the original. As I posted, the bottom line is that using the HEVC codec with LCEVC, one can transmit a virtually identical 8K picture (93 VMAF) at 20 Mbps. These are the early stages of LCEVC use, and things will get better. But as of now, 8K is possible at speeds a lot of people have.

    HEVC is a ‘failure’ because of royalty problems, not technical issues:
    https://www.streamingmedia.com/Articles/ReadArticle.aspx?ArticleID=122983

    Thanks
    Bill
