Artificial Intelligence in Video

  • Thread starter: bhobba
  • Tags: Neural
SUMMARY

This discussion focuses on the application of artificial intelligence in video processing, particularly through the use of Neural Networks, Convolutional Neural Networks (CNNs), and Generative Adversarial Networks (GANs) for image super-resolution. Key advancements include the TAD-TAU method for down-scaling images before super-resolution and the introduction of VMAF, a metric developed by NETFLIX to assess image quality. The discussion also highlights ISIZE, a technology acquired by SONY, which enhances encoding efficiency for high-resolution images while maintaining high VMAF scores. These concepts lay the groundwork for developing an all-AI video codec.

PREREQUISITES
  • Understanding of Neural Networks and their applications in image processing
  • Familiarity with Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs)
  • Knowledge of video encoding techniques and metrics such as VMAF
  • Awareness of image super-resolution methods and their significance in video quality
NEXT STEPS
  • Research the implementation of Convolutional Neural Networks (CNNs) in image processing
  • Explore the principles of Generative Adversarial Networks (GANs) for image super-resolution
  • Learn about VMAF and its advantages over traditional metrics like SSIM
  • Investigate the ISIZE technology and its impact on video encoding efficiency
USEFUL FOR

This discussion is beneficial for AI researchers, video processing engineers, and developers interested in enhancing video quality through advanced encoding techniques and artificial intelligence methodologies.

Behind the scenes, artificial intelligence usually makes use of what is known as a Neural Network:
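As a rough illustration of the idea, here is a minimal sketch of a forward pass through a tiny fully connected network. The layer sizes, the sigmoid activation, and the hand-picked weights are all assumptions chosen for illustration; a real network learns its weights from data.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per neuron, then sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Feed the inputs through each (weights, biases) layer in turn."""
    activation = inputs
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    return activation

# Hypothetical weights: 2 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.5], [0.25, 0.75]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
output = forward([1.0, 2.0], layers)
print(output)
```

Training consists of adjusting the weights and biases so that the output moves closer to a desired target; that step is not shown here.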



In image applications, an implementation called a Convolutional Neural Network is often used:
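The core operation of a CNN can be sketched in a few lines: a small kernel slides over the image and produces a weighted sum at each position. This is a minimal sketch with no padding and a stride of 1; the image and the kernel values are made up for illustration, whereas a real CNN learns its kernels during training.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and
    return the weighted sum at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

# Hypothetical 3x4 image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hypothetical kernel that responds where brightness rises left to right.
edge_kernel = [[-1, 1]]

feature_map = conv2d(image, edge_kernel)
print(feature_map)  # [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```

The non-zero column in the result marks the vertical edge. A CNN stacks many such kernels in layers, so later layers respond to progressively more complex patterns.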



In particular, for image super-resolution a Generative Adversarial Network or GAN is often used:
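The adversarial part can be sketched as two competing losses. The discriminator outputs the probability that an image is real; it is rewarded for telling real from generated images, while the generator is rewarded for fooling it. The function names below are hypothetical, and the generator loss is the common non-saturating form. Practical super-resolution GANs typically add a content term that compares the output with the original image, which is not shown.

```python
import math

def bce(prediction, target):
    """Binary cross-entropy for one probability and a 0/1 target."""
    eps = 1e-12  # keep log() away from zero
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))

def discriminator_loss(d_real, d_fake):
    """Low when real images score near 1 and generated images near 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)

def generator_loss(d_fake):
    """Low when the discriminator scores generated images near 1."""
    return bce(d_fake, 1.0)

# Hypothetical discriminator outputs.
print(discriminator_loss(0.9, 0.1))  # confident and correct: small loss
print(discriminator_loss(0.5, 0.5))  # cannot tell the two apart: larger loss
print(generator_loss(0.1))           # generator is being caught: large loss
```

Training alternates between lowering the two losses, so each network improves against the other.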



These form the basis of modern super-resolution:



For those who are interested in the details, see:
https://arxiv.org/abs/2204.13620

But things move on. Someone thought of using a CNN to down-scale the image first, then using super-resolution to recover the original image. One example is TAD-TAU:
https://openaccess.thecvf.com/conte...k-Aware_Image_Downscaling_ECCV_2018_paper.pdf

This is an example of an important AI concept - the Autoencoder:
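The shape of an autoencoder can be sketched with a fixed encoder and decoder: the encoder shrinks the image to a smaller code, and the decoder tries to rebuild the original from it. In this sketch both halves are simple hand-written rules (2x2 averaging and pixel repetition), which is an assumption made purely for illustration. In a method such as TAD-TAU both halves are neural networks trained together so that the reconstruction error is as small as possible.

```python
def encode(image):
    """Down-scale by 2 in each direction by averaging 2x2 blocks.
    Assumes the image height and width are both even."""
    return [[(image[r][c] + image[r][c + 1] +
              image[r + 1][c] + image[r + 1][c + 1]) / 4.0
             for c in range(0, len(image[0]), 2)]
            for r in range(0, len(image), 2)]

def decode(code):
    """Up-scale by 2 in each direction by repeating each value in a 2x2 block."""
    out = []
    for row in code:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def mse(a, b):
    """Mean squared error between two images of the same size."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2
               for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

# Hypothetical 4x4 greyscale image; only the top-right block has detail.
image = [
    [10, 10, 50, 60],
    [10, 10, 70, 80],
    [20, 20, 30, 30],
    [20, 20, 30, 30],
]
code = encode(image)       # 2x2 code, a quarter of the data
restored = decode(code)    # back to 4x4
error = mse(image, restored)
print(code)   # [[10.0, 65.0], [20.0, 30.0]]
print(error)  # 31.25
```

The flat regions survive the round trip exactly, while the detailed block does not. A learned encoder and decoder aim to reduce that loss by choosing a code the decoder can invert more faithfully.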



Again, things move on. The approach has since been improved: it is simpler, performs better, and allows down-scaling and super-resolution by arbitrary amounts, as well as encoding the colour in a resultant black and white image:
https://arxiv.org/pdf/2201.12576

So far, super-resolution has been described as working from a single lower-resolution image, but it can also be done using a sequence of images from a video:
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4088133

To quantify how close a super-resolution image is to the original as perceived by a human being, SSIM was invented. However, further work has been done on this, and a newer measure called VMAF, developed and used a lot by NETFLIX, has largely replaced it:
https://visionular.ai/vmaf-ssim-psnr-quality-metrics/
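To give a feel for what such metrics compute, here is a minimal sketch of PSNR and a simplified SSIM on flattened lists of 8-bit pixel values. The SSIM here uses a single window over the whole image, which is a simplification; the usual SSIM is computed over small local windows and averaged. VMAF is not sketched because it combines several features with a trained model. The pixel values are made up for illustration.

```python
import math

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to ref."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def global_ssim(ref, test, peak=255.0):
    """SSIM computed once over the whole image (simplified, single window)."""
    n = len(ref)
    mu_x = sum(ref) / n
    mu_y = sum(test) / n
    var_x = sum((a - mu_x) ** 2 for a in ref) / n
    var_y = sum((b - mu_y) ** 2 for b in test) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(ref, test)) / n
    c1 = (0.01 * peak) ** 2  # standard stabilising constants
    c2 = (0.03 * peak) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

# Hypothetical pixel data.
reference = [52, 55, 61, 59, 79, 61, 76, 61]
slightly_off = [p + 2 for p in reference]   # small brightness shift
badly_off = [255 - p for p in reference]    # inverted image

print(psnr(reference, slightly_off), global_ssim(reference, slightly_off))
print(psnr(reference, badly_off), global_ssim(reference, badly_off))
```

PSNR only measures pixel-wise error, whereas SSIM compares means, variances, and covariance, which is why it tracks perceived structure somewhat better.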

Image super-resolution is one of many proposals for reducing the bit rate of high-resolution images. ISIZE (recently acquired by SONY) preprocesses an image to make it more efficient to encode, while the result still achieves a high VMAF score:
https://discovery.ucl.ac.uk/10152967/1/SMPTE_v9_RPS.pdf

It produces substantial reductions in the bit rate of 8K video:
https://8kassociation.com/industry-info/8k-news/pre-encoding-8k-with-isize-bitsave/

A lot of ideas and concepts have been introduced in this post. If the reader has not seen them before, like anything new, it may take a while to get up to speed. However, they form the basis of my proposed method of an all-AI video codec.

My next post will be an overview of current video codecs, including EVC baseline, which forms the basis of the AI codec.

Thanks
Bill
 
very interesting
 
