
You may have seen crime movies in which detectives hunt for a culprit in video footage pulled from a nearby bakery. The footage is low-resolution and the person can't be identified, so the lead detective instructs the techie to zoom in and enhance the image (sometimes multiple times). While this is pure wishful thinking in the movies, Google researchers are working on making it a reality.

The Google Brain team has today published research detailing a similar process for obtaining high-resolution outputs from very low-resolution inputs. It has developed new AI-powered software that uses a pair of neural networks to sharpen an 8 x 8 pixel image and generate realistic samples resembling the original. You'd be amazed to see how close the result comes to the true image, with a remarkable amount of detail produced by the AI model.

The software cleverly uses two neural networks to produce recognizable high-resolution outputs from the tiny 8 x 8 input. The first is called the 'conditioning network'; it maps the 8 x 8 low-res image against similar high-res pictures and picks a match that serves as a rough sketch for the final image.

The second is called the 'prior network', and it employs the PixelCNN architecture to add realistic details typical of natural images. It analyses the low-res image and tries to fill in plausible detail for each pixel, drawing on what it has learned from images with similar pixel patterns. The outputs of the two networks are then combined to produce the final image, which is far more recognizable than the input.
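
To make the combination step more concrete, here is a minimal, illustrative sketch rather than Google's actual code. It assumes, as the paper broadly describes, that each network outputs per-pixel logits over the 256 possible intensity values; the sketch sums the two sets of logits and samples each pixel from the combined softmax. All names, shapes, and values below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 32x32 output image, 256 possible intensity values per pixel.
H, W, K = 32, 32, 256

# Stand-ins for the two networks' outputs. In the real model these would come from
# a CNN conditioned on the 8x8 input and from a PixelCNN prior, respectively.
conditioning_logits = rng.normal(size=(H, W, K))  # global structure from the low-res input
prior_logits = rng.normal(size=(H, W, K))         # realistic texture/detail

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Fuse the two networks by summing their logits, then sample each pixel's
# intensity from the resulting per-pixel categorical distribution.
combined = softmax(conditioning_logits + prior_logits)
flat = combined.reshape(-1, K)
samples = np.array([rng.choice(K, p=p) for p in flat])
output_image = samples.reshape(H, W).astype(np.uint8)

print(output_image.shape)  # (32, 32) sampled high-res image
```

In the real model the prior network samples pixels sequentially, each one conditioned on the pixels sampled before it; the sketch above samples them independently purely to show how the two logit streams are fused.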

Describing the sharpening process, the research paper reads:

A low resolution image may correspond to multiple plausible high resolution images, thus modeling the super resolution process with a pixel independent conditional model often results in averaging different details — hence blurry edges.

By contrast, our model is able to represent a multimodal conditional distribution by properly modeling the statistical dependencies among the high resolution image pixels, conditioned on a low resolution input.
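
The blurriness argument in the quote is easy to demonstrate with a toy example of my own (not from the paper): when two sharp high-resolution completions are equally plausible, a pixel-independent model that averages over them produces a soft, smeared edge instead of a crisp one.

```python
import numpy as np

# Two equally plausible sharp rows of pixels: the edge could fall one pixel
# to the left or one pixel to the right of the same low-res observation.
candidate_a = np.array([0, 0, 0, 255, 255, 255, 255, 255], dtype=float)
candidate_b = np.array([0, 0, 0, 0, 255, 255, 255, 255], dtype=float)

# A pixel-independent model trained with an averaging loss converges toward
# the per-pixel mean of the plausible outputs, smearing the edge.
averaged = (candidate_a + candidate_b) / 2
print(averaged)  # [0. 0. 0. 127.5 255. 255. 255. 255.] -- a blurred transition
```

A model that captures the dependencies between pixels can instead commit to one plausible sharp edge or the other, which is what the sampling approach above aims for.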

The AI software isn't fully optimized yet, so it may get some details wrong in the output image. But you'll still be able to tell that the final image shows a person's face or a bedroom, the two use cases the Google Brain researchers worked with.
