HyperStyle, an adaptation of StyleGAN for image editing

A team of Tel Aviv University researchers recently unveiled HyperStyle, an inversion method for NVIDIA's StyleGAN2 machine learning system, redesigned to recreate missing details when editing real-world images.

StyleGAN is known for its ability to synthesize realistic faces of non-existent people, controlling parameters such as age, gender, hair length, smile, nose shape, skin color, glasses, and camera angle.

HyperStyle, on the other hand, makes it possible to change the same parameters in existing photographs; in other words, it allows you to edit a photograph while preserving its characteristic features and keeping the original face recognizable.

HyperStyle introduces hypernetworks that learn to refine the weights of a pretrained StyleGAN generator with respect to a given input image. Doing so enables reconstructions of optimization-level quality with encoder-like inference times and high editability.
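The core idea can be sketched in a few lines: a hypernetwork looks at features of the input image and predicts per-channel offsets that modulate the weights of one generator layer. The following is a minimal numpy toy, not the authors' implementation; the linear `hypernetwork` map and the layer shapes are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for one pretrained StyleGAN generator layer's weights.
base_weights = rng.standard_normal((8, 8))

def hypernetwork(image_features):
    """Hypothetical hypernetwork: maps features of the input image to
    bounded per-output-channel weight offsets for one generator layer."""
    projection = rng.standard_normal((image_features.size, 8))
    return np.tanh(image_features @ projection)  # offsets in (-1, 1)

# HyperStyle-style refinement: scale each weight by (1 + predicted offset),
# so a zero offset leaves the pretrained generator untouched.
features = rng.standard_normal(16)
deltas = hypernetwork(features)
refined_weights = base_weights * (1.0 + deltas[:, None])
```

Because the offsets are predicted in a single forward pass rather than found by per-image optimization, refinement stays fast at inference time.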

For example, HyperStyle can simulate a change in a person's age, change a hairstyle, add glasses, a beard, or a mustache, turn a photo into a cartoon character or a hand-drawn picture, or make the facial expression sad or happy.
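Edits like these are typically performed by moving the inverted latent code along a semantic direction in latent space. Here is a minimal numpy sketch of that idea under stated assumptions: the 512-dimensional code, the "age" direction, and the `edit` helper are all hypothetical illustrations, not part of the HyperStyle API.

```python
import numpy as np

rng = np.random.default_rng(1)
latent = rng.standard_normal(512)          # latent code inverted from a real photo
age_direction = rng.standard_normal(512)   # hypothetical learned "age" direction
age_direction /= np.linalg.norm(age_direction)

def edit(w, direction, strength):
    # Shifting the code along a semantic direction changes one attribute;
    # the refined generator is what keeps the person's identity intact.
    return w + strength * direction

older = edit(latent, age_direction, 3.0)   # positive strength -> "older"
```

The same mechanism covers hairstyles, glasses, or expressions, each with its own direction vector.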

The system can be trained not only to change people's faces, but for any object class; for example, to edit images of cars.

Most work on inversion looks for a latent code that most accurately reconstructs a given image. Some recent work instead proposes per-image fine-tuning of the generator weights to achieve a high-quality reconstruction of a given target image. With HyperStyle, we aim to bring these generator-tuning approaches into the realm of interactive applications by adapting them to an encoder-based approach.
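The optimization-based inversion being contrasted here amounts to gradient descent on a reconstruction loss with respect to the latent code. The following numpy toy replaces the generator with a fixed linear map to stay self-contained; it is a sketch of the optimization loop, not the actual StyleGAN inversion code.

```python
import numpy as np

rng = np.random.default_rng(2)
G = rng.standard_normal((64, 8))       # toy linear "generator" (hypothetical)
target = G @ rng.standard_normal(8)    # the image we want to reconstruct

w = np.zeros(8)                        # latent code to optimize
lr = 0.01
for _ in range(1000):
    residual = G @ w - target
    # Gradient of the mean squared reconstruction loss w.r.t. the code.
    grad = 2.0 * G.T @ residual / target.size
    w -= lr * grad

loss = np.mean((G @ w - target) ** 2)
```

This per-image loop is what makes optimization-based inversion slow; HyperStyle's encoder amortizes it into a single learned prediction.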

We trained a single hypernetwork to learn how to refine the generator weights with respect to a desired target image. By learning this mapping, HyperStyle efficiently predicts the generator weights for a target image in under 2 seconds, making it applicable to a wide range of applications.

The proposed method aims to solve the problem of reconstructing missing parts of an image during editing. Previously proposed techniques have addressed the balance between reconstruction and editability by fine-tuning the generator to reproduce portions of the target image while keeping the edited regions editable. The downside of such approaches is the need for lengthy per-image training of the neural network.

The method, based on the StyleGAN algorithm, allows a generic model pretrained on common image collections to reproduce characteristic elements of the original image with a fidelity comparable to algorithms that require training a separate model for each image.

Among the advantages of the new method are the ability to modify images at near-real-time performance and the availability of pretrained models for people, cars, and animals, based on the Flickr-Faces-HQ (FFHQ, 70,000 high-quality PNG images of people's faces), Stanford Cars (16,185 car images), and AFHQ (animal photos) collections.

In addition, a set of tools is provided to train your own models, as well as ready-to-use trained encoders and generators suitable for use with them. For example, there are generators available for creating Toonify-style images, Pixar characters, sketches, and even Disney-princess styling.

Finally, for those interested in learning more about this tool, you can check the details at the following link.

It is also worth mentioning that the code is written in Python using the PyTorch framework and is released under the MIT license. You can check the code at the following link.
