AI Transforms Humans into Goblins: A fun computer vision project

abios-engineering

In the Abios Computer Vision (CV) Team, we always try to stay up to date with the latest tech trends, especially in deep learning. Not only does this keep our computer vision capabilities competitive, but it also drives engagement and enables the team to learn new skills.

Over the past few weeks, we have been experimenting with Generative Adversarial Networks (GANs), a family of deep learning models originally introduced by Ian Goodfellow et al. in 2014.

The intuition behind GANs is to mimic the dynamic between two players: one who creates fake objects, and another who tries to determine whether the objects are fake or not. Much to our despair, GANs have not gained much traction in the industry. However, extensive research is being conducted in the field, leading to network architectures such as StyleGAN and CycleGAN.

This blog post will focus on how CycleGANs can be used to transform humans into creatures. Since we are a gaming company working with various fantasy titles, we thought: why not goblins and orcs?

How Generative Adversarial Networks work

Before we jump into the results, let’s dig a bit deeper into how GANs work. If you need a refresher on neural networks, we suggest you look here.

Generative Adversarial Networks have two sub-networks actively competing with each other: a generator and a discriminator. The generator network tries to generate realistic images, while the discriminator network tries to distinguish between a real image and a generated image. These two networks play a zero-sum game in the training phase.
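To make the zero-sum game concrete, here is a minimal sketch of one GAN training step, assuming PyTorch. The tiny fully connected networks and hyperparameters are placeholders of our own for illustration, not the architecture we actually used (a CycleGAN uses convolutional generators and discriminators).

```python
# A minimal sketch of one GAN training step, assuming PyTorch.
# The tiny fully connected networks are illustrative placeholders.
import torch
import torch.nn as nn

latent_dim, image_dim = 100, 28 * 28  # illustrative sizes

generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, image_dim), nn.Tanh(),
)
discriminator = nn.Sequential(
    nn.Linear(image_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

bce = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(real_images):
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # Discriminator step: label real images 1 and generated images 0.
    fake_images = generator(torch.randn(batch, latent_dim))
    d_loss = bce(discriminator(real_images), real_labels) + \
             bce(discriminator(fake_images.detach()), fake_labels)
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step: try to make the discriminator label fakes as real.
    g_loss = bce(discriminator(fake_images), real_labels)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```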

Transforming humans into goblins

In order to transform a human into a goblin, we used two disjoint datasets (two datasets without any shared images), one for each image domain. This is exactly the setting CycleGAN is built for: it learns a mapping between two image domains without needing paired examples.
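The ingredient that makes unpaired training possible is CycleGAN’s cycle-consistency loss: translating an image to the other domain and back again should reproduce the original. A minimal sketch, assuming PyTorch; the generator names are ours, and the weight of 10 follows the original CycleGAN paper:

```python
# A minimal sketch of CycleGAN's cycle-consistency loss, assuming PyTorch.
# g_human2goblin and g_goblin2human are the two generators; the names are ours.
import torch.nn.functional as F

def cycle_consistency_loss(g_human2goblin, g_goblin2human,
                           human_batch, goblin_batch, weight=10.0):
    # Translate each batch to the other domain and back again; the
    # round trip should reproduce the original images (L1 penalty).
    human_round_trip = g_goblin2human(g_human2goblin(human_batch))
    goblin_round_trip = g_human2goblin(g_goblin2human(goblin_batch))
    return weight * (F.l1_loss(human_round_trip, human_batch) +
                     F.l1_loss(goblin_round_trip, goblin_batch))
```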

The first dataset consisted of approximately 1,400 images of esport players.

The second dataset was the Kaggle dataset Goblin Portraits, which, interestingly, was itself generated by the previously mentioned StyleGAN together with BigGAN. The author of the dataset described the typical goblin traits as green skin, pointy ears, and a bad skin condition.

Three goblin images
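With the two image sets in place, feeding them to the model is straightforward, since no pairing between the domains is needed. A minimal sketch, assuming PyTorch and torchvision; the folder paths are hypothetical:

```python
# A minimal sketch of loading two disjoint (unpaired) image sets for a
# CycleGAN, assuming PyTorch and torchvision; folder paths are hypothetical.
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(256),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5] * 3, std=[0.5] * 3),  # scale to [-1, 1]
])

# ImageFolder expects one sub-folder per class; each domain has one class here.
human_loader = DataLoader(datasets.ImageFolder("data/esport_players", transform),
                          batch_size=1, shuffle=True)
goblin_loader = DataLoader(datasets.ImageFolder("data/goblin_portraits", transform),
                           batch_size=1, shuffle=True)

# Batches from the two loaders are zipped with no pairing between domains.
for (human_batch, _), (goblin_batch, _) in zip(human_loader, goblin_loader):
    pass  # one unpaired training step per iteration
```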

We trained our model for 200 epochs, which took 24 hours on an NVIDIA RTX 3080 Ti.

To illustrate the generator’s performance throughout training, we continuously tested the model on images of co-workers. The training results can be seen below.

Validation images showing progress throughout training.
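For those curious, a validation pass of this kind could look roughly as follows; a hedged sketch assuming PyTorch and torchvision, with hypothetical file and checkpoint names:

```python
# A rough sketch of running the trained generator on a validation photo,
# assuming PyTorch; the file and checkpoint names are hypothetical.
import torch
from PIL import Image
from torchvision import transforms
from torchvision.utils import save_image

# Assumes the full generator module was saved earlier with torch.save(...).
g_human2goblin = torch.load("checkpoints/g_human2goblin.pt")
g_human2goblin.eval()

to_tensor = transforms.Compose([
    transforms.Resize(256), transforms.CenterCrop(256),
    transforms.ToTensor(), transforms.Normalize([0.5] * 3, [0.5] * 3),
])

image = to_tensor(Image.open("coworker.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    goblin = g_human2goblin(image)
save_image(goblin * 0.5 + 0.5, "coworker_goblin.jpg")  # undo the normalisation
```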

The initial results confused us. The model showed great skill in translating almost any male co-worker into a goblin; however, almost no transformation occurred for the female co-workers.

Gender-biased models

We discovered that the root cause of this was quite simple: the dataset of esport players was heavily biased, as it consisted almost exclusively of men.

Neural networks generalise well to data that resembles what they have seen before, but perform worse on unseen data. Even so, it was unexpected how well the model generalised across male faces while not working for female ones at all.

To counteract this imbalance, we expanded the dataset with random images of women and retrained the model.

This somewhat increased our generator’s ability to translate women. As a step further, we excluded all men from the dataset and trained the model with only images of women.

The results can be seen below:

It was only at this point that we reached a somewhat satisfactory image translation for women. However, training with only female images led to a much worse model for men.

These results are slightly worrying: unless we took precautions or trained separate models, the generator training ended up in a local minimum, focusing on optimising the results for men alone.
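One possible precaution is to rebalance sampling from a single mixed dataset instead of curating separate ones. A minimal sketch, assuming PyTorch’s WeightedRandomSampler; the tensors and gender labels below are illustrative stand-ins, not our actual data:

```python
# A minimal sketch of rebalancing a mixed dataset at load time, assuming
# PyTorch; the tensors and gender labels are illustrative stand-ins.
import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

images = torch.randn(140, 3, 64, 64)         # stand-in for the real photos
labels = torch.tensor([0] * 120 + [1] * 20)  # 0 = male, 1 = female (illustrative)

# Weight each sample inversely to its group's frequency so that both
# groups are drawn equally often during an epoch.
counts = torch.bincount(labels).float()
sample_weights = (1.0 / counts)[labels]
sampler = WeightedRandomSampler(sample_weights,
                                num_samples=len(labels), replacement=True)

loader = DataLoader(TensorDataset(images, labels), batch_size=4, sampler=sampler)
```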

Why it’s important to account for bias in modelling

Having diverse training data with both men and women mitigates the issue of gender bias. More importantly, it also highlights a larger aspect of data science: social responsibility. Since data science and neural networks are becoming more and more involved in everyday life, data scientists need to take care that the models they train do society good.

A real-life example of this is Tay, the Microsoft Twitter bot. Tay started out as a normal chatbot, but since it learned its behavior from interactions with other users, it was quickly exposed to racist slurs and began demonstrating inappropriate behavior.

These are the lessons we take away from this project, along with a bunch of new profile pictures on Slack.

Tech is always evolving. At Abios, we strive to be at the frontier of technological development to ensure that our products remain of high quality for our customer base. 

To ensure that we remain on top of the latest technology, we do these “skunkworks” projects from time to time. They enable the team to continue learning across different verticals while encouraging out-of-the-box thinking. We want our employees to thrive, and we encourage creativity. That’s why employees should feel empowered to learn and develop, even outside the scope of our day-to-day operations.

Are you curious about how your skills could be put to use at Abios? We’re always looking for great people.

Feel free to check out our open positions to learn more!  

Written by:

Rasmus Johns, Hilding Köhler and Altan Şenel