Waleed Alzuhair, flickr

Bias and Fairness in Image Colorization

Do you need a geospatial computing platform?

Image colorization is the process of adding color to gray-scale images. To automate this process, machine learning models are trained on large image sets. Training is supervised: each image converted to gray-scale serves as the input, and the original color image is the desired output. These models do not always produce the correct colors. The figure below shows how a colorization algorithm colors the Golden Gate Bridge in San Francisco white. Even though the image looks believable, it is not historically correct: the bridge was already red at that time.

Figure 1. Gray-scale image of the Golden Gate Bridge and its colorized version, in which the bridge is colored white
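The supervised setup described above can be illustrated with a minimal sketch. It is written in plain Python so that it runs without dependencies, and it is not the training code of any particular colorizer. The function names are hypothetical, and the gray-scale conversion uses the ITU-R BT.601 luma weights as an assumption, since the conversion actually used is not stated here.

```python
# Hypothetical helper names. The BT.601 luma weights are an assumption;
# the article does not say which gray-scale conversion was used.
LUMA_WEIGHTS = (0.299, 0.587, 0.114)


def to_grayscale(image):
    """Convert rows of (r, g, b) tuples (0-255) to rows of gray values."""
    wr, wg, wb = LUMA_WEIGHTS
    return [[round(wr * r + wg * g + wb * b) for (r, g, b) in row]
            for row in image]


def make_training_pair(color_image):
    """Return (model input, desired output) for supervised training."""
    return to_grayscale(color_image), color_image


# A tiny 1x3 "image": pure red, a dark gray, pure green.
color = [[(255, 0, 0), (76, 76, 76), (0, 255, 0)]]
gray, target = make_training_pair(color)
print(gray)  # [[76, 76, 150]]
```

Under these weights, pure red and a dark gray map to the same gray value. This kind of lost information is one reason why a colorization can look believable and still be wrong.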

Our research looks into which types of errors occur in colorized images, what causes these errors, and how this information can be used to improve colorization algorithms. To answer these questions, we performed a large number of calculations, mainly of average colors and of differences between colors. We ran these calculations on 2% of the images in ImageNet, a dataset of over 14 million images that was used to train the image colorizer DeOldify.
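To make the two kinds of calculation mentioned above concrete, here is a small sketch of an average color and a color difference. The function names and pixel values are made up for illustration. The Euclidean distance in RGB is only one possible difference measure and is an assumption, not necessarily the measure used in the research.

```python
import math


def average_color(image):
    """Mean (r, g, b) over all pixels of an image given as rows of RGB tuples."""
    pixels = [pixel for row in image for pixel in row]
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def color_difference(a, b):
    """Euclidean distance between two RGB colors (an assumed metric)."""
    return math.dist(a, b)


# Made-up 1x2 images: a reddish original and a near-white colorization.
original = [[(200, 40, 30), (180, 60, 50)]]
colorized = [[(230, 230, 230), (210, 210, 210)]]

avg_original = average_color(original)    # (190.0, 50.0, 40.0)
avg_colorized = average_color(colorized)  # (220.0, 220.0, 220.0)
print(color_difference(avg_original, avg_colorized))
```

A large distance between the two averages signals that the colorization shifted the overall color of the image, as in the white bridge example.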

The complete ImageNet dataset is readily available in the public folder of the Geospatial Computing Platform of ITC, which facilitates analysis. We wrote the code for the calculations on a Jetson AGX computing unit using a shared folder and then executed our program on the PowerEdge Big Data computing units. Even though we processed over 200,000 images, we never had to wait longer than a day for the results. This is quite impressive, as most of our calculations are done at the pixel level. So far we have gathered many results, such as the average image of ImageNet shown below.


Figure 2. Average image of ImageNet calculated on the platform
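An average image like the one above can be computed one image at a time, without holding the whole dataset in memory. The sketch below shows one way to do this with a running per-pixel mean. It is not the code we ran on the platform: the function names are hypothetical, and it assumes all images have already been resized to the same dimensions.

```python
def update_mean_image(mean, image, count):
    """Fold one image into a running per-pixel mean.

    `mean` is the mean of the first `count` images; the result is the
    mean of count + 1 images.
    """
    n = count + 1
    return [
        [tuple(m + (v - m) / n for m, v in zip(mean_px, px))
         for mean_px, px in zip(mean_row, row)]
        for mean_row, row in zip(mean, image)
    ]


def average_image(images):
    """Per-pixel mean over an iterable of equally sized RGB images.

    Returns None if `images` is empty.
    """
    mean = None
    for count, image in enumerate(images):
        if mean is None:
            # The first image initializes the mean (stored as floats).
            mean = [[tuple(float(v) for v in px) for px in row] for row in image]
        else:
            mean = update_mean_image(mean, image, count)
    return mean


# Two made-up 1x2 images.
images = [
    [[(0, 0, 0), (255, 255, 255)]],
    [[(100, 50, 0), (255, 255, 255)]],
]
print(average_image(images))
# [[(50.0, 25.0, 0.0), (255.0, 255.0, 255.0)]]
```

Because `average_image` accepts any iterable, images can be loaded lazily from disk. At the scale of ImageNet, an array library would be the practical choice for the per-pixel arithmetic.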

We are currently analyzing the final results and drawing conclusions, with a publication as the goal. We are impressed with the performance of the computing platform and are curious to see what the future will bring! We certainly wouldn’t have come this far without it.

Floris Weers

Floris Weers and Frank Stapel are two master's students currently working on their master's theses in Technical Computer Science, with a specialization in Data Science & Technology.

Frank Stapel

Their research on colorization algorithms is a side project next to their master's theses; they hope to publish it in 2022 with dr. D. Bucur.