Computer vision software solutions & services


Posted: Jun 15, 2017 Updated: Mar 24, 2025

Image-to-Image translation

Tags: adversarial networks, computer vision, convolutional neural networks, deep learning, Git, Linux, Lua, OpenCV, Torch

Interested in how sophisticated algorithms can turn winter landscapes into realistic summer photos? Keep reading.

Abto Software’s R&D engineers researched how well modern machine learning can translate winter landscapes into accurate summer photos, and what output quality it can deliver.

We studied several generative models and ways of applying them. Then, with a solid dataset in hand, we adopted the model best suited to the challenges of the project.

We see an opportunity for winter-to-summer translation in landscape design and related areas. Landscape architects and designers could see more clearly how, for example, a snow-covered landscape might look in other seasons.

Conditional generative adversarial network

We examined and tested various models and methods, including CGAN, CVAE, and pixel-to-pixel translation. The conditional generative adversarial network (CGAN), an extension of the GAN, proved to be the most suitable approach for the task.

Its main advantages are:

Sharp and realistic synthesis of images
Excellent match for high-dimensional visual data
Semi-supervised learning
Bleeding edge computational technology

A generative adversarial network (GAN) is an ML framework for training generative models. It relies on a generator, which creates new images, and a discriminator, which tries to distinguish synthetic images from real ones. The CGAN extends the GAN by making the generator produce images conditioned on additional input.

Conditioning on additional information speeds up convergence, because even the fake images then follow some pattern. It also helps control the output, since labels specify which images are to be generated.
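The adversarial game between the two networks can be sketched in plain Python. This is a minimal illustration, not the project's training code: `d_real` and `d_fake` are hypothetical discriminator scores (the probability that a winter/summer pair is real), and the losses follow the standard binary cross-entropy GAN formulation with the common non-saturating generator loss.

```python
import math

EPS = 1e-12  # avoids log(0)

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: real pairs should score 1, generated pairs 0.

    In a conditional GAN each score is computed for a (condition, image)
    pair, e.g. (winter photo, summer photo), not for the image alone.
    """
    real_term = -sum(math.log(p + EPS) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: generated pairs should score 1."""
    return -sum(math.log(p + EPS) for p in d_fake) / len(d_fake)

# Hypothetical scores: the discriminator is fairly sure about both batches.
print(discriminator_loss([0.9, 0.8], [0.1, 0.3]))
print(generator_loss([0.1, 0.3]))
```

The generator loss falls as the discriminator is fooled more often, while the discriminator loss falls as it separates real pairs from generated ones.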

Figure: cGAN structure (generator network and discriminator network)

Machine learning for accurate image-to-image translation

During the project, our team faced the following challenges:

The complexity of the selected algorithm
The configuration of knobs (hyperparameters)
The aggregation and preparation of datasets
Long iterations

The team continued the research by aggregating and preparing large datasets. To better understand and edit random outdoor scenes, we chose the Transient Attribute Dataset. Because it did not contain enough data for good results, we added further datasets. Their details and samples can be found below.

The datasets

Dataset №1. Annotated photos from over 100 webcams with various outdoor scenes.

Our team:

Downloaded photographs annotated during a crowdsourcing campaign
Picked and filtered photos that have the most vivid characteristics of the winter and summer seasons
Grouped the selected photos into pairs

This yielded 3,000 winter/summer landscape pairs at 640×480, scaled to 256×256.
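A minimal sketch of preparing one such pair, assuming both photos are already loaded as NumPy arrays of shape (480, 640, 3). The nearest-neighbour `resize_nearest` helper is a hypothetical stand-in that keeps the example dependency-free; in practice a library resize such as OpenCV's `cv2.resize` would normally be used. Storing the two images side by side follows the layout commonly used for pix2pix-style training and is an assumption here.

```python
import numpy as np

def resize_nearest(img, size=(256, 256)):
    """Nearest-neighbour resize of an H x W x C array to size = (height, width)."""
    out_h, out_w = size
    in_h, in_w = img.shape[:2]
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return img[rows[:, None], cols]

def make_pair(winter, summer, size=(256, 256)):
    """Resize both photos and place them side by side: winter | summer."""
    return np.concatenate(
        [resize_nearest(winter, size), resize_nearest(summer, size)], axis=1
    )

winter = np.zeros((480, 640, 3), dtype=np.uint8)      # placeholder winter photo
summer = np.full((480, 640, 3), 255, dtype=np.uint8)  # placeholder summer photo
pair = make_pair(winter, summer)
print(pair.shape)  # (256, 512, 3)
```

Note that squeezing 640×480 into a square changes the aspect ratio; cropping to a square first would be an alternative design choice.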

Dataset №2. Four 10-hour high-definition videos, one per season, recorded from a train on the Nordland Line in Norway.

Our team:

Downloaded the videos
Cut all four videos into frames using FFmpeg
Auto-aligned the best frames using Python and Hugin

This yielded 9,000 winter/summer landscape pairs at 1000×1000, scaled to 256×256.
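Frame extraction of this kind can be scripted from Python. The sketch below only builds the FFmpeg command. The file names, the output pattern, and the rate of one frame per second are illustrative assumptions, since the original sampling settings are not stated.

```python
import subprocess

def ffmpeg_extract_cmd(video_path, out_pattern, fps=1):
    """Build an FFmpeg command that saves `fps` frames per second as images."""
    return [
        "ffmpeg",
        "-i", video_path,      # input video
        "-vf", f"fps={fps}",   # video filter: resample to the given frame rate
        out_pattern,           # numbered output files, e.g. frames/winter_%06d.png
    ]

cmd = ffmpeg_extract_cmd("winter.mp4", "frames/winter_%06d.png", fps=1)
print(" ".join(cmd))

# To actually run it (requires FFmpeg on PATH and an existing output folder):
# subprocess.run(cmd, check=True)
```

Running the same command for each seasonal video gives numbered frames that can then be matched and aligned in a separate step.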

The results

Drawing on their machine learning experience, our engineers delivered a solution that performs accurate image-to-image translation, an analogue of automatic language translation for images.

The results can be found below.

Training set: 11,500 image pairs; testing set: ~600 image pairs.

The best setup details:

286×286, scaled to 256×256
Horizontal mirroring
Conditional D model
PatchGAN
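The first two items can be read as the augmentation used by the reference pix2pix implementation: upscale each 256×256 pair to 286×286, take a random 256×256 crop, and mirror it horizontally half the time. That reading is an assumption, as is the rule that the same crop and flip are applied to both images so the pair stays aligned. A minimal NumPy sketch, with inputs assumed to be already resized to 286×286:

```python
import numpy as np

def jitter_pair(winter, summer, crop=256, rng=None):
    """Apply the same random crop and horizontal flip to both images of a pair."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = winter.shape[:2]
    top = int(rng.integers(0, h - crop + 1))   # random crop origin (row)
    left = int(rng.integers(0, w - crop + 1))  # random crop origin (column)
    flip = rng.random() < 0.5                  # mirror half of the time

    def apply(img):
        patch = img[top:top + crop, left:left + crop]
        return patch[:, ::-1] if flip else patch

    return apply(winter), apply(summer)

# Placeholder 286x286 images standing in for an upscaled winter/summer pair.
rng = np.random.default_rng(0)
winter = np.arange(286 * 286 * 3).reshape(286, 286, 3)
summer = winter.copy()
w_aug, s_aug = jitter_pair(winter, summer, rng=rng)
print(w_aug.shape, s_aug.shape)
```

Sharing one crop origin and one flip decision across the pair matters: independent jitter would break the pixel correspondence the model learns from.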

The variations

We would also like to share some samples of the variations we tried during training.

The most typical variations:

Dataset manipulations and mixing
Image jitter, random mirroring, the number of epochs
Unconditional/conditional D models
PatchGAN/PixelGAN/ImageGAN
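The three discriminator variants differ mainly in how much of the image each output score sees. The sketch below computes that receptive field, assuming the layer layout of the reference pix2pix discriminator (4×4 convolutions, `n_layers` of them with stride 2 followed by two with stride 1). Whether the project used exactly these depths is not stated.

```python
def receptive_field(layers):
    """Receptive field of a conv stack, given (kernel, stride) per layer in input-to-output order."""
    rf = 1
    for kernel, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel
    return rf

def patch_discriminator_layers(n_layers):
    """n_layers stride-2 4x4 convs, then two stride-1 4x4 convs (assumed layout)."""
    return [(4, 2)] * n_layers + [(4, 1)] * 2

pixel_gan = [(1, 1)] * 3  # 1x1 convolutions only: each score sees a single pixel

print(receptive_field(pixel_gan))                      # 1
print(receptive_field(patch_discriminator_layers(3)))  # 70
print(receptive_field(patch_discriminator_layers(5)))  # 286
```

Under these assumptions, a 70×70 field is the usual PatchGAN setting, a 1×1 field is the PixelGAN, and a 286-pixel field covers a whole 256×256 image, which corresponds to the ImageGAN.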

Technical details

Used hardware:

AWS p2.xlarge (NVIDIA GK210 12 GB GPU), CUDA Toolkit

Tech stack:

Python
NumPy
Pandas dataframe
Hugin tool
Pix2Pix service
OpenCV library


How we can benefit your business

Abto Software handles complexities related to computer vision to help mature businesses focus more on data. By harnessing great knowledge and experience in implementing artificial intelligence and its various subsets (machine and deep learning, ANN, NLP), our engineers deliver custom cutting-edge solutions.

We provide:

Image enhancement and restoration
Image filtering, deconvolution, transformation and alignment
Image segmentation and clustering
Image analysis and key feature detection

And design:

And more on-demand software.


Figure: variation samples. Input | ImageGAN: slightly distorted | PixelGAN: very blurry | L1 regularization: loss of low-level features
