Our main contribution is the generation of synthetic, pixel-wise annotated data for semantic segmentation of overhead imagery across long periods of time.
Land cover maps across time are essential to track the ecological footprint of human land occupation. They can also be used to predict the footprint of future town-planning projects or to guide river decontamination initiatives. With the advance of Deep Convolutional Neural Networks (DCNNs), state-of-the-art models such as DeepLab (Chen, L et al. 2018) reach 73% accuracy on maps at a resolution of 50 cm/pixel, classifying each pixel into one of 14 land categories (Richard, A et al. 2018).
However, DCNNs require a large amount of human-annotated data, which is highly time-consuming to produce. For example, it takes a human GIS expert 8 hours to generate pixel-wise annotations for a 10 000 x 10 000 pixel image. Moreover, the image distribution changes over the years, so a model trained on one year cannot generalize to overhead images taken decades earlier. For example, the 2015 images are digital BGR images, whereas the 1955 images are digitized black-and-white analog photographs. The latter present a shift in color domain, texture, and saturation that prevents the generalization of a model trained on the 2015 data.
Instead of repeating the annotation step on the 1955 dataset, we generate a synthetic 1955 dataset with annotations. To do so, for each image in the 2015 dataset, a new image is generated with the same content but with the style of the 1955 dataset, using style transfer models (Johnson, J et al. 2016). A DCNN is then trained on these synthetic images, with the pixel-wise annotations of the 2015 dataset serving as ground truth.
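The pipeline above can be sketched as follows. This is a minimal illustration, not the actual implementation: `fake_1955_style` is a hypothetical stand-in for the learned style transfer model (Johnson, J et al. 2016), here approximated by a grayscale conversion with film-like noise; the key point is that the content, and therefore the 2015 annotation mask, is unchanged.

```python
import numpy as np

def fake_1955_style(img_bgr: np.ndarray) -> np.ndarray:
    """Stand-in for a learned style-transfer model (assumption, not
    the real method): mimic a digitized 1955 analog photo by
    converting a BGR image to grainy grayscale."""
    gray = img_bgr.mean(axis=2)                       # drop color channels
    noise = np.random.default_rng(0).normal(0, 5, gray.shape)
    return np.clip(gray + noise, 0, 255).astype(np.uint8)

def make_synthetic_1955_dataset(images_2015, masks_2015):
    """Pair each stylized image with the original 2015 pixel-wise
    annotation: style transfer preserves content, so the labels
    remain valid for the synthetic image."""
    return [(fake_1955_style(img), mask)
            for img, mask in zip(images_2015, masks_2015)]

# Usage on a toy 64x64 example.
rng = np.random.default_rng(1)
imgs = [rng.integers(0, 256, (64, 64, 3)).astype(np.uint8)]
masks = [np.zeros((64, 64), dtype=np.int64)]          # pixel-wise labels
synthetic = make_synthetic_1955_dataset(imgs, masks)
```

A segmentation DCNN would then be trained on `synthetic`, with the unchanged 2015 masks as targets; at inference time it is applied directly to the real 1955 imagery.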
© 2019 DREAM Lab – Georgia Tech Lorraine