Land-Use Classification from Overhead Imagery [Completed in 2019]

This project is a collaboration with AgroParisTech, using deep-learning-based segmentation of overhead imagery to classify land use at a very fine-grained scale.

Our main contribution is the generation of synthetic pixel-wise annotated data for semantic segmentation of overhead imagery across long periods of time.

Figure 1: DeepLab (Chen et al., 2018) segmentation of the area in 2015 and 1955.

Land cover maps across time are essential to track the ecological footprint of human land occupation. They can also be used to predict the footprint of future town-planning projects or to guide river decontamination initiatives. With the advances in deep convolutional neural networks (DCNNs), state-of-the-art models such as DeepLab (Chen et al., 2018) reach 73% accuracy on maps with a resolution of 50 cm/pixel, classifying each pixel into one of 14 land categories (Richard et al., 2018).
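As an illustration, the sketch below shows how a DeepLab-style model could be fine-tuned for this 14-category task using torchvision's DeepLabV3 implementation. The model choice, head replacement, and hyper-parameters are illustrative assumptions, not the project's exact training code.

```python
# A minimal sketch (assumptions: torchvision's DeepLabV3, Adam, lr=1e-4)
# of adapting a pretrained segmentation model to 14 land-use categories.
import torch
import torchvision

NUM_CLASSES = 14  # land categories from Richard et al. (2018)

# Start from a pretrained backbone and replace the classification head.
model = torchvision.models.segmentation.deeplabv3_resnet101(pretrained=True)
model.classifier = torchvision.models.segmentation.deeplabv3.DeepLabHead(
    2048, NUM_CLASSES
)

criterion = torch.nn.CrossEntropyLoss()  # per-pixel classification loss
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(images, masks):
    """images: (B, 3, H, W) float tensor; masks: (B, H, W) long tensor."""
    model.train()
    optimizer.zero_grad()
    logits = model(images)["out"]  # (B, NUM_CLASSES, H, W)
    loss = criterion(logits, masks)
    loss.backward()
    optimizer.step()
    return loss.item()
```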

Figure 2: 2015 data. Left to right: ground truth, DeepLab segmentation (Chen et al., 2018).

However, DCNNs require a large amount of human-annotated data, which is highly time-consuming to produce. For example, it takes 8 hours for a human GIS expert to generate pixel-wise annotations for a 10,000 × 10,000-pixel image. In addition, the image distribution changes over the years, so a model trained on one year cannot generalize to overhead images taken decades earlier. For example, the 2015 images are digital BGR images, whereas the 1955 images are digitized black-and-white analog images. The latter present a shift in color domain, texture, and saturation that prevents the generalization of a model trained on the 2015 data.
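One quick, illustrative way to quantify this distribution shift is to compare grayscale-intensity histograms of tiles from the two years. The file names below are hypothetical placeholders.

```python
# Compare intensity statistics of a 2015 tile and a 1955 tile; a large
# histogram distance is a crude proxy for the domain gap described above.
import numpy as np
from PIL import Image

def intensity_histogram(path, bins=64):
    """Normalized grayscale-intensity histogram of an image file."""
    gray = np.asarray(Image.open(path).convert("L"), dtype=np.float32)
    hist, _ = np.histogram(gray, bins=bins, range=(0, 255), density=True)
    return hist

h2015 = intensity_histogram("tile_2015.png")  # digital BGR, grayscaled
h1955 = intensity_histogram("tile_1955.png")  # digitized analog B&W

# Naively grayscaling the 2015 data does not close this gap, which is
# why a learned style transfer is used instead (see below).
print("histogram L1 distance:", np.abs(h2015 - h1955).sum())
```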

Figure 3: Example of the pixel-distribution change between the 2015 and 1955 data.

Instead of repeating the annotation step on the 1955 dataset, we generate a synthetic 1955 dataset with annotations. To do so, for each image in the 2015 dataset, a new image is generated with the same content but with the style of the 1955 dataset, using style-transfer models (Johnson et al., 2016). A DCNN is then trained on these synthetic images, with the pixel-wise annotations of the 2015 dataset serving as ground truth.
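Below is a minimal sketch of the perceptual loss at the heart of Johnson et al. (2016), as it could be used to train a feed-forward network that restyles 2015 tiles to look like 1955 imagery. The VGG layer choices follow the paper; the style weight and the transform network (assumed to be defined elsewhere) are illustrative assumptions.

```python
# Perceptual loss sketch: a fixed, pretrained VGG-16 provides content and
# style features; the feed-forward transform network is trained elsewhere.
import torch
import torchvision

vgg = torchvision.models.vgg16(pretrained=True).features.eval()
for p in vgg.parameters():
    p.requires_grad = False

# Indices into vgg16().features for relu1_2, relu2_2, relu3_3, relu4_3.
STYLE_LAYERS = (3, 8, 15, 22)
CONTENT_LAYER = 8  # relu2_2, used for the content term in the paper

def features(x):
    """Collect the VGG activations needed for the loss."""
    feats = {}
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in STYLE_LAYERS:
            feats[i] = x
        if i == max(STYLE_LAYERS):
            break
    return feats

def gram(f):
    """Gram matrix of a (B, C, H, W) feature map, normalized by C*H*W."""
    b, c, h, w = f.shape
    f = f.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def perceptual_loss(output, content_img, style_img, style_weight=1e5):
    """Content term keeps the 2015 scene; style term matches 1955 statistics."""
    out_f = features(output)
    content_f = features(content_img)
    style_f = features(style_img)
    loss = torch.nn.functional.mse_loss(
        out_f[CONTENT_LAYER], content_f[CONTENT_LAYER])
    for i in STYLE_LAYERS:
        loss = loss + style_weight * torch.nn.functional.mse_loss(
            gram(out_f[i]), gram(style_f[i]))
    return loss
```

Each stylized image inherits the pixel-wise annotation of its 2015 source, so the resulting pairs of synthetic 1955-style images and 2015 masks can be fed directly to a segmentation training loop like the one sketched after the first paragraph.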

Figure 4: Comparison of synthetic data. Left to right, top row: 2015 data and its black-and-white conversion. Bottom row: matching 1955 data and the stylized 2015 data.
  • Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., & Yuille, A. L. (2018). DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4), 834-848.
  • Johnson, J., Alahi, A., & Fei-Fei, L. (2016). Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision (pp. 694-711). Springer, Cham.
  • Richard, A., Benbihi, A., Pradalier, C., Perez, V., Van Couwenberghe, R., & Durand, P. (2018). Automated segmentation of land use from overhead imagery. International Conference on Precision Agriculture, June 24-27, 2018, Montreal, Canada.
