One of the primary challenges in this project is acquiring a labelled dataset for training the deep neural networks. The networks need thousands of accurately labelled images to learn a task. Acquiring them is difficult because the labelling is mostly done manually, which is time-consuming and subject to human error and bias, depending on the labellers' level of expertise. To deal with the scarcity of labelled real data, we rely on synthetically generated datasets for training our networks. With synthetic data, thousands of labelled images can be generated quickly, and these images closely resemble real microscope images.
The two most commonly used approaches to detection in deep learning are object detection and image segmentation.
(Green: correct predictions; blue: false negatives; red: false positives)
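The three categories in the legend above can be made concrete with a small sketch. It assumes axis-aligned bounding boxes given as `(x1, y1, x2, y2)`, a greedy one-to-one matching, and an intersection-over-union (IoU) threshold of 0.5. None of these choices is stated in the text; the function names `iou` and `match_detections` are hypothetical, and the project's actual evaluation protocol may differ.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_detections(predictions, ground_truth, iou_threshold=0.5):
    """Split boxes into correct predictions, false positives, false negatives.

    Assumption: each ground-truth box can be matched by at most one
    prediction, and predictions are processed in the order given.
    """
    unmatched = list(range(len(ground_truth)))
    correct, false_pos = [], []
    for pred in predictions:
        # Best still-unmatched ground-truth box for this prediction, if any.
        best = max(unmatched, key=lambda i: iou(pred, ground_truth[i]), default=None)
        if best is not None and iou(pred, ground_truth[best]) >= iou_threshold:
            correct.append(pred)       # would be drawn in green
            unmatched.remove(best)
        else:
            false_pos.append(pred)     # would be drawn in red
    false_neg = [ground_truth[i] for i in unmatched]  # would be drawn in blue
    return correct, false_pos, false_neg


# Illustrative boxes only, not project data.
gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(1, 1, 10, 10), (50, 50, 60, 60)]
correct, false_pos, false_neg = match_detections(preds, gt)
precision = len(correct) / max(1, len(correct) + len(false_pos))
recall = len(correct) / max(1, len(correct) + len(false_neg))
```

Precision and recall follow directly from the three counts. In practice predictions are often sorted by confidence score before matching, which this sketch leaves out.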
For classification, we considered 166 diatom taxa from the Rhin Meuse region. A main challenge in classification was the combination of inter-class similarity and intra-class variability among the diatom classes. Inter-class similarity refers to the situation where diatoms belonging to different classes look very similar and so are not easily distinguishable. Intra-class variability arises when diatoms belonging to the same taxon look different because the images are acquired from different viewpoints. This also causes confusion during classification, since the network fails to learn that the multiple appearances belong to the same class. In this work (https://arxiv.org/abs/2109.11891) we address this issue using an automatic clustering mechanism and triplet loss.
(a) Examples of diatoms with high inter-class similarity. (b) Examples of a diatom class with high intra-class variability.
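To illustrate the triplet loss mentioned above, here is a minimal sketch of its standard form, max(0, d(a, p) - d(a, n) + margin), where a is an anchor embedding, p an embedding of the same class, and n an embedding of a different class. The Euclidean distance, the margin value of 0.2, and the function names are assumptions made for illustration; the exact formulation used in the linked paper, and how it is combined with the clustering step, may differ.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss for one (anchor, positive, negative) triplet.

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is; otherwise it grows linearly.
    The margin of 0.2 is an assumed value, not taken from the source.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings, chosen only to show the two regimes.
anchor = [0.0, 0.0]
same_taxon = [0.0, 1.0]    # distance 1 from the anchor
other_taxon = [3.0, 4.0]   # distance 5 from the anchor

easy = triplet_loss(anchor, same_taxon, other_taxon)   # well separated, loss is 0.0
hard = triplet_loss(anchor, other_taxon, same_taxon)   # roles swapped, loss is positive
```

Minimising this quantity pulls different views of the same taxon together in embedding space and pushes visually similar taxa apart, which is why it suits the two problems described above.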
© 2019 – 2022 DREAM Lab – Georgia Tech Lorraine