S5Mars: Semi-Supervised Learning for Mars Semantic Segmentation

Jiahang Zhang *     Lilang Lin *     Zejia Fan     Wenjing Wang     Jiaying Liu

* indicates equal contributions.

Wangxuan Institute of Computer Technology, Peking University, Beijing.

IEEE Transactions on Geoscience and Remote Sensing 2024.


Figure 1. Examples for each label category (highlighted in red). Our dataset includes 9 categories with a sparse annotation style.

Abstract

Deep learning has become a powerful tool for Mars exploration. Mars terrain semantic segmentation is an important Martian vision task, which is the basis of rover autonomous planning and safe driving. However, there is a lack of sufficiently detailed and high-confidence data annotations, which most deep learning methods require to obtain a good model. To address this problem, we propose a solution from the perspective of joint data and method design. We first present a new dataset, S5Mars, for Semi-SuperviSed learning on Mars Semantic Segmentation, which contains 6K high-resolution images and is sparsely annotated based on confidence, ensuring the high quality of labels. Then, to learn from this sparse data, we propose a semi-supervised learning (SSL) framework for Mars image semantic segmentation that learns representations from limited labeled data. Unlike existing SSL methods, which mostly target Earth image data, our method takes Mars data characteristics into account. Specifically, we first investigate the impact of widely used natural image augmentations on Mars images. Based on this analysis, we then propose two novel and effective augmentations for SSL of Mars segmentation, AugIN and SAM-Mix, which serve as strong augmentations to boost model performance. Meanwhile, to fully leverage the unlabeled data, we introduce a soft-to-hard consistency learning strategy, which learns from different targets based on prediction confidence. Experimental results show that our method remarkably outperforms state-of-the-art SSL approaches.

Dataset

The S5Mars dataset provides rich geomorphological data for terrain semantic segmentation, which can guide rovers and support space research missions. The dataset includes 6,000 high-resolution images taken on the surface of Mars by the color mast camera (Mastcam) of the Curiosity rover (MSL). The spatial resolution of the RGB images in this dataset is 1200 × 1200. The dataset is annotated at the pixel level in a deterministic sparse labeling style. There are 9 label categories: sky, ridge, soil, sand, bedrock, rock, rover, trace, and hole.
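Because the labels are sparse, a supervised loss on this kind of data is typically computed only over annotated pixels. The following is a minimal NumPy sketch of that idea, not the authors' training code. The `IGNORE_INDEX` value of 255 and the function name `masked_cross_entropy` are assumptions made for illustration; the actual encoding of unlabeled pixels in S5Mars may differ.

```python
import numpy as np

IGNORE_INDEX = 255  # hypothetical marker for unannotated pixels (assumption)


def masked_cross_entropy(logits, labels, ignore_index=IGNORE_INDEX):
    """Mean cross-entropy over annotated pixels only.

    logits: float array of shape (C, H, W), one score map per category.
    labels: int array of shape (H, W); unannotated pixels hold ignore_index.
    """
    # Numerically stable log-softmax over the category axis.
    shifted = logits - logits.max(axis=0, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))

    valid = labels != ignore_index
    if not valid.any():
        return 0.0  # no annotated pixel in this image

    # Replace ignored labels with a dummy class so indexing stays in range;
    # those positions are dropped again by the `valid` mask below.
    safe_labels = np.where(valid, labels, 0)
    picked = np.take_along_axis(log_probs, safe_labels[None, :, :], axis=0)[0]
    return float(-picked[valid].mean())
```

With 9 categories and uniform logits, every annotated pixel contributes log(9) to the loss, regardless of how many pixels are left unlabeled.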

Data and Annotation

Figure 2. Some examples of annotated images in our dataset. (a) Images and (b) Segmentation labels.

Relevant Statistics

Figure 3. Numerical statistics on our S5Mars dataset. The figures show the richness of the categories contained in the images from two aspects: the distribution of the number of labels and the distribution of label area.

Figure 4. Visualization of pixel-level feature distribution on our dataset. Features are extracted by a pre-trained Swin backbone and visualized with t-SNE.

Comparison with Other Datasets

Figure 5. Some image-label examples from different datasets: (a) AI4Mars. Because few categories are defined, annotation diversity and adequacy are insufficient. There are also some cases of mislabeling (red box). (b) Mars-Seg, which gives complete pixel-level labeling. However, the labels can be misleading where different categories mix with each other (red box). (c) Our dataset S5Mars, which provides accurate labeling for regions with high confidence.

Method

Figure 6. Overview of the proposed framework for semi-supervised Mars semantic segmentation. We adopt a two-branch teacher-student architecture. Two novel augmentations, AugIN and SAM-Mix, are proposed as strong augmentations. AugIN exchanges the statistics of two samples, i.e., their mean and standard deviation. SAM-Mix uses an off-the-shelf SAM to obtain binary object masks for a copy-paste operation, reducing the uncertainty of the augmented images. Finally, the model is optimized with a soft-to-hard consistency learning strategy, utilizing both soft labels and hard labels based on prediction confidence.
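The three components named in the caption can be illustrated with a small NumPy sketch. This is a simplified reading of the caption under stated assumptions, not the authors' implementation. All function names are hypothetical. Per-channel statistics are assumed for the exchange, the binary mask is assumed to be already produced by a segmentation model such as SAM (no SAM call is made here), and the rule "hard target at or above a confidence threshold, soft target below it" with a threshold of 0.9 is an assumption about how confidence selects the target.

```python
import numpy as np


def swap_statistics(x_a, x_b, eps=1e-6):
    """AugIN-style sketch: exchange per-channel mean/std of two (C, H, W) images."""
    mu_a = x_a.mean(axis=(1, 2), keepdims=True)
    sd_a = x_a.std(axis=(1, 2), keepdims=True)
    mu_b = x_b.mean(axis=(1, 2), keepdims=True)
    sd_b = x_b.std(axis=(1, 2), keepdims=True)
    # Normalize each image with its own statistics, then re-style with the other's.
    a_aug = (x_a - mu_a) / (sd_a + eps) * sd_b + mu_b
    b_aug = (x_b - mu_b) / (sd_b + eps) * sd_a + mu_a
    return a_aug, b_aug


def mask_copy_paste(src_img, dst_img, mask):
    """SAM-Mix-style sketch: paste src pixels onto dst where the (H, W) mask is True.

    The mask is assumed to come from an external object segmenter.
    """
    return np.where(mask[None, :, :], src_img, dst_img)


def soft_to_hard_targets(teacher_probs, threshold=0.9):
    """Per-pixel targets from teacher probabilities of shape (C, H, W).

    Assumed rule: confident pixels get a one-hot (hard) target,
    the remaining pixels keep the teacher's soft distribution.
    """
    num_classes = teacher_probs.shape[0]
    confidence = teacher_probs.max(axis=0)                  # (H, W)
    one_hot = np.eye(num_classes)[teacher_probs.argmax(axis=0)]  # (H, W, C)
    hard = np.moveaxis(one_hot, -1, 0)                      # (C, H, W)
    use_hard = confidence >= threshold
    return np.where(use_hard[None, :, :], hard, teacher_probs)
```

In a teacher-student setup of this kind, the student would see the strongly augmented image (after statistic exchange or mask-based copy-paste) and be trained toward the targets derived from the teacher's prediction on a weakly augmented view; when copy-paste is applied to the image, the same mask would also be applied to the targets.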


Resources

Citation

@article{zhang2022s,
    title={S^5Mars: Semi-Supervised Learning for Mars Semantic Segmentation},
    author={Zhang, Jiahang and Lin, Lilang and Fan, Zejia and Wang, Wenjing and Liu, Jiaying},
    journal={arXiv preprint arXiv:2207.01200},
    year={2022},
}