Li X, Xu F, Tao F, Tong Y, Lyu X, Zhong J, Kaup A (2025)
Publication Type: Conference contribution
Publication year: 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Conference Proceedings Title: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Event location: Hyderabad, IND
ISBN: 9798350368741
DOI: 10.1109/ICASSP49660.2025.10887646
Semantic segmentation of remote sensing images (RSIs) is essential for applications such as environmental monitoring, urban planning, and disaster management. Convolutional Neural Networks (CNNs) and their variants struggle to capture comprehensive spectral context for learning discriminative representations. In this paper, we propose a Spectrum-Enhanced Network (SPENet) that leverages the Frequency Transformer Block (FTB) to capture rich spectral context. FTB integrates Spectrum-Enhanced Attention (SEA) with Multi-Head Frequency Self-Attention (MH-FSA), incorporating more informative contextual cues. Specifically, SEA aggregates spectral statistics through covariance matrix normalization before applying channel-wise attention. By projecting feature maps onto the frequency domain, MH-FSA provides the network with a broader context, extending beyond the low-frequency focus of standard self-attention mechanisms. Extensive experiments on the ISPRS Potsdam and LoveDA datasets show that SPENet significantly outperforms state-of-the-art methods. In addition, the proposed SEA module raises the average F1-score, overall accuracy, and mean intersection over union by more than 2.5%, 2.6%, and 2.3%, respectively, as demonstrated by the ablation study.
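The two ideas named in the abstract, channel attention driven by covariance statistics and a frequency-domain view of feature maps, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the trace-based covariance normalization, the row-mean pooling, and the sigmoid gate are all assumptions chosen for brevity, and the multi-head attention of MH-FSA is not modelled at all.

```python
import numpy as np

def covariance_channel_gate(x, eps=1e-8):
    """Hypothetical stand-in for covariance-based channel attention.

    x: feature map of shape (C, H, W).
    Returns the gated features and the per-channel gate values.
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)
    flat = flat - flat.mean(axis=1, keepdims=True)   # center each channel
    cov = flat @ flat.T / (h * w - 1)                # (C, C) channel covariance
    # Assumption: trace normalization so the average diagonal entry is ~1.
    cov = cov * c / (np.trace(cov) + eps)
    stats = cov.mean(axis=1)                         # one statistic per channel
    gate = 1.0 / (1.0 + np.exp(-stats))              # sigmoid, values in (0, 1)
    return x * gate[:, None, None], gate

def amplitude_spectrum(x):
    """Project a (C, H, W) feature map onto the frequency domain.

    Uses a real 2-D FFT over the spatial axes and returns the amplitude,
    with shape (C, H, W // 2 + 1).
    """
    return np.abs(np.fft.rfft2(x, axes=(-2, -1)))

# Usage on random features (shapes are illustrative only).
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 8, 8))
gated, gate = covariance_channel_gate(features)
spectrum = amplitude_spectrum(gated)
```

In this sketch the gate rescales whole channels, while the spectrum exposes both low- and high-frequency components of each channel, which is the kind of broader context the abstract attributes to operating in the frequency domain.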
APA:
Li, X., Xu, F., Tao, F., Tong, Y., Lyu, X., Zhong, J., & Kaup, A. (2025). A spectrum-enhanced attention model for semantic segmentation of remote sensing images. In Bhaskar D Rao, Isabel Trancoso, Gaurav Sharma, Neelesh B. Mehta (Eds.), ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Hyderabad, IND: Institute of Electrical and Electronics Engineers Inc.
MLA:
Li, Xin, et al. "A spectrum-enhanced attention model for semantic segmentation of remote sensing images." Proceedings of the 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025, Hyderabad, IND. Ed. Bhaskar D Rao, Isabel Trancoso, Gaurav Sharma, Neelesh B. Mehta, Institute of Electrical and Electronics Engineers Inc., 2025.