AMD-HookNet++: Evolution of AMD-HookNet with Hybrid CNN-Transformer Feature Enhancement for Glacier Calving Front Segmentation

Wu F, Dreier MN, Gourmelon N, Wind S, Zhang J, Seehaus T, Braun M, Maier A, Christlein V (2025)


Publication Language: English

Publication Type: Journal article, Original article

Publication year: 2025

Journal: IEEE Transactions on Geoscience and Remote Sensing

URI: https://ieeexplore.ieee.org/document/11296938

DOI: 10.1109/TGRS.2025.3642764

Abstract

The dynamics of glaciers and ice shelf fronts significantly impact the mass balance of ice sheets and coastal sea levels. To effectively monitor glacier conditions, it is crucial to consistently estimate positional shifts of glacier calving fronts. However, manually mapping calving fronts in satellite observations is laborious and expensive. The Attention-Multi-hooking-Deep-supervision HookNet (AMD-HookNet) first introduced a pure two-branch convolutional neural network (CNN) for glacier segmentation. Yet the local nature and translational invariance of convolution operations, while beneficial for capturing low-level details, restrict the model's ability to maintain long-range dependencies. In this study, we propose AMD-HookNet++, a novel hybrid CNN-Transformer feature enhancement method for segmenting glaciers and delineating calving fronts in synthetic aperture radar (SAR) images. Our hybrid structure consists of two branches: a Transformer-based low-resolution (context) branch that captures long-range dependencies and provides global contextual information over a larger view, and a CNN-based high-resolution (target) branch that preserves local details. To strengthen the representation of the connected hybrid features, we devise an enhanced spatial-channel attention (ESCA) module that fosters interactions between the hybrid CNN-Transformer branches by dynamically adjusting the token relationships from both spatial and channel perspectives. Additionally, we develop a pixel-to-pixel contrastive deep supervision scheme to optimize our hybrid model. It integrates pixel-wise metric learning into glacier segmentation by guiding hierarchical pyramid-based pixel embeddings with category-discriminative capability.
Through extensive experiments and comprehensive quantitative and qualitative analyses on the challenging glacier segmentation benchmark dataset CaFFe, we demonstrate that AMD-HookNet++ sets a new state of the art with an intersection over union (IoU) of 78.2 and a 95th percentile Hausdorff distance (HD95) of 1,318 m, while maintaining a competitive mean distance error (MDE) of 367 m. More importantly, our hybrid model produces smoother delineations of calving fronts, resolving the issue of jagged edges typically seen in pure Transformer-based approaches.
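
The three metrics reported above (IoU, HD95, MDE) can be illustrated with a minimal NumPy sketch. This is not the paper's or the CaFFe benchmark's evaluation code: the function names are hypothetical, fronts are assumed to be given as arrays of (x, y) coordinates already expressed in metres, HD95 is taken as the larger of the two directed 95th-percentile nearest-neighbour distances (one common convention), and MDE is taken as the symmetric mean of nearest-neighbour distances. The benchmark's exact definitions may differ in detail.

```python
import numpy as np


def iou(pred, target):
    """Intersection over union of two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


def directed_distances(a, b):
    """For every point in a, the Euclidean distance to its nearest point in b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a[:, None, :] - b[None, :, :]  # shape (len(a), len(b), 2)
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)


def hd95(a, b):
    """Larger of the two directed 95th-percentile distances (assumed convention)."""
    return float(max(np.percentile(directed_distances(a, b), 95),
                     np.percentile(directed_distances(b, a), 95)))


def mean_distance_error(a, b):
    """Symmetric mean of nearest-neighbour distances between two fronts."""
    d_ab = directed_distances(a, b)
    d_ba = directed_distances(b, a)
    return float((d_ab.sum() + d_ba.sum()) / (len(d_ab) + len(d_ba)))


# Toy example: two 2x2 masks and two short, parallel fronts 3 m apart.
pred_mask = [[1, 1], [0, 0]]
true_mask = [[1, 0], [0, 0]]
pred_front = [(0.0, 0.0), (0.0, 1.0)]
true_front = [(3.0, 0.0), (3.0, 1.0)]

print(iou(pred_mask, true_mask))
print(hd95(pred_front, true_front))
print(mean_distance_error(pred_front, true_front))
```

In this toy case the masks share one of two labelled pixels, and every front point lies 3 m from its nearest counterpart, so both distance metrics coincide. On real fronts HD95 is dominated by the worst-matching segments while MDE averages over the whole front, which is why the two numbers in the abstract differ so much.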

How to cite

APA:

Wu, F., Dreier, M. N., Gourmelon, N., Wind, S., Zhang, J., Seehaus, T., ... Christlein, V. (2025). AMD-HookNet++: Evolution of AMD-HookNet with Hybrid CNN-Transformer Feature Enhancement for Glacier Calving Front Segmentation. IEEE Transactions on Geoscience and Remote Sensing. https://doi.org/10.1109/TGRS.2025.3642764

MLA:

Wu, Fei, et al. "AMD-HookNet++: Evolution of AMD-HookNet with Hybrid CNN-Transformer Feature Enhancement for Glacier Calving Front Segmentation." IEEE Transactions on Geoscience and Remote Sensing (2025).
