Academic Journal of Computing & Information Science, 2024, 7(10); doi: 10.25236/AJCIS.2024.071001.

Multi-environment adaptive positioning method based on UAV and satellite images

Author(s)

Ling Wei, Hao Liang, Juncai Wang

Corresponding Author:
Ling Wei
Affiliation(s)

School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing, 400074, China

Abstract

Drone image geolocation aims to estimate the geographic location of drone-captured images. Given a query image with an unknown location, the task involves retrieving the most similar reference image from a database and using its GPS information to estimate the location of the query image. This is fundamentally an image retrieval problem, where deep neural networks are employed to learn effective image descriptors. However, current research primarily focuses on closing the gap between drone and satellite views, often leading to performance drops under real-world conditions such as rain and fog. This issue primarily arises because the dataset used for training the model does not fully capture the complex environments encountered in real-world applications, leading to a domain gap between training and testing. To address this challenge, we propose a dual-branch multi-environment adaptation network (MuSe-Net) designed to dynamically adjust and adapt to environmental changes. The network consists of two branches: the multi-environment style extraction network, which captures weather-related style information, and the adaptive feature extraction network, which uses an adaptive modulation module to minimize the style differences caused by environmental conditions. Extensive experiments on the University-1652 benchmark show that MuSe-Net delivers strong performance in geolocation across various environmental conditions.
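The adaptive modulation idea described above can be sketched in a few lines. The snippet below is a simplified, hypothetical illustration (not the authors' implementation): it treats per-channel feature statistics as a crude weather-related "style" and re-normalizes a feature map toward a reference style, in the spirit of adaptive instance normalization. The function names `extract_style` and `adaptive_modulate` are assumptions for illustration only.

```python
import numpy as np

def extract_style(features):
    """Crude style descriptor: per-channel mean and std over spatial dims.

    features: array of shape (C, H, W).
    Returns (mean, std), each of shape (C,).
    """
    return features.mean(axis=(1, 2)), features.std(axis=(1, 2))

def adaptive_modulate(features, target_mean, target_std, eps=1e-5):
    """Normalize out the input's own per-channel style, then re-scale
    toward a target (e.g. clear-weather) style.

    features: array of shape (C, H, W); target_mean/target_std: shape (C,).
    """
    f_mean = features.mean(axis=(1, 2), keepdims=True)
    f_std = features.std(axis=(1, 2), keepdims=True)
    normalized = (features - f_mean) / (f_std + eps)
    return normalized * target_std[:, None, None] + target_mean[:, None, None]

# Example: map a "foggy" feature map onto a clear-weather reference style.
rng = np.random.default_rng(0)
foggy = rng.normal(5.0, 3.0, size=(8, 16, 16))       # shifted, high-variance
clear = rng.normal(0.0, 1.0, size=(8, 16, 16))       # reference statistics
ref_mean, ref_std = extract_style(clear)
aligned = adaptive_modulate(foggy, ref_mean, ref_std)
```

After modulation, `aligned` carries the foggy map's spatial content but the clear-weather map's per-channel statistics, which is the intuition behind minimizing style differences across environments before retrieval.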

Keywords

Deep Learning, Image Retrieval, Multisource Domain Generalization, Geo-Localization

Cite This Paper

Ling Wei, Hao Liang, Juncai Wang. Multi-environment adaptive positioning method based on UAV and satellite images. Academic Journal of Computing & Information Science (2024), Vol. 7, Issue 10: 1-7. https://doi.org/10.25236/AJCIS.2024.071001.
