Academic Journal of Computing & Information Science, 2026, 9(4); doi: 10.25236/AJCIS.2026.090410.
Qianqian Zhang1, Jie Ying1, Yu Wang1, Le Fu2
1School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, China
2First Maternity and Infant Hospital, Tongji University, Shanghai, China
Segmenting endometrial cancer lesions in MR images is difficult because lesion boundaries are blurred, lesion morphology is irregular, and lesion scale varies widely. To address these challenges, an improved YOLOv8 model is proposed, with the original YOLOv8-seg as the baseline and three algorithmic improvements. First, an efficient multi-scale attention module based on cross-spatial learning is embedded into the backbone network to improve discrimination of lesion regions. Second, the SPPF structure in the backbone network is replaced with a focal modulation module to model multi-scale contextual information adaptively. Third, a dual-path feature fusion module that combines the re-parameterization concept with an improved CSP (Cross Stage Partial) structure is designed for the neck network to strengthen the joint representation of local details and high-level semantic information. The model was trained and evaluated on a proprietary dataset of 803 endometrial cancer MRI slices. Experimental results show that the model achieved Recall, IoU, Precision, and DSC values of 92.2%, 80.1%, 96%, and 88%, respectively, indicating that the proposed method is effective for endometrial cancer lesion segmentation.
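For readers unfamiliar with the four reported metrics, the following is a minimal sketch of their standard pixel-wise definitions for a single pair of binary masks. The abstract does not state the exact evaluation protocol (for example, whether scores are averaged per slice or pooled over the dataset), so the function name `segmentation_metrics`, the flattened 0/1 mask representation, and the toy masks are illustrative assumptions rather than the authors' implementation.

```python
def segmentation_metrics(pred, target):
    """Pixel-wise metrics for two flattened binary masks (sequences of 0/1).

    Illustrative only: the paper's actual evaluation code is not shown
    in the abstract, so this follows the textbook definitions.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)

    def ratio(num, den):
        # Return 0.0 for an empty denominator instead of raising.
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),         # TP / (TP + FP)
        "recall": ratio(tp, tp + fn),            # TP / (TP + FN)
        "iou": ratio(tp, tp + fp + fn),          # TP / (TP + FP + FN)
        "dsc": ratio(2 * tp, 2 * tp + fp + fn),  # 2TP / (2TP + FP + FN)
    }


# Hypothetical toy masks (8 pixels), not data from the paper.
pred = [1, 1, 1, 0, 0, 1, 0, 0]
target = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = segmentation_metrics(pred, target)
print(metrics)
```

With these toy masks there are 3 true positives, 1 false positive, and 1 false negative, so Precision, Recall, and DSC are each 0.75 and IoU is 0.6.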
Deep Learning, MRI, YOLOv8, Endometrial Cancer, Image Segmentation
Qianqian Zhang, Jie Ying, Yu Wang, Le Fu. Lesion Region Segmentation of Endometrial Cancer Based on an Improved YOLOv8. Academic Journal of Computing & Information Science (2026), Vol. 9, Issue 4: 77-85. https://doi.org/10.25236/AJCIS.2026.090410.