Academic Journal of Computing & Information Science, 2026, 9(3); doi: 10.25236/AJCIS.2026.090308.
Jiahao Cao
Zhengzhou University, Zhengzhou, Henan, China
Automated video editing faces four industry pain points: limited computing power in purely local modes, pronounced privacy risks in purely cloud-based modes, insufficient technology integration, and poor multi-scenario adaptability. Drawing on theories of technology integration and data security, this study proposes an automated video editing framework that integrates CLIP and YOLOv8n with large models within a cloud-edge collaborative paradigm. The framework is built on a three-tier technical architecture of "Edge-side Perception, Cloud-side Decision-making, and Collaborative Scheduling" and comprises four core functional modules: intelligent material parsing, automated script generation, interactive rendering, and data security protection. Together these form a closed technical loop of "Lightweight Perception + Intelligent Decision-making + Elastic Collaboration." For three core scenarios (individual creators, enterprise users, and government media), the study formulates differentiated adaptation paths and implementation strategies that focus on lightweight embedding, customized deployment, and open cooperation, respectively. The work offers a new perspective on resolving the core industry conflict among computing power, security, and adaptability ("Compute-Security-Adaptability") and provides a theoretical basis for subsequent engineering implementation and technological optimization.
Edge-Cloud Collaboration; Automated Video Editing; CLIP Model; YOLOv8n; Large Model Fusion
Jiahao Cao. Research on Application Scenarios and Implementation Pathways of Automated Video Editing Integrating CLIP/YOLO and Large Models under Cloud-Edge Collaboration. Academic Journal of Computing & Information Science (2026), Vol. 9, Issue 3: 63-71. https://doi.org/10.25236/AJCIS.2026.090308.