Research and Innovative Practice on the Basic Architecture of the Video Surveillance Integrated Operation Platform for Operators
DOI: 10.12677/csa.2025.1510257. Supported by research project funding.
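Authors: Liang Binghao (梁秉豪), Zhang Chuangang* (张传刚), Yuan Mingming (袁明明), Xiao Hongmei (肖红梅): Inspur Communication Information System Co., Ltd., Jinan, Shandong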
Keywords: Internet of Videos, Large Model, Operator, Computing Power Network, Video Analysis
Abstract: Video surveillance systems currently face significant challenges in terminal device access, algorithm adaptation across scenarios, and data value mining. Addressing the video surveillance industry's need for digital and intelligent upgrading, this paper systematically reviews the development status and problems of video surveillance systems from the perspective of telecommunications operators and, on that basis, designs a video surveillance integrated operation platform for operators. Built on a layered, decoupled architecture comprising an access layer, a network layer, a platform layer, and an application layer, the platform provides computing and network resource scheduling, basic video processing, intelligent video analysis, and operational data analysis. It has been validated in deployments for operator business hall management, chain restaurant store management, and urban appearance management. Practice shows that the platform effectively reduces device access costs and improves multi-scenario adaptability. By deeply integrating large models with computing power networks, it enables intelligent analysis and value mining of video data, providing efficient support for social governance and the upgrading of traditional industries.
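Citation: Liang Binghao, Zhang Chuangang, Yuan Mingming, Xiao Hongmei. Research and Innovative Practice on the Basic Architecture of the Video Surveillance Integrated Operation Platform for Operators [J]. Computer Science and Application, 2025, 15(10): 151-162. https://doi.org/10.12677/csa.2025.1510257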
