Optimization and Acceleration Algorithm for Heterogeneous Platforms Based on ARM + DSP (基于ARM + DSP的异构平台优化加速算法)
DOI: 10.12677/MOS.2023.123213
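Authors: Qiwei Song (宋奇伟), Gaocheng Jin (晋高成), Piding Li (李丕丁): School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai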
Keywords: OpenCL; Heterogeneous Multi-Core; Dynamic Optimization; Data Division
Abstract: Applying the OpenCL programming model to ARM + DSP heterogeneous multi-core platforms suffers from low core utilization and low development efficiency. Based on the AM5728 heterogeneous development platform, this paper studies the OpenCL heterogeneous programming model and proposes a dynamic optimization and acceleration algorithm for heterogeneous multi-core computing. It analyzes the two components of that algorithm, the optimal allocation ratio algorithm and the data division principle; the dynamic algorithm adjusts the corresponding parameters at run time according to operating conditions. A test system was designed to measure the relevant parameters of the acceleration algorithm. Results are presented for the Sobel algorithm and the Singular Value Decomposition (SVD) algorithm, each run both with the computing acceleration driver and with the plain OpenCL heterogeneous programming model, and the completion times under the two approaches are compared. The test results show that the optimized acceleration algorithm reduces the execution time of the Sobel algorithm to 72.2% of the original execution time and that of the SVD algorithm to 80.2% of the original execution time.
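The abstract does not spell out the allocation rule, so the following is only a minimal sketch of one common way a dynamic data split between two devices can work; it is not the authors' algorithm. It assumes a row-wise split of the input (for example, image rows for Sobel), gives the DSP a share proportional to its measured throughput, and re-estimates that share after each run. The function names (`optimal_ratio`, `split_rows`, `update_ratio`), the smoothing factor `alpha`, and all timing figures are hypothetical.

```python
def optimal_ratio(arm_time_per_row, dsp_time_per_row):
    """Fraction of rows to give the DSP so that both devices finish together.

    If the DSP gets fraction r of N rows it needs r*N*t_dsp, while the ARM
    needs (1-r)*N*t_arm; equating the two gives r = t_arm / (t_arm + t_dsp).
    """
    return arm_time_per_row / (arm_time_per_row + dsp_time_per_row)


def split_rows(total_rows, dsp_ratio):
    """Divide the rows into a DSP part and an ARM part (assumed row-wise split)."""
    dsp_rows = round(total_rows * dsp_ratio)
    dsp_rows = max(0, min(total_rows, dsp_rows))  # clamp to a valid row count
    return dsp_rows, total_rows - dsp_rows


def update_ratio(old_ratio, arm_rows, arm_elapsed, dsp_rows, dsp_elapsed, alpha=0.5):
    """Blend the current ratio with one re-estimated from the latest run."""
    if arm_rows == 0 or dsp_rows == 0:
        return old_ratio  # one device was not measured; keep the current split
    measured = optimal_ratio(arm_elapsed / arm_rows, dsp_elapsed / dsp_rows)
    return (1 - alpha) * old_ratio + alpha * measured


if __name__ == "__main__":
    ratio = 0.5  # hypothetical starting point: split the work evenly
    dsp_rows, arm_rows = split_rows(1080, ratio)
    # Hypothetical timings in seconds for the first run.
    ratio = update_ratio(ratio, arm_rows, 1.62, dsp_rows, 0.54)
    print(ratio, split_rows(1080, ratio))
```

Smoothing with `alpha` is a design choice in this sketch: it keeps one noisy timing measurement from swinging the split too far. As a separate piece of arithmetic, an execution time reduced to 72.2% of the original corresponds to a speedup of about 1/0.722 ≈ 1.39×, and 80.2% to about 1/0.802 ≈ 1.25×.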
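Cite this article: Song, Q., Jin, G. and Li, P. Optimization and Acceleration Algorithm for Heterogeneous Platforms Based on ARM + DSP [J]. Modeling and Simulation (建模与仿真), 2023, 12(3): 2318-2329. https://doi.org/10.12677/MOS.2023.123213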
