Software Evaluation Method of Optical Character Recognition System Based on Metamorphic Testing
Abstract: Optical character recognition (OCR) systems scan images so that text that previously existed only in image form can be processed and analyzed by computers, and they are now widely used across many domains. OCR quality is critical, yet evaluating it systematically remains challenging because existing evaluation methods depend on large volumes of labeled data and on manual judgment. This paper evaluates OCR systems from the user's perspective, helping users understand what these systems can do and choose one that meets their specific needs. Based on the characteristics of OCR systems as seen by users, we define two categories of metamorphic relation patterns. From these patterns we derive six abstract metamorphic relations and apply them to license plate recognition, a widely used OCR application scenario, to evaluate system quality. The proposed method makes it possible to evaluate the performance and robustness of OCR systems without prior annotated data.
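To make the idea of label-free evaluation concrete, the following is a minimal sketch of how one metamorphic relation could be checked. The abstract does not list the six relations, so everything here is an assumption made for illustration: the relation itself (a mild brightness shift should not change the recognized text), the helper names `brighten`, `edit_distance`, and `violation_rate`, and the stand-in recognizer `toy_ocr`, which is not a real OCR engine.

```python
from typing import Callable, List, Sequence

# Toy stand-in for a real image: rows of grayscale pixels in 0..255.
Image = List[List[int]]


def brighten(img: Image, delta: int) -> Image:
    """Build the follow-up input: shift every pixel by delta, clamped to 0..255."""
    return [[max(0, min(255, p + delta)) for p in row] for row in img]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two recognized strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def violation_rate(
    ocr: Callable[[Image], str],
    images: Sequence[Image],
    transform: Callable[[Image], Image],
    tolerance: int = 0,
) -> float:
    """Fraction of unlabeled images on which the relation is violated.

    Hypothetical relation: the OCR output on the transformed image should be
    within `tolerance` edits of the output on the original image. No ground
    truth text is needed, only the two outputs.
    """
    if not images:
        raise ValueError("at least one image is required")
    violations = 0
    for img in images:
        source_out = ocr(img)
        follow_out = ocr(transform(img))
        if edit_distance(source_out, follow_out) > tolerance:
            violations += 1
    return violations / len(images)


def toy_ocr(img: Image) -> str:
    """Hypothetical recognizer (NOT a real OCR engine): one symbol per row,
    '1' if the row's mean intensity exceeds 128, otherwise '0'."""
    return "".join("1" if sum(row) / len(row) > 128 else "0" for row in img)


if __name__ == "__main__":
    images = [
        [[100, 100], [200, 200]],  # stays well away from the threshold
        [[120, 130], [10, 10]],    # first row sits just below the threshold
    ]
    rate = violation_rate(toy_ocr, images, lambda im: brighten(im, 20))
    print(f"violation rate under +20 brightness: {rate:.2f}")
```

In practice `toy_ocr` would be replaced by a call to the OCR system under evaluation and `brighten` by whatever input transformation a given relation prescribes; a lower violation rate would then suggest greater robustness with respect to that transformation.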
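Cite this article: Jin Lingzi, Niu Jia. Software Evaluation Method of Optical Character Recognition System Based on Metamorphic Testing [J]. Software Engineering and Applications, 2025, 14(3): 585-595. https://doi.org/10.12677/sea.2025.143051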
