Identification of Chenpi from Cutting and Grafting via Near-Infrared Spectroscopy and Machine Learning
DOI: 10.12677/oe.2025.154009 (supported by research project funding)
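Authors: Huiping Zheng (郑惠萍), Chengyong Zheng (郑成勇)*: School of Mathematics and Computational Science, Wuyi University, Jiangmen, Guangdong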
Keywords: Near-Infrared Spectroscopy, Cutting, Grafting, Chenpi Identification, SVM, Deep Learning, TimeMixer
Abstract: Cutting (圈枝, quanzhi) and grafting (驳枝, bozhi) are two important seedling propagation methods in horticultural production. Chenpi (dried tangerine peel) produced from trees propagated by these two methods differs markedly on the market, so distinguishing the two accurately is crucial. Traditional identification methods, however, are subjective, inefficient, and inaccurate, and cannot meet the needs of large-scale identification. To address these problems, this study compares three traditional machine learning algorithms (Support Vector Machine (SVM), Random Forest (RF), and K-Nearest Neighbors (KNN)) with five state-of-the-art time series deep learning algorithms (TSMixer, MSDMixer, TimesNet, PatchMixer, and TimeMixer) on the task of distinguishing cutting Chenpi from grafting Chenpi using near-infrared spectroscopy data. The study also applies several data preprocessing methods, including min-max normalization, and examines how each preprocessing method affects the performance of each algorithm. The experimental results show that traditional algorithms suit scenarios with low computing-resource and time requirements, whereas deep learning algorithms can achieve more accurate identification when sufficient data and abundant computing resources are available. The choice of preprocessing method also has a significant impact on algorithm performance. Under specific preprocessing, both deep learning algorithms (such as PatchMixer and TimeMixer) and traditional algorithms (such as SVM and KNN) reach or approach an average accuracy of 100%. The study provides empirical support for applying near-infrared spectroscopy to the identification of plant cultivation methods, and offers a reference for choosing algorithms and preprocessing methods in practical applications.
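The preprocessing-plus-classifier pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the spectra are synthetic (a Gaussian band whose position depends on the class, with a random baseline offset and gain), the min-max normalization is assumed to be applied per spectrum, and the helper names `min_max_normalize`, `knn_predict`, `synthetic_spectrum`, and `make_dataset` are hypothetical. It only shows how min-max scaling can remove offset and gain differences before a KNN classifier compares spectral shapes.

```python
import math
import random


def min_max_normalize(spectrum):
    """Scale one spectrum to [0, 1]; a flat spectrum maps to all zeros."""
    lo, hi = min(spectrum), max(spectrum)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in spectrum]
    return [(v - lo) / span for v in spectrum]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training spectra (Euclidean distance)."""
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = {}
    for _, label in neighbours[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def synthetic_spectrum(center, rng, n_points=50):
    """Synthetic stand-in for an NIR spectrum (assumption, not real data)."""
    offset = rng.uniform(0.0, 1.0)  # random baseline shift
    gain = rng.uniform(0.5, 2.0)    # random intensity scaling
    return [
        offset
        + gain * math.exp(-((i - center) ** 2) / 50.0)
        + rng.gauss(0.0, 0.01)
        for i in range(n_points)
    ]


def make_dataset(n_per_class, rng):
    """Build normalized synthetic spectra for the two hypothetical classes."""
    xs, ys = [], []
    for label, center in (("cutting", 20), ("grafting", 30)):
        for _ in range(n_per_class):
            xs.append(min_max_normalize(synthetic_spectrum(center, rng)))
            ys.append(label)
    return xs, ys


rng = random.Random(0)
train_x, train_y = make_dataset(20, rng)
test_x, test_y = make_dataset(10, rng)

predictions = [knn_predict(train_x, train_y, x, k=3) for x in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

Because the two synthetic classes are deliberately well separated, this toy example says nothing about the accuracy achievable on real Chenpi spectra; it is only meant to make the roles of normalization and nearest-neighbour voting concrete.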
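Citation: Zheng, H. and Zheng, C. Identification of Chenpi from Cutting and Grafting via Near-Infrared Spectroscopy and Machine Learning [J]. Optoelectronics, 2025, 15(4): 83-92. https://doi.org/10.12677/oe.2025.154009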
