Abstract: Objective: To evaluate the value of CT-based radiomics models in differentiating benign from malignant thyroid nodules. Methods: We retrospectively analyzed 279 patients with thyroid nodules confirmed by surgical pathology (133 malignant, 146 benign). Patients were randomly divided at an 8:2 ratio into a training set (223 cases) and a test set (56 cases). Radiomic features were extracted from preoperative non-contrast, arterial-phase, and venous-phase CT images using United Imaging's URP radiomics software. After feature selection, a radiomics score (Radscore) was obtained, and random forest, logistic regression, and support vector machine models were built. Diagnostic performance was evaluated with the area under the receiver operating characteristic curve (AUC); predictive performance and clinical utility were assessed with calibration curves and decision curve analysis (DCA). Results: The random forest model achieved AUCs of 0.849 [95% confidence interval (CI): 0.798–0.900] in the training set and 0.788 (95% CI: 0.665–0.911) in the test set, with sensitivity, specificity, and accuracy of 85.8%, 75.9%, and 79.6% in the training set and 70.4%, 73.9%, and 73.2% in the test set. The logistic regression model achieved AUCs of 0.817 (95% CI: 0.762–0.872) and 0.803 (95% CI: 0.683–0.923) in the training and test sets, with sensitivity, specificity, and accuracy of 77.8%, 72.2%, and 75.0% in the training set and 74.5%, 72.2%, and 73.3% in the test set. The support vector machine model achieved AUCs of 0.817 (95% CI: 0.760–0.873) and 0.808 (95% CI: 0.700–0.920) in the training and test sets, with sensitivity, specificity, and accuracy of 82.1%, 73.9%, and 77.8% in the training set and 77.8%, 69.5%, and 71.4% in the test set. Pairwise comparison of the AUCs with DeLong's test gave P values of 0.593 (random forest vs. support vector machine), 0.751 (random forest vs. logistic regression), and 0.831 (support vector machine vs. logistic regression), indicating similar diagnostic performance among the three models. In the calibration analysis, the Brier scores of the random forest, logistic regression, and support vector machine models were 0.162, 0.173, and 0.172 in the training set and 0.186, 0.183, and 0.179 in the test set, suggesting good predictive performance for all three models. Decision curve analysis indicated a favorable clinical net benefit for all three models. Conclusion: CT radiomics shows relatively high diagnostic performance and good predictive performance in differentiating benign from malignant thyroid nodules, and the radiomics models in this study showed good stability.
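For readers less familiar with the metrics reported above, the following is a minimal sketch of how AUC, the Brier score, and the net benefit used in decision curve analysis can be computed from predicted probabilities. It is not the study's pipeline: the function names (`auc`, `brier_score`, `net_benefit`) and the six toy cases are illustrative assumptions, and the values bear no relation to the reported results.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen malignant case
    scores higher than a randomly chosen benign case (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome;
    lower values mean better-calibrated, more accurate probabilities."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit at threshold probability pt:
    TP/n - FP/n * pt / (1 - pt). A decision curve plots this over pt."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Toy data (hypothetical): 1 = malignant, 0 = benign.
labels = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]

print(f"AUC         = {auc(labels, probs):.3f}")
print(f"Brier score = {brier_score(labels, probs):.3f}")
print(f"Net benefit = {net_benefit(labels, probs, 0.5):.3f} at pt = 0.5")
```

The pairwise AUC definition is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based formulation would be preferable for larger data sets.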