Protein Secondary Structure Prediction Using Convolutional Neural Network and Softmax
DOI: 10.12677/CSA.2019.92051
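Authors: Wang Leilei (王蕾蕾)*, Cheng Jinyong (成金勇): School of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan, Shandong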
Keywords: Protein Secondary Structure, Convolutional Neural Networks, Softmax Classifier
Abstract: Protein secondary structure prediction is an important part of bioinformatics. This paper proposes a new algorithm that combines a convolutional neural network with a Softmax classifier to predict protein secondary structure. First, an improved convolutional neural network extracts features from the protein amino acid sequence. The output of the network's third convolution is then used as the input to the Softmax classifier, which is trained on these features and used for prediction. The method is evaluated with 3-fold cross-validation on the 25PDB dataset, and the results show that the accuracy of protein secondary structure prediction is improved.
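The last two stages described in the abstract (a Softmax classifier fed with extracted features, scored by 3-fold cross-validation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the convolutional feature extractor is omitted, the feature matrix `X` is synthetic stand-in data and not 25PDB, the three-class labeling (helix, strand, coil) is an assumption, and all function names and hyperparameters are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def three_fold_indices(n_samples, seed=0):
    """Shuffle sample indices and split them into 3 near-equal folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), 3)


def train_softmax(X, y, n_classes=3, lr=0.5, epochs=200):
    """Fit a linear Softmax classifier by batch gradient descent
    on the cross-entropy loss (hypothetical hyperparameters)."""
    W = np.zeros((X.shape[1], n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(epochs):
        P = softmax(X @ W + b)
        grad = (P - Y) / len(X)  # gradient of mean cross-entropy w.r.t. logits
        W -= lr * (X.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b


def cross_validate(X, y):
    """Return the held-out accuracy of each of the 3 folds."""
    folds = three_fold_indices(len(X))
    accuracies = []
    for k in range(3):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(3) if j != k])
        W, b = train_softmax(X[train_idx], y[train_idx])
        pred = softmax(X[test_idx] @ W + b).argmax(axis=1)
        accuracies.append(float((pred == y[test_idx]).mean()))
    return accuracies


# Synthetic stand-in for CNN features: 90 samples, 8 features, 3 classes.
rng = np.random.default_rng(1)
X = rng.normal(size=(90, 8))
y = rng.integers(0, 3, size=90)
X[np.arange(90), y] += 3.0  # shift one feature per class so classes separate

accuracies = cross_validate(X, y)
print(accuracies)
```

In a real pipeline, `X` would hold the flattened output of the third convolutional layer for each residue window, and the per-fold accuracies would be averaged to give the reported figure.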
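Cite this article: Wang Leilei, Cheng Jinyong. Protein Secondary Structure Prediction Using Convolutional Neural Network and Softmax [J]. Computer Science and Application (计算机科学与应用), 2019, 9(2): 450-457. https://doi.org/10.12677/CSA.2019.92051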
