The Visual Analysis of Coding and Non-Coding DNA Sequences
DOI: 10.12677/HJCB.2014.42003. Supported by the National Natural Science Foundation of China.
Authors: Yuqian Liu (刘玉倩), Zhijie Zheng (郑智捷): School of Software, Yunnan University, Kunming
Keywords: Non-Coding Sequences; Graphic Representation Technique; Probability Measurements
Abstract: DNA sequences carry complex genetic information, and their characteristics are expressed in both coding and non-coding sequences. In higher organisms, most of the genome consists of non-coding sequences. The ENCODE project reports evidence that 98% of the human genome is non-coding, 80% of which is functional, so the study of coding and non-coding regions has become an important research topic. This paper presents a model and experimental results that use graphical representation techniques to distinguish coding regions from non-coding regions. The model applies segmented probability measurements to the DNA sequences of coding and non-coding regions and compares the resulting feature distributions.
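The idea of a segmented probability measurement can be illustrated with a minimal Python sketch. It is not the authors' model: the function name `segment_probabilities`, the fixed non-overlapping segment length, the choice of single-base frequencies as the measured quantity, and the two toy sequences are all assumptions made here for illustration, since the abstract does not specify these details.

```python
from collections import Counter

BASES = "ACGT"


def segment_probabilities(sequence, segment_length):
    """Return per-segment base frequencies for non-overlapping segments.

    Assumption: a trailing segment shorter than segment_length is dropped.
    Characters other than A, C, G, T (for example N) still count toward
    the segment length but are not reported.
    """
    sequence = sequence.upper()
    profiles = []
    last_start = len(sequence) - segment_length
    for start in range(0, last_start + 1, segment_length):
        segment = sequence[start:start + segment_length]
        counts = Counter(segment)
        # Counter returns 0 for bases that do not occur in the segment.
        profiles.append({base: counts[base] / segment_length for base in BASES})
    return profiles


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from the paper.
    samples = {
        "coding-like": "ATGGCCAAGCTGACCGGT",
        "non-coding-like": "ATATATTTTAAATATATA",
    }
    for label, seq in samples.items():
        for index, profile in enumerate(segment_probabilities(seq, 6)):
            print(label, index, profile)
```

Each profile is a point in a four-dimensional frequency space; plotting two of its components against each other for every segment would give one possible scatter-style picture in which distributions from different regions can be compared.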
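Citation: Liu, Y.Q. and Zheng, Z.J. The Visual Analysis of Coding and Non-Coding DNA Sequences [J]. Hans Journal of Computational Biology (计算生物学), 2014, 4(2): 20-31. http://dx.doi.org/10.12677/HJCB.2014.42003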
