Comparison of Three Software Tools for Virus Genome Assembly Based on Potato Transcriptome Data
DOI: 10.12677/HJCB.2022.123006. Supported by the National Natural Science Foundation of China.
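Authors: Zhen Tu#, Ziyang Zhong#, Fanye Meng, Jiawei Li, Tao Yu, Jingtao Zheng, Junhui Xia, Bihua Nie*: College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, Hubei; Ying Zou: Enshi Academy of Agricultural Sciences, Enshi, Hubei; Shu Zhang*: Institute of Plant Protection and Soil Fertility, Hubei Academy of Agricultural Sciences, Wuhan, Hubei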
Keywords: High-Throughput Sequencing Technology, Transcriptome Data, Virus, De Novo Assembly
Abstract: As high-throughput sequencing technology matures and its cost falls, transcriptome data are growing explosively. Besides the transcripts of the potato host itself, potato transcriptome data may also contain viral sequences introduced when the host is infected by RNA viruses, so viral genomes can be mined from transcriptome data at low cost. This study compared the assembly results of three mainstream assemblers (SOAPdenovo, IDBA-UD and Trinity) on the same RNA-seq data. Trinity produced the most abundant sequence information and the largest number of long sequences, but its assembly took longer. SOAPdenovo and IDBA-UD ran faster but produced less sequence information and fewer long sequences. Trinity is therefore recommended for assembling viral genomes from transcriptome data.
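The comparison above rests on how much sequence each assembler produces and how many of its sequences are long. The following is a minimal sketch of how such summary figures could be computed from an assembler's FASTA output. It is not the authors' pipeline: the function names, the choice of N50 as an extra metric, and the 1000-nt cutoff for a "long" sequence are all assumptions made for illustration.

```python
from typing import Dict, List


def read_fasta_lengths(fasta_text: str) -> List[int]:
    """Return the length of every sequence in a FASTA-formatted string."""
    lengths: List[int] = []
    current = None  # length of the record being read; None before the first header
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that sequences of length >= L cover at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


def summarize(fasta_text: str, long_cutoff: int = 1000) -> Dict[str, int]:
    """Summary metrics for one assembly; long_cutoff is an assumed threshold."""
    lengths = read_fasta_lengths(fasta_text)
    return {
        "contigs": len(lengths),
        "total_bases": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
        "long_contigs": sum(1 for n in lengths if n >= long_cutoff),
    }
```

Running `summarize` on the contig file produced by each assembler and placing the resulting dictionaries side by side gives one simple way to compare sequence yield and the number of long sequences across tools.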
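Cite this article: Tu, Z., Zhong, Z.Y., Meng, F.Y., Zou, Y., Li, J.W., Yu, T., Zheng, J.T., Xia, J.H., Zhang, S. and Nie, B.H. (2022) Comparison of Three Software Tools for Virus Genome Assembly Based on Potato Transcriptome Data. Hans Journal of Computational Biology, 12(3), 40-48. https://doi.org/10.12677/HJCB.2022.123006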
