[Article share] 16S rRNA gene copy numbers remain difficult to correct (Part 1) | Majorbio Cloud Forum


Tags: Microbial ecology, Case studies, Analysis methods, Background knowledge

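I have only recently started working on microbial diversity and still have a lot of background knowledge to catch up on. While studying the structure of 16S rRNA I came across the concept of gene copy number, found it interesting, and read some of the literature. Here I would like to share a 2018 paper on gene copy number, partly to keep myself studying and partly in the hope that it helps anyone who has had the same questions.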

Feel free to discuss it with me so we can all learn together~

Since the paper is fairly long, I will share it in installments. Today's post covers the abstract.


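Title: 16S rRNA gene copy numbers remain difficult to correct (original English title: "Correcting for 16S rRNA gene copy numbers in microbiome surveys remains an unsolved problem")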

Impact factor (IF): 10.465

DOI:10.1186/s40168-018-0420-9


English abstract:


The 16S ribosomal RNA gene is the most widely used marker gene in microbial ecology. Counts of 16S sequence variants, often in PCR amplicons, are used to estimate proportions of bacterial and archaeal taxa in microbial communities. Because different organisms contain different 16S gene copy numbers (GCNs), sequence variant counts are biased towards clades with greater GCNs. Several tools have recently been developed for predicting GCNs using phylogenetic methods and based on sequenced genomes, in order to correct for these biases. However, the accuracy of those predictions has not been independently assessed. Here, we systematically evaluate the predictability of 16S GCNs across bacterial and archaeal clades, based on 6,800 public sequenced genomes and using several phylogenetic methods. Further, we assess the accuracy of GCNs predicted by three recently published tools (PICRUSt, CopyRighter, and PAPRICA) over a wide range of taxa and for 635 microbial communities from varied environments. We find that regardless of the phylogenetic method tested, 16S GCNs could only be accurately predicted for a limited fraction of taxa, namely taxa with closely to moderately related representatives (≤15% divergence in the 16S rRNA gene). Consistent with this observation, we find that all considered tools exhibit low predictive accuracy when evaluated against completely sequenced genomes, in some cases explaining less than 10% of the variance. 
Substantial disagreement was also observed between tools (R² < 0.5) for the majority of tested microbial communities. The nearest sequenced taxon index (NSTI) of microbial communities, i.e., the average distance to a sequenced genome, was a strong predictor for the agreement between GCN prediction tools on non-animal-associated samples, but only a moderate predictor for animal-associated samples. We recommend against correcting for 16S GCNs in microbiome surveys by default, unless OTUs are sufficiently closely related to sequenced genomes or unless a need for true OTU proportions warrants the additional noise introduced, so that community profiles remain interpretable and comparable between studies.


Keywords: 16S rRNA, Gene copy number, Microbiome surveys, Phylogenetic reconstruction
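
To make the bias described in the abstract concrete, here is a minimal Python sketch of the usual correction: divide each taxon's read count by its 16S copy number, then renormalize. The taxon names, read counts, and copy numbers are made-up values for illustration, and `correct_for_gcn` is a hypothetical helper, not code from PICRUSt, CopyRighter, or PAPRICA.

```python
def correct_for_gcn(read_counts, copy_numbers):
    """Divide each taxon's read count by its 16S copy number, then renormalize."""
    adjusted = {taxon: read_counts[taxon] / copy_numbers[taxon]
                for taxon in read_counts}
    total = sum(adjusted.values())
    return {taxon: value / total for taxon, value in adjusted.items()}


# Hypothetical community with equal cell numbers of two taxa.
# taxon_B carries 7 copies of the 16S gene, so it yields 7x the reads.
read_counts = {"taxon_A": 100, "taxon_B": 700}
true_gcn = {"taxon_A": 1, "taxon_B": 7}

# Uncorrected proportions are skewed towards the high-GCN taxon.
total_reads = sum(read_counts.values())
raw = {taxon: count / total_reads for taxon, count in read_counts.items()}

# Correction with the true GCNs recovers the equal proportions.
corrected = correct_for_gcn(read_counts, true_gcn)

# Correction with a mispredicted GCN (assumed: 4 instead of 7 for taxon_B)
# still leaves the proportions wrong.
mispredicted = correct_for_gcn(read_counts, {"taxon_A": 1, "taxon_B": 4})

print(raw)
print(corrected)
print(mispredicted)
```

With the true copy numbers the two taxa come out at equal proportions, while the mispredicted GCN leaves taxon_A at about 36%. This is the kind of extra noise the abstract refers to when GCN predictions are inaccurate.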



Translated abstract:


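The 16S ribosomal RNA gene is the most widely used marker gene in microbial ecology. Counts of 16S sequence variants, usually obtained from PCR amplicons, are used to estimate the proportions of bacteria and archaea in a microbial community. Because different organisms carry different numbers of 16S gene copies (gene copy number, GCN), taxa with higher GCNs yield more sequences during sequencing, so their proportions tend to be overestimated. To correct for this bias, several tools have been developed that predict GCNs from sequenced genomes using phylogenetic methods. However, the accuracy of these predictions had not been independently assessed. Based on 6,800 publicly available sequenced genomes and several phylogenetic methods, this paper systematically evaluates how predictable 16S GCNs are across bacteria and archaea. It also assesses the accuracy of the GCNs predicted by three recently published tools (PICRUSt, CopyRighter, and PAPRICA) across a wide range of taxa and for 635 microbial communities from varied environments. The authors find that, whichever phylogenetic method is used, 16S GCNs can be predicted accurately for only a limited fraction of taxa, namely those with closely to moderately related sequenced representatives (≤15% divergence in the 16S rRNA gene). Consistent with this, all of the tools show low predictive accuracy when evaluated against completely sequenced genomes, in some cases explaining less than 10% of the variance. The tools also disagree substantially with one another (R² < 0.5) for most of the communities tested. The nearest sequenced taxon index (NSTI) of a community, i.e. the average distance to a sequenced genome, strongly predicts how well the tools agree on non-animal-associated samples, but is only a moderate predictor for animal-associated samples. The authors therefore recommend against correcting for 16S GCNs by default in microbiome surveys, unless the OTUs are sufficiently closely related to sequenced genomes or the need for true OTU proportions justifies the extra noise that correction introduces, so that community profiles remain interpretable and comparable between studies.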

Keywords: 16S rRNA, gene copy number, microbiome surveys, phylogenetic reconstruction
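
The NSTI mentioned in the abstract can be illustrated with a small Python sketch as well. It assumes an abundance-weighted mean of each OTU's phylogenetic distance to its nearest sequenced genome; the abstract only says "average distance", so the weighting, the function name `nsti`, and all numbers below are assumptions for illustration only.

```python
def nsti(relative_abundances, nearest_genome_distance):
    """Abundance-weighted mean distance from each OTU to its nearest sequenced genome."""
    total = sum(relative_abundances.values())
    weighted = sum(relative_abundances[otu] * nearest_genome_distance[otu]
                   for otu in relative_abundances)
    return weighted / total


# Hypothetical community of three OTUs.
relative_abundances = {"otu1": 0.5, "otu2": 0.3, "otu3": 0.2}

# Hypothetical 16S distances to the closest sequenced genome
# (0.02 means 2% divergence).
nearest_genome_distance = {"otu1": 0.02, "otu2": 0.10, "otu3": 0.30}

community_nsti = nsti(relative_abundances, nearest_genome_distance)
print(round(community_nsti, 4))
```

A low value means most of the community sits close to a sequenced genome, which is the situation in which the abstract says GCN predictions are more trustworthy.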

 


I am still working through the rest of the paper and will post an update in a couple of days (hopefully I won't procrastinate).



大个子 2019-12-20 15:51:29