Academic Lectures

[75th Anniversary Academic Celebration: BG电子 Lecture Series] Preview: Yijuan Hu: Integrative analysis of 16S marker-gene and shotgun metagenomic sequencing data improves efficiency of testing microbiome hypotheses

Posted by: Shen Tong (沈彤) | Posted on: 2023-05-29

Title: Integrative analysis of 16S marker-gene and shotgun metagenomic sequencing data improves efficiency of testing microbiome hypotheses

Speaker: Yijuan Hu (胡懿娟), School of Public Health, Emory University, USA

Time: June 2, 2023, 10:00-11:00

Venue: Room 201, Wenbo Building (文波楼)

Abstract: The most widely used technologies for profiling microbial communities are 16S marker-gene sequencing and shotgun metagenomic sequencing. Surprisingly, many microbiome studies have performed both experiments on the same cohort of samples. The two datasets often yield consistent patterns in taxonomic profiles, highlighting the potential for an integrative analysis to improve the power of testing these patterns. However, each dataset is subject to distinct experimental biases that systematically distort the measurements from their actual values in an experiment-specific manner. These experimental biases, together with partially overlapping samples and differential library sizes between the two datasets, pose tremendous challenges when combining the datasets. In this article, we introduce the first method, named LOCOM-I, for such an integrative analysis. The new method is based on our LOCOM model (Hu et al., 2022, PNAS), which employs logistic regression for testing differential abundance of taxa while remaining robust to experimental bias. Our new method combines data from both experiments for differential abundance tests while accounting for differential experimental biases, assigning adaptive weights to each observation, and accommodating samples and taxa unique to one experiment. To benchmark the performance of the new method, we introduce two ad hoc approaches: applying LOCOM to pooled taxa count data, and combining LOCOM p-values from analyzing each dataset separately. We demonstrate the uniform superiority of the new method through extensive simulation studies. An application to two real studies uncovered scientifically plausible findings that would have been missed by analyzing the individual datasets.

 

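About the speaker: Yijuan Hu is a professor in the Department of Biostatistics and Bioinformatics at the School of Public Health, Emory University, USA. She received her bachelor's degree from the Department of Probability and Statistics, School of Mathematical Sciences, Peking University (2005) and her Ph.D. in biostatistics from the University of North Carolina at Chapel Hill (2011). Her research focuses on developing statistical theory and methods for high-dimensional, high-noise omics data in biostatistics, with particular attention to high-dimensional hypothesis testing, robust inference, and missing or biased data in microbiome and genetic data. Her representative work has been published in journals including the Journal of the American Statistical Association (JASA), Proceedings of the National Academy of Sciences (PNAS), Microbiome, and the American Journal of Human Genetics (AJHG).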