Title: Microbiome data analysis using zero-inflated probabilistic PCA models
Time: Tuesday, November 22, 2022, 3:00-5:00 PM
Tencent Meeting ID: 811-324-675
Abstract: The analysis of microbiome data is complicated by several statistical challenges. In particular, microbiome data produced by high-throughput sequencing are count-valued, correlated, high-dimensional, over-dispersed with excess zeros, and compositional. To describe and simulate microbial community data, we introduce a general framework called Zero-Inflated Probabilistic PCA (ZIPPCA) by extending probabilistic PCA from the Gaussian setting to multivariate and sparse count data. We propose a negative binomial ZIPPCA model for dimension reduction and data visualization, and a logistic normal multinomial ZIPPCA model for inferring microbial compositions. Using the negative binomial ZIPPCA model, we also propose an accurate and robust method for denoising microbiome data. We develop efficient variational approximation algorithms for maximum likelihood estimation and inference. We demonstrate the performance of the proposed methods on real microbiome data sets.
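Speaker bio: Dr. Tao Wang is a tenured associate professor at Shanghai Jiao Tong University, a researcher at the SJTU-Yale Joint Center for Biostatistics and Data Science, and an Elected Member of the International Statistical Institute. Research interests: biostatistics and statistical inference for high-dimensional data. Dr. Wang has published more than forty papers in journals including JASA, JRSSB, Biometrika, Genome Biology, Briefings in Bioinformatics, and Bioinformatics, and has served as principal investigator on a General Program grant and an Excellent Young Scientists Fund grant from the National Natural Science Foundation of China.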