Statistical inference for cell type deconvolution
Dongyue Xie et al.
Abstract
Integrating heterogeneous datasets across different measurement platforms poses fundamental challenges for statistical inference. An important example is cell type deconvolution, where cell type proportions in bulk RNA-seq data are estimated using reference single-cell data from different sources, leading to platform-specific scaling effects, measurement noise, and biological heterogeneity. Existing methods often treat estimated proportions as observed in downstream analyses, potentially compromising validity when comparing multiple individuals. We introduce measurement error adjusted deconvolution, a statistical framework for estimation and inference in deconvolution with externally approximated design matrices. We establish necessary and sufficient conditions for identifiability under arbitrary gene-specific cross-platform scaling differences and develop valid inferential procedures for both individual-level proportions and comparisons across individuals, accounting for gene–gene correlation and shared estimation uncertainty. Simulations and real-data analyses demonstrate competitive estimation accuracy and reliable statistical inference.
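To make the deconvolution setup concrete, the sketch below simulates a bulk expression vector as a mixture of reference cell type profiles and recovers the proportions by ordinary least squares, then clips and renormalizes. It is a generic baseline under simplifying assumptions and not the measurement error adjusted procedure the abstract describes: the reference matrix is treated as known exactly, there is no cross-platform scaling, and genes are independent. The function name `estimate_proportions` and all simulation settings are hypothetical.

```python
import numpy as np

def estimate_proportions(bulk, reference):
    """Naive deconvolution baseline (an illustrative assumption, not the paper's method).

    Solves bulk ~= reference @ beta by ordinary least squares, clips
    negative coefficients to zero, and rescales so the result sums to one.
    """
    beta, *_ = np.linalg.lstsq(reference, bulk, rcond=None)
    beta = np.clip(beta, 0.0, None)
    total = beta.sum()
    if total == 0:
        raise ValueError("all estimated coefficients are non-positive")
    return beta / total

# Simulated data: 200 genes, 3 cell types (hypothetical settings).
rng = np.random.default_rng(0)
n_genes, n_cell_types = 200, 3
reference = rng.gamma(shape=2.0, scale=1.0, size=(n_genes, n_cell_types))
true_props = np.array([0.6, 0.3, 0.1])

# Bulk expression is the proportion-weighted mix of the profiles plus small noise.
bulk = reference @ true_props + rng.normal(scale=0.01, size=n_genes)

estimated = estimate_proportions(bulk, reference)
print(np.round(estimated, 3))
```

This baseline treats the reference matrix as error-free and returns point estimates only. The abstract's concern is precisely the case where the reference is itself estimated from another platform, so that uncertainty has to be carried into any downstream comparison across individuals.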