Influence of Uninformative Prior Distributions for MCMC Method on Estimating Variance Components in Generalizability Theory
Guangming Li
Abstract
The Markov chain Monte Carlo (MCMC) method is increasingly used to estimate variance components in generalizability theory (GT). However, uninformative priors, an essential part of the MCMC method, have not been systematically examined, and GT studies differ in which uninformative priors they use. This study examined the effect of different uninformative priors on the estimation of variance components. Based on a $p \times i \times r$ design, eight uninformative prior distributions were chosen for a simulation study and an empirical study: $\sigma^2 \sim \text{inv-gamma}(0.001, 0.001)$ [prior 1], $\sigma^2 \sim \text{inv-gamma}(1, 1)$ [prior 2], $\sigma^2 \sim \text{uniform}(0.001, 1000)$ [prior 3], $\sigma \sim \text{uniform}(0, 100)$ [prior 4], $\log(\sigma^2) \sim \text{uniform}(-10, 10)$ [prior 5], $\frac{1}{\sigma^2} \sim \text{pareto}(1, 0.001)$ [prior 6], $\frac{\sigma^2}{(\sigma^2 + \tau^2)^2} \sim \text{uniform}$ [prior 7], and $\frac{\sigma^2}{2\tau(\sigma + \tau)^2} \sim \text{uniform}(0, 1)$ [prior 8]. The three posterior point estimates (mean, median, and mode) were also calculated, both with full data and with 10% missing/sparse data. The simulation and empirical results show that: (1) $\sigma^2 \sim \text{inv-gamma}(0.001, 0.001)$ [prior 1] performs best and most stably in posterior point estimation in most scenarios, whereas $\frac{1}{\sigma^2} \sim \text{pareto}(1, 0.001)$ [prior 6] is always the worst; (2) the differences among methods appear mainly in the variance components $\sigma_i^2$ and $\sigma_r^2$, and prior 6 shows extreme bias values, with maxima reaching 281.09 and 167.59; (3) posterior mean estimates always produce the largest biases, whereas posterior median estimates perform best; (4) the differences among uninformative priors in estimating variance components grow when the number of levels of the variance components is small; (5) the results for full data and for 10% missing/sparse data are nearly the same, so this small amount of missing/sparse data has minimal impact on the results.
The running times of the eight prior distributions range from 489.78 to 692.58 seconds and do not differ greatly from one another.
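Because the priors above are placed on different scales ($\sigma^2$, $\sigma$, and $\log(\sigma^2)$), it can help to express them as densities on the common scale $\sigma^2$. The following is a minimal Python sketch for priors 1 to 5 only, not the study's implementation. It assumes that inv-gamma(a, b) denotes shape a and scale b, which the abstract does not state, and all function names are hypothetical. Priors 6 to 8 are omitted because their parameterizations cannot be determined from the abstract alone.

```python
import math


def log_inv_gamma(var, shape, scale):
    """Log density of inv-gamma(shape, scale) at var = sigma^2.

    Assumption: (shape, scale) parameterization; the abstract does not say.
    """
    if var <= 0:
        return -math.inf
    return (shape * math.log(scale) - math.lgamma(shape)
            - (shape + 1) * math.log(var) - scale / var)


def log_uniform_on_var(var, low=0.001, high=1000.0):
    """sigma^2 ~ uniform(low, high): flat density on the variance itself."""
    if not (low < var < high):
        return -math.inf
    return -math.log(high - low)


def log_uniform_on_sd(var, high=100.0):
    """sigma ~ uniform(0, high), rewritten as a density on sigma^2.

    Change of variables adds the Jacobian |d sigma / d sigma^2| = 1 / (2 sigma).
    """
    if var <= 0:
        return -math.inf
    sd = math.sqrt(var)
    if sd >= high:
        return -math.inf
    return -math.log(high) - math.log(2.0 * sd)


def log_uniform_on_log_var(var, low=-10.0, high=10.0):
    """log(sigma^2) ~ uniform(low, high), rewritten as a density on sigma^2.

    Change of variables adds the Jacobian 1 / sigma^2.
    """
    if var <= 0 or not (low < math.log(var) < high):
        return -math.inf
    return -math.log(high - low) - math.log(var)


# Priors 1-5 from the abstract, keyed by their labels there.
LOG_PRIORS = {
    "prior 1": lambda v: log_inv_gamma(v, 0.001, 0.001),
    "prior 2": lambda v: log_inv_gamma(v, 1.0, 1.0),
    "prior 3": log_uniform_on_var,
    "prior 4": log_uniform_on_sd,
    "prior 5": log_uniform_on_log_var,
}


def log_prior_table(variances):
    """Evaluate every log prior density at each candidate sigma^2 value."""
    return {name: [f(v) for v in variances] for name, f in LOG_PRIORS.items()}


if __name__ == "__main__":
    for name, values in log_prior_table([0.01, 1.0, 100.0]).items():
        print(name, [round(x, 3) for x in values])
```

Comparing the rows of the printed table shows how much relative weight each prior places on small versus large variances, which is one way to see why nominally uninformative priors can still lead to different estimates when a facet has few levels.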