Latent variable mixture models to test for differential item functioning: a population-based analysis

Xiuyun Wu, Richard Sawatzky, Wilma Hopman, Nancy Mayo, Tolulope T Sajobi, Juxin Liu, Jerilynn Prior, Alexandra Papaioannou, Robert G Josse, Tanveer Towheed, K Shawn Davison, Lisa M Lix

doi:10.1186/s12955-017-0674-0

Xiuyun Wu et al., 2017

Health and quality of life outcomes, 15(1), 102

DOI: 10.1186/s12955-017-0674-0; PMID: 28506313

Abstract

Background: Comparisons of population health status using self-report measures such as the SF-36 rest on the assumption that the measured items have a common interpretation across sub-groups. However, self-report measures may be sensitive to differential item functioning (DIF), which occurs when sub-groups with the same underlying health status have a different probability of item response. This study tested for DIF on the SF-36 physical functioning (PF) and mental health (MH) sub-scales in population-based data using latent variable mixture models (LVMMs).

Methods: Data were from the Canadian Multicentre Osteoporosis Study (CaMos), a prospective national cohort study. LVMMs were applied to the ten PF and five MH SF-36 items. A standard two-parameter graded response model with one latent class was compared to multi-class LVMMs. Multivariable logistic regression models with pseudo-class random draws characterized the latent classes on demographic and health variables.

Results: The CaMos cohort consisted of 9423 respondents. A three-class LVMM fit the PF sub-scale, with class proportions of 0.59, 0.24, and 0.17. For the MH sub-scale, a two-class model fit the data, with class proportions of 0.69 and 0.31. For PF items, the probabilities of reporting greater limitations were consistently higher in classes 2 and 3 than class 1. For MH items, respondents in class 2 reported more health problems than in class 1. Differences in item thresholds and factor loadings between one-class and multi-class models were observed for both sub-scales. Demographic and health variables were associated with class membership.

Conclusions: This study revealed DIF in population-based SF-36 data; the results suggest that PF and MH sub-scale scores may not be comparable across sub-groups defined by demographic and health status variables, although effects were frequently small to moderate in size. Evaluation of DIF should be a routine step when analysing population-based self-report data to ensure valid comparisons amongst sub-groups.
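
As a rough illustration of what DIF means under a graded response model, the sketch below computes response-category probabilities for a single ordinal item in two latent classes that share the same latent health level and factor loading but differ in item thresholds. The logistic link, the three-category item, the function name `grm_category_probs`, and all parameter values are assumptions chosen for illustration; they are not the models or estimates reported in the study.

```python
import math

def grm_category_probs(theta, loading, thresholds):
    """Category probabilities for one ordinal item under a logistic
    graded response model (illustrative parameterization).

    P(Y >= k | theta) = logistic(loading * theta - thresholds[k - 1]),
    with thresholds given in increasing order.
    """
    # Cumulative probabilities P(Y >= k), padded with 1 (lowest) and 0 (beyond highest).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-(loading * theta - t))) for t in thresholds]
    cum += [0.0]
    # Each category probability is the difference of adjacent cumulative probabilities.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

# Hypothetical parameters: same latent score and loading, different thresholds per class.
class1 = grm_category_probs(theta=0.0, loading=1.5, thresholds=[-1.0, 1.0])
class2 = grm_category_probs(theta=0.0, loading=1.5, thresholds=[-2.0, 0.0])

print("class 1:", [round(p, 3) for p in class1])
print("class 2:", [round(p, 3) for p in class2])
```

With these made-up values, a respondent in the second class is more likely to endorse the highest category than a respondent in the first class at the same latent score, which is the pattern that a class-specific threshold difference (DIF) would produce.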

Topics

differential item functioning SF-36, quality of life measurement bias, latent variable mixture models health surveys, psychometric validity population surveys, SF-36 physical functioning mental health, self-report measure comparability, health status measurement methods, survey response bias detection

Cite this article

Wu, X. Y., Sawatzky, R., Hopman, W., Mayo, N., Sajobi, T., Liu, J., Prior, J. C., Papaioannou, A., Josse, R., Towheed, T., Davison, K. S., & Lix, L. (2017). Latent variable mixture models to test for differential item functioning: a population-based analysis. *Health and quality of life outcomes*, *15*(1), 102. https://doi.org/10.1186/s12955-017-0674-0