
'Mind the Gap': Simulation studies to validate the quality of multiple imputation

by Angela C. Hoelzenbein




Institution: Universität Freiburg
Department: Wirtschafts- und Verhaltenswissenschaftliche Fakultät
Degree: PhD
Year: 2015
Record ID: 1111982
Full text PDF: http://www.freidok.uni-freiburg.de/volltexte/2015/10011/


Abstract

Missing data are a common problem in empirical research (Enders, 2010) and have been associated with poor data quality and poor results. Unfortunately, many studies not only fail to report the amount of missing data but also fail to do anything about it (Karahalios, Baglietto, Carlin, English, & Simpson, 2012; Rousseau, Simon, Bertrand, & Hachey, 2011). Methods exist to tackle the problem of missing data, but many of the standard methods do so inadequately. This is because missingness can follow different patterns of randomness, and standard methods deliver unbiased results only when the missingness is completely random. Regrettably, most statistical software packages use standard methods as their default setting. Although better methods exist for missingness patterns other than complete randomness, they are neither widely known nor widely used (Karahalios et al., 2012; Peugh & Enders, 2004; Wood, White, & Thompson, 2004), which might be due to the lack of nonmathematical guidelines on this topic (Enders, 2010). This thesis gives an introduction to the pitfalls of standard methods and introduces better, modern methods for dealing with missing data. In particular, multiple imputation, a method relevant for psychological research, is introduced and discussed. To promote correct application of this method and to identify the limits of its application in psychological research, two simulation studies are conducted. In the first simulation, 1,152,000 data sets typical of psychological research (Bakker, van Dijk, & Wicherts, 2012) are created by varying five factors that potentially influence the performance of multiple imputation. The characteristic of the missingness in the data sets is also simulated. The quality of estimators relevant for psychology is evaluated after applying multiple imputation to each of the data sets.
In the second simulation study, the manipulation of the factors from the first simulation is improved and different multiple imputation algorithms are considered to assess generalizability, resulting in 1,920,000 data sets. The results of both simulation studies are assessed against the background of preliminary research; on this basis, the limits of applying multiple imputation in psychological research are discussed and new guidelines are introduced.
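
The abstract's central claim, that standard methods are unbiased only when missingness is completely random, can be illustrated with a small simulation. The sketch below is not taken from the thesis. It is a minimal illustration under stated assumptions: two standard normal variables with a hypothetical correlation of 0.5, a hypothetical logistic missingness rule, and listwise deletion (complete-case analysis) as the "standard method". All function names and parameter values are invented for this example.

```python
import math
import random
import statistics


def simulate(n, rho, seed):
    """Draw n pairs (x, y) from a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        data.append((x, y))
    return data


def delete_mcar(data, rate, seed):
    """Completely random missingness: every y is deleted with the same probability."""
    rng = random.Random(seed)
    return [(x, None if rng.random() < rate else y) for x, y in data]


def delete_mar(data, seed):
    """Missingness that depends on the observed x: larger x, more likely y is missing.

    The logistic rule below is an assumption chosen for illustration only;
    it deletes about half of the y values on average.
    """
    rng = random.Random(seed)
    result = []
    for x, y in data:
        p_missing = 1.0 / (1.0 + math.exp(-2.0 * x))
        result.append((x, None if rng.random() < p_missing else y))
    return result


def complete_case_mean(data):
    """Listwise deletion: estimate the mean of y from the observed cases only."""
    observed = [y for _, y in data if y is not None]
    return statistics.fmean(observed)


if __name__ == "__main__":
    full = simulate(n=20000, rho=0.5, seed=1)
    print("true mean of y:          0.0")
    print("complete cases, random:  %.3f" % complete_case_mean(delete_mcar(full, 0.5, seed=2)))
    print("complete cases, x-based: %.3f" % complete_case_mean(delete_mar(full, seed=3)))
```

Under these assumptions, the complete-case mean stays close to the true value of 0 when deletion is completely random, and it falls clearly below 0 when deletion depends on x, because cases with large x (and therefore, on average, large y) are removed more often. Multiple imputation, the method studied in the thesis, is intended for the second situation; this sketch shows only the problem, not the imputation step.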