degvEFd023
Posted: Mon 8:25, 18 Apr 2011 Post subject: Missing Data Mechanisms
As nearly any researcher can attest, missing data are a widespread problem. Data from surveys, experiments, and secondary sources are often incomplete. The impact of the missing data on the results of statistical analysis depends on the mechanism that caused the data to be missing and the way in which the data analyst deals with it. This is the first in a series of three articles that discuss issues surrounding missing data. This article outlines the mechanisms of missing data and some of their impacts. Subsequent articles will explain common but problematic solutions to missing data, new and better solutions, and the software available for implementing these solutions.
Data are missing for many reasons. Subjects in longitudinal studies often drop out before the study is completed because they have moved out of the area, died, no longer see a personal benefit to participating, or do not like the effects of the treatment. Surveys suffer missing data when participants refuse, do not know the answer to, or deliberately skip an item. Some survey researchers even design the study so that some questions are asked of only a subset of participants. Experimental studies have missing data when a researcher is simply unable to collect an observation. Bad weather conditions may render observation impossible in field experiments. A researcher becomes sick or equipment fails. Data may be missing in any type of study due to accident or data entry error. A researcher drops a tray of test tubes. A data file becomes corrupt. Most researchers are very familiar with one (or more) of these situations.
Missing data are problematic because most statistical procedures require a value for each variable. When a data set is incomplete, the data analyst has to decide how to deal with it. The most common decision is to use complete case analysis (also called listwise deletion): analyzing only the cases with complete data. Individuals with data missing on any variable are dropped from the analysis. This approach has advantages: it is easy to use, is very simple, and is the default in most statistical packages. But it has limitations. It can substantially lower the sample size, leading to a severe lack of power. This is especially true if there are many variables included in the analysis, each with data missing for a few cases. It can also lead to biased results, depending on why the data are missing.
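To make the sample-size point concrete, here is a minimal sketch in plain Python. The data are simulated, and the helper names (`make_data`, `listwise_delete`), the 10 variables, and the 5% missing rate per variable are all assumptions chosen for illustration, not figures from any real study.

```python
import random

def make_data(n_cases, n_vars, p_missing, seed=0):
    """Simulate cases where each variable is independently missing (None)
    with probability p_missing. Purely illustrative data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_cases):
        row = {}
        for j in range(n_vars):
            value = rng.gauss(0, 1)
            row[f"x{j}"] = None if rng.random() < p_missing else value
        rows.append(row)
    return rows

def listwise_delete(rows):
    """Complete case analysis: keep only cases with a value on every variable."""
    return [row for row in rows if all(v is not None for v in row.values())]

data = make_data(n_cases=1000, n_vars=10, p_missing=0.05)
complete = listwise_delete(data)

# Each variable is missing for only about 5% of cases, yet the share of
# fully complete cases is expected to be around 0.95 ** 10, i.e. about 60%.
print(f"cases before: {len(data)}, complete cases: {len(complete)}")
```

Even though no single variable is missing very often, a case is dropped if it is missing on any one of them, so the losses compound across variables.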
All of the causes for missing data fit into four classes, which are based on the relationship between the missing data mechanism and the missing and observed values. These classes are important to understand because the problems caused by missing data, and the solutions to these problems, are different for the four classes.
The first is Missing Completely at Random (MCAR). MCAR means that the missing data mechanism is unrelated to the values of any variables, whether missing or observed. Data that are missing because a researcher dropped the test tubes or survey participants accidentally skipped questions are likely to be MCAR. If the observed values are essentially a random sample of the full data set, complete case analysis gives the same results as the full data set would have. Unfortunately, most missing data are not MCAR.
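A small simulation can illustrate why MCAR is the benign case. This is only a sketch under stated assumptions: the variable is simulated (mean 50, standard deviation 10), the 30% loss rate is arbitrary, and the second mechanism, where larger values are lost more often, is a hypothetical contrast added to show what happens when missingness is not unrelated to the values.

```python
import random
from statistics import mean

rng = random.Random(42)

# Full data set that we would have seen with nothing missing (simulated)
full = [rng.gauss(50, 10) for _ in range(20000)]

# MCAR: every value has the same 30% chance of being lost,
# regardless of how large or small it is
mcar_observed = [x for x in full if rng.random() >= 0.30]

# Hypothetical contrast: values above 55 are lost 60% of the time,
# all others 10% of the time, so missingness depends on the value itself
dependent_observed = [
    x for x in full if rng.random() >= (0.60 if x > 55 else 0.10)
]

print("full mean:                ", round(mean(full), 2))
print("MCAR complete-case mean:  ", round(mean(mcar_observed), 2))
print("value-dependent case mean:", round(mean(dependent_observed), 2))
```

Under MCAR the observed values behave like a random sample of the full data, so the complete-case mean should land very close to the full mean, with only the sample size reduced. In the contrast case the complete-case mean is pulled downward, because high values are the ones most likely to go missing.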
At the opposite end of the spectrum is No