Removal
Removal, also known as listwise deletion, is the process of removing data points or rows that have missing values.
How It Works
If a data point or a row in the dataset has one or more missing values, it is removed from the dataset.
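The rule above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the helper names `is_missing` and `listwise_delete` and the sample rows are hypothetical, and it assumes missing values are encoded as `None` or as a float NaN.

```python
import math

def is_missing(value):
    """Treat None and float NaN as missing (an assumed encoding)."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def listwise_delete(rows):
    """Keep only the rows in which no value is missing."""
    return [row for row in rows if not any(is_missing(v) for v in row.values())]

# Made-up example data: two of the four rows contain a missing value.
rows = [
    {"age": 34, "income": 52000.0},
    {"age": None, "income": 48000.0},
    {"age": 29, "income": float("nan")},
    {"age": 41, "income": 61000.0},
]

complete = listwise_delete(rows)
print(len(complete))  # 2
```

Note that a row is dropped even when only one of its values is missing, so the observed values in that row are discarded as well.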
Benefits
- Simplicity: It’s a straightforward method that requires no modeling or estimation, only a filter on incomplete rows.
- No Fabricated Values: Every value in the resulting dataset was actually observed; nothing is estimated or filled in.
Limitations
- Data Loss: It can lead to significant data loss if many data points have missing values.
- Bias: It can introduce bias if the data are not missing completely at random (MCAR), because the remaining rows may no longer represent the full population.
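To get a feel for how quickly data loss adds up, the sketch below estimates the share of rows that survive removal. It rests on a simplifying assumption that is not part of the text above: every column is missing independently with the same rate. The function name `expected_complete_fraction` is hypothetical.

```python
def expected_complete_fraction(missing_rate, n_columns):
    """Expected share of fully observed rows, assuming each column is
    missing independently with the same probability (a simplification)."""
    return (1.0 - missing_rate) ** n_columns

# 10 columns, each missing 5% of the time.
fraction = expected_complete_fraction(0.05, 10)
print(f"{fraction:.3f}")  # 0.599
```

Under these assumptions, a modest 5% missing rate per column still removes roughly 40% of the rows. Real datasets rarely satisfy the independence assumption, so this is only a rough guide.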
Features
- Complete Cases: It results in a dataset with complete cases only.
Use Cases
- Small Amount of Missing Data: It’s most suitable when only a small fraction of rows have missing values, so little information is discarded.
Mean Imputation
Mean imputation is the process of replacing each missing value of a variable with the mean of that variable’s observed (non-missing) values.
How It Works
For each variable, the mean of the non-missing values is computed, and every missing value of that variable is replaced with it.
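A minimal sketch of this procedure for a single numeric variable follows. The helper names `is_missing` and `mean_impute` and the sample values are hypothetical, and missing values are assumed to be encoded as `None` or as a float NaN.

```python
import math
from statistics import fmean

def is_missing(value):
    """Treat None and float NaN as missing (an assumed encoding)."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def mean_impute(values):
    """Replace each missing entry with the mean of the observed entries."""
    observed = [v for v in values if not is_missing(v)]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    mean = fmean(observed)
    return [mean if is_missing(v) else v for v in values]

# Made-up example: two of the five ages are missing.
ages = [30.0, None, 40.0, float("nan"), 50.0]
imputed = mean_impute(ages)
print(imputed)         # [30.0, 40.0, 40.0, 40.0, 50.0]
print(fmean(imputed))  # 40.0
```

The mean of the filled-in series equals the mean of the observed values (40.0 here), because every inserted value is that mean. With several variables, the same function would be applied to each column separately.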
Benefits
- Preserves Mean: Because every filled-in value equals the observed mean, the sample mean of each variable is unchanged.