Dimensionality Reduction

Dimensionality reduction is a technique used in data science to reduce the number of input variables (features) in a dataset while retaining as much of the relevant information as possible.

How It Works

Dimensionality reduction works in one of two ways: by creating new features as combinations of the original attributes (feature extraction, as in Principal Component Analysis, PCA, or autoencoders), or by identifying attributes that are redundant and removing them from the dataset (feature selection).
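
As an illustration of the "new combinations of attributes" approach, the following is a minimal sketch of PCA written with NumPy's singular value decomposition. The function name pca_reduce, the synthetic data, and the choice of two components are assumptions made for this example only; in practice a library implementation would normally be used.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components."""
    # Center each feature so the components describe variance, not the mean.
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Fraction of the total variance explained by each component.
    explained = (S ** 2) / np.sum(S ** 2)
    # Each new feature is a linear combination of the original attributes.
    return X_centered @ components.T, explained[:n_components]

# Synthetic data (an assumption for the example): five features that are
# mixtures of two underlying signals plus a small amount of noise.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = base @ mixing + 0.01 * rng.normal(size=(200, 5))

X_reduced, explained = pca_reduce(X, n_components=2)
print(X_reduced.shape)   # (200, 2)
print(explained.sum())   # share of variance kept by the two components
```

Because the five columns here are built from only two signals, two components should retain nearly all of the variance. On real data the number of components is a judgment call, often guided by the cumulative explained variance.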

Benefits

Limitations

Features

Use Cases

Self-Correlation and Removal of Highly Correlated Features

Computing the correlation of the feature set with itself (the pairwise correlation matrix) and removing highly correlated features is a feature selection method used to reduce multicollinearity in the dataset.

How It Works

This method works by calculating the correlation coefficient between all pairs of features. If the absolute value of the correlation between a pair of features exceeds a chosen threshold, one of the two features is removed, since it carries largely the same information as the other.
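
The pairwise procedure described above can be sketched as follows. This is one possible version, not a definitive implementation: the function name drop_correlated, the 0.9 threshold, the rule of keeping the first feature of each correlated pair, and the synthetic data are all assumptions made for the example. It also assumes no column is constant, since a constant column has an undefined correlation.

```python
import numpy as np

def drop_correlated(X, names, threshold=0.9):
    """Return the names of features kept after removing one of each highly correlated pair."""
    # Absolute Pearson correlation between every pair of columns.
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(corr.shape[0]):
        # Keep feature j only if it is not highly correlated with
        # any feature that has already been kept.
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return [names[j] for j in keep]

# Synthetic data (an assumption for the example): "a_scaled" is almost an
# exact multiple of "a", while "b" is generated independently.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
X = np.column_stack([a, 2 * a + 0.01 * rng.normal(size=500), b])

kept = drop_correlated(X, ["a", "a_scaled", "b"], threshold=0.9)
print(kept)  # ['a', 'b']
```

Which feature of a correlated pair to drop is a design choice. This sketch simply keeps whichever comes first; alternatives include keeping the feature that correlates more strongly with the target or the one with fewer missing values.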