ML0017 Data Augmentation

What are the common data augmentation techniques?

Answer

Data augmentation refers to techniques that increase the size and diversity of a training dataset by creating modified versions of the existing data. It is especially common in computer vision and natural language processing, where collecting large datasets can be expensive or time-consuming.

Common Techniques:

Computer Vision:
Geometric Transformations: Rotate, flip, crop, or scale images.
Color Adjustments: Change brightness, contrast, saturation, or apply color jittering.
Noise Injection: Add random noise or blur to images.
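The image techniques above can be sketched without any imaging library. The example below is a minimal illustration only: it assumes a grayscale image stored as a nested list of 0-255 integers, and the function names (horizontal_flip, random_crop, adjust_brightness, add_gaussian_noise) are hypothetical helpers, not part of any library. In practice a library such as torchvision or Albumentations would usually do this work.

```python
import random

def horizontal_flip(image):
    """Geometric transformation: mirror each row left to right."""
    return [row[::-1] for row in image]

def random_crop(image, crop_h, crop_w, rng):
    """Geometric transformation: cut a crop_h x crop_w window at a random position."""
    h, w = len(image), len(image[0])
    top = rng.randint(0, h - crop_h)    # randint is inclusive on both ends
    left = rng.randint(0, w - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]

def adjust_brightness(image, factor):
    """Color adjustment: scale pixel values and clamp them to 0-255."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]

def add_gaussian_noise(image, sigma, rng):
    """Noise injection: add zero-mean Gaussian noise to every pixel, then clamp."""
    return [[min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
            for row in image]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
image = [[0, 50, 100],
         [150, 200, 250],
         [10, 20, 30]]  # toy 3x3 grayscale image (assumed data)

flipped = horizontal_flip(image)
cropped = random_crop(image, 2, 2, rng)
brighter = adjust_brightness(image, 1.5)
noisy = add_gaussian_noise(image, 5.0, rng)
```

Each function returns a new image and leaves the original untouched, so several augmentations can be chained or applied with some probability per training sample.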

Natural Language Processing:
Synonym Replacement: Replace words with their synonyms.
Back Translation: Translate text to another language and back.
Random Insertion/Deletion: Randomly insert or delete words.
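As a rough sketch of the word-level techniques above, the following uses only the standard library. The SYNONYMS table is a tiny hypothetical stand-in for a real thesaurus (for example WordNet), and the function names are illustrative assumptions. Back translation is omitted because it needs a translation model or service.

```python
import random

# Hypothetical toy synonym table; a real pipeline would query a thesaurus.
SYNONYMS = {
    "quick": ["fast", "speedy"],
    "happy": ["glad", "cheerful"],
}

def synonym_replacement(words, rng):
    """Replace every word that has an entry in SYNONYMS with a random synonym."""
    return [rng.choice(SYNONYMS[w]) if w in SYNONYMS else w for w in words]

def random_deletion(words, p, rng):
    """Drop each word independently with probability p, keeping at least one word."""
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]

def random_insertion(words, n, rng):
    """Insert n synonyms of randomly chosen words at random positions."""
    out = list(words)
    candidates = [w for w in words if w in SYNONYMS]
    if not candidates:
        return out  # nothing to insert if no word has a known synonym
    for _ in range(n):
        synonym = rng.choice(SYNONYMS[rng.choice(candidates)])
        out.insert(rng.randint(0, len(out)), synonym)
    return out

rng = random.Random(0)  # fixed seed so the sketch is repeatable
words = "the quick dog looks happy".split()

replaced = synonym_replacement(words, rng)
deleted = random_deletion(words, 0.3, rng)
inserted = random_insertion(words, 2, rng)
```

Edits like these can change the meaning of a sentence, so the deletion probability and the number of insertions are typically kept small.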

Tabular Data:
SMOTE (Synthetic Minority Over-sampling Technique): Generate synthetic data points for minority classes by interpolating between a minority sample and one of its nearest minority-class neighbors.
Noise Injection: Add small random noise to numeric features.
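The two tabular techniques can be illustrated with a simplified, standard-library sketch. The smote_like function below follows the core SMOTE idea (interpolating between a minority sample and one of its k nearest minority neighbors) but is not a full or reference implementation; the function names and the toy data are assumptions made for illustration. A maintained implementation is available in the imbalanced-learn package.

```python
import math
import random

def smote_like(minority, n_new, k, rng):
    """Create n_new synthetic rows by interpolating between minority-class rows.

    Simplified sketch: requires at least two minority rows and numeric features.
    """
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbors of x by Euclidean distance, excluding x itself
        neighbors = sorted((p for p in minority if p is not x),
                           key=lambda p: math.dist(x, p))[:k]
        neighbor = rng.choice(neighbors)
        u = rng.random()  # position along the segment from x to the neighbor
        synthetic.append([a + u * (b - a) for a, b in zip(x, neighbor)])
    return synthetic

def jitter(rows, sigma, rng):
    """Noise injection: add zero-mean Gaussian noise to each numeric feature."""
    return [[v + rng.gauss(0, sigma) for v in row] for row in rows]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]  # toy minority-class rows

synthetic = smote_like(minority, n_new=5, k=2, rng=rng)
noisy = jitter(minority, sigma=0.01, rng=rng)
```

Because each synthetic row lies on a line segment between two real minority rows, it stays inside the range the minority class already covers. The noise scale sigma would normally be chosen relative to each feature's spread.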

