From: Tabular and latent space synthetic data generation: a literature review
Reference | Data type | ML problem | Domain | Observations |
---|---|---|---|---|
[4] | — | Data privacy | Finance | Analysis of applications, motivation and properties of synthetic data for anonymization. |
[20] | Tabular | Data privacy | Healthcare | Focus on GANs. |
[21] | Tabular | Data privacy | Statistics | Focus on general definitions such as differential privacy and statistical disclosure control. |
[22] | Tabular | Imbalanced learning | Various | Focus on oversampling with GANs in cybersecurity and finance. |
[24] | Text | Classification | — | Classifies 100 methods into 12 groups. |
[25] | Text | Deep learning | — | General overview of text data augmentation. |
[26] | Text | Few-shot learning | — | Augmentation techniques for machine learning with limited data. |
[14] | Text | — | — | Overview of augmentation techniques and applications on NLP tasks. |
[27] | Text | — | Various | Analysis of industry use cases of data augmentation in NLP. Emphasis on input-level data augmentation. |
[23] | Image | Segmentation | Medicine | Analysis of algorithmic applications on a 2018 brain-tumor segmentation challenge. |
[28] | Image | Imbalanced learning | — | Emphasis on GANs. |
[13] | Image | — | Medicine | Emphasis on GANs. |
[29] | Image | Deep learning | — | Regularization techniques using facial image data. Emphasis on deep learning generative models. |
[30] | Image | Deep learning | — | Emphasis on data augmentation as a regularization technique. |
[31] | Image | — | — | Broad overview of image data augmentation. Emphasis on traditional approaches. |
[32] | Image | — | Various | General overview of image data augmentation and relevant domains of application. |
[33] | Time series | Classification | — | Defines a taxonomy for time series data augmentation. |
[34] | Time series | Various | — | Analysis of data augmentation methods for classification, anomaly detection and forecasting. |
[35] | Graph | Various | — | Graph data augmentation for supervised and self-supervised learning. |