Prepare for the AWS Certified AI Practitioner Exam with flashcards and multiple choice questions. Each question includes hints and explanations to help you succeed on your test. Get ready for certification!

What type of model should a company use to generate synthetic data based on existing datasets?

  1. Generative adversarial network (GAN)

  2. XGBoost

  3. Residual neural network

  4. WaveNet

The correct answer is: Generative adversarial network (GAN)

A generative adversarial network (GAN) is designed to generate new data that mimics an existing dataset. A GAN has two components: a generator, which creates synthetic samples, and a discriminator, which tries to tell generated samples apart from real ones. The two are trained against each other: the discriminator gets better at spotting fakes, and the generator gets better at fooling it. Over many iterations the generator can produce highly realistic synthetic data, which makes GANs a good fit when additional training data is needed or when real data cannot be used for privacy reasons.

The other options do not fit the task. XGBoost is a gradient-boosted decision tree library for supervised learning tasks such as classification and regression; it predicts labels rather than generating data. A residual neural network (ResNet) is a deep architecture built around skip connections, typically used for tasks such as image classification, and is not a generative model by itself. WaveNet is a generative model, but it is built for raw audio waveforms rather than general-purpose data synthesis.

Therefore, a GAN is the most suitable choice for creating synthetic data from existing datasets.
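To make the generator and discriminator loop concrete, here is a minimal sketch in plain Python. It is a toy under stated assumptions, not a production GAN: the data is one-dimensional, the generator is a linear function `G(z) = a*z + b`, the discriminator is a logistic function `D(x) = sigmoid(w*x + c)`, and the names `train_toy_gan` and `generate` are made up for illustration. A discriminator this simple can only pick up coarse differences between real and generated values, so the point is the alternating training steps rather than the quality of the output.

```python
import math
import random


def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def train_toy_gan(real_data, steps=2000, lr=0.01, seed=0):
    """Alternate discriminator and generator updates on 1-D data.

    Hypothetical toy setup: both models have two scalar parameters
    and gradients are written out by hand.
    """
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters: G(z) = a*z + b
    w, c = 0.1, 0.0  # discriminator parameters: D(x) = sigmoid(w*x + c)

    for _ in range(steps):
        x_real = rng.choice(real_data)  # one real sample
        z = rng.gauss(0.0, 1.0)         # random noise input
        x_fake = a * z + b              # one generated sample

        # Discriminator step: gradient ascent on
        # log D(x_real) + log(1 - D(x_fake)).
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        w += lr * ((1.0 - d_real) * x_real - d_fake * x_fake)
        c += lr * ((1.0 - d_real) - d_fake)

        # Generator step: gradient ascent on log D(G(z)),
        # i.e. push the fake sample toward "looks real".
        d_fake = sigmoid(w * x_fake + c)
        grad_x = (1.0 - d_fake) * w  # d/dx_fake of log D(x_fake)
        a += lr * grad_x * z
        b += lr * grad_x

    return (a, b), (w, c)


def generate(gen_params, n, seed=1):
    # Draw n synthetic values from the trained generator.
    a, b = gen_params
    rng = random.Random(seed)
    return [a * rng.gauss(0.0, 1.0) + b for _ in range(n)]
```

A call such as `gen, disc = train_toy_gan([4.0, 3.5, 4.5, 5.0, 3.0])` followed by `generate(gen, 10)` returns ten synthetic values. Real GANs replace both linear models with neural networks and are usually trained with a deep learning framework, but the alternating update pattern is the same.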