A repository about generative AI, created in 2023 to host experiments, paper implementations, tutorials, and course examples (mainly for Universidad Galileo), with the goal of refreshing my memory of some things I did in the past, learning new things, and sharing (if possible).
Some things I did in the past (from around 2016-2018):
- Autoregressive:
  - "The Simpsons" TV script generator: recurrent neural networks used to generate TV scripts (text) for "The Simpsons".
  - Edgar Phillips LovePoe (generate stories combining Lovecraft and Poe): recurrent neural networks (LSTMs) used to generate stories that mix the styles of H.P. Lovecraft and Edgar Allan Poe.
- GANs:
  - DCGAN for MNIST digits and face generation: a deep convolutional GAN applied to the MNIST dataset (to generate digits) and the CelebA dataset (to generate faces).
  - DCGAN for SVHN (Street View House Numbers): a deep convolutional GAN applied to the SVHN dataset (to generate house numbers).
  - GAN for video/animation generation of "Arrival" movie circles: a GAN used to generate the signs/language from the movie "Arrival". Some results:
- Autoencoders:
  - Simple autoencoder for MNIST: a vanilla autoencoder using the MNIST dataset.
  - Convolutional autoencoder for MNIST: a convolutional autoencoder applied to the MNIST dataset.
  - Convolutional autoencoder for denoising MNIST: a denoising autoencoder applied to the MNIST dataset.
  - Variational autoencoder: a variational autoencoder applied to the MNIST dataset.
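The only real difference between the plain and denoising setups above is what the model sees as input: a denoising autoencoder is trained to reconstruct the clean image from a corrupted copy. A minimal NumPy sketch of that data preparation (toy batch and noise level are illustrative, not the notebook's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)
clean = rng.uniform(size=(32, 784))  # toy batch standing in for flattened MNIST digits

# Plain autoencoder: input and reconstruction target are the same image.
x_plain, y_plain = clean, clean

# Denoising autoencoder: corrupt the input, keep the clean image as the target.
noisy = np.clip(clean + 0.5 * rng.normal(size=clean.shape), 0.0, 1.0)
x_denoise, y_denoise = noisy, clean

assert not np.allclose(x_denoise, y_denoise)  # the model must undo the corruption
```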
- Autoregressive:
  - PixelCNN from the Pixel RNNs paper: PixelCNN applied to MNIST, from the paper "Pixel Recurrent Neural Networks".
  - LLM parameter-efficient fine-tuning: an LLM fine-tuned using parameter-efficient fine-tuning (LoRA).
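The low-rank idea behind LoRA can be shown in a few lines: the pretrained weight stays frozen and only a small low-rank update is trained. A NumPy sketch, assuming illustrative sizes and names (not the actual fine-tuning code):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 64, 64, 4                 # layer size and LoRA rank (illustrative)
W = rng.normal(size=(d_out, d_in))         # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01      # trainable low-rank factor
B = np.zeros((d_out, r))                   # trainable, zero-init so the update starts at 0
alpha = 8.0                                # LoRA scaling hyperparameter

def lora_forward(x):
    # y = W x + (alpha / r) * B A x  -- only A and B receive gradients during fine-tuning
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B zero-initialized, the adapted layer initially matches the frozen layer.
assert np.allclose(lora_forward(x), W @ x)
```

The point of the rank-`r` factorization is the parameter count: `A` and `B` together hold `r * (d_in + d_out)` trainable values instead of `d_in * d_out`.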
- Basic GAN: just a good old original GAN.
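For reference, the original GAN losses evaluated on toy discriminator outputs (the probability values are made up for illustration; a real run computes them from the networks):

```python
import numpy as np

# d_real / d_fake: the discriminator's probability that samples are real (toy values).
d_real = np.array([0.9, 0.8, 0.95])
d_fake = np.array([0.1, 0.3, 0.2])

# Discriminator minimizes binary cross-entropy against real/fake labels.
d_loss = -np.mean(np.log(d_real) + np.log(1.0 - d_fake))
# Generator: the common non-saturating heuristic, maximize log D(G(z)).
g_loss = -np.mean(np.log(d_fake))

assert d_loss > 0.0 and g_loss > 0.0
```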
- Conditional GAN: a super simple conditional GAN using MNIST (given a digit label, generate an image of that digit).
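The conditioning itself is simple: the generator gets the class label alongside the noise, the easiest scheme being a one-hot label concatenated to the noise vector. A NumPy sketch with illustrative sizes (the notebook may condition differently, e.g. via embeddings):

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, n_classes = 100, 10  # typical MNIST cGAN sizes (illustrative)

def generator_input(label):
    # Concatenate random noise with a one-hot encoding of the desired digit,
    # so the generator can produce an image of that specific class.
    z = rng.normal(size=latent_dim)
    one_hot = np.zeros(n_classes)
    one_hot[label] = 1.0
    return np.concatenate([z, one_hot])

v = generator_input(7)
assert v.shape == (latent_dim + n_classes,)
```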
- Wasserstein GAN with gradient penalty for face generation: a WGAN-GP for face generation.
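The gradient penalty term can be illustrated without a training loop: WGAN-GP evaluates the critic's input gradient at random interpolates between real and fake samples and penalizes its norm away from 1. For a toy linear critic f(x) = w·x the gradient is just w, so everything is computable by hand (a NumPy sketch with illustrative names; a real critic needs autograd):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
w = rng.normal(size=d)            # toy linear critic: f(x) = w @ x

x_real = rng.normal(size=d)
x_fake = rng.normal(size=d)

# Penalty is evaluated at a random interpolate between a real and a fake sample.
eps = rng.uniform()
x_hat = eps * x_real + (1.0 - eps) * x_fake

# For f(x) = w @ x the input gradient at x_hat is w, so the penalty
# (||grad f(x_hat)||_2 - 1)^2 is:
penalty = (np.linalg.norm(w) - 1.0) ** 2  # lambda * penalty is added to the critic loss

assert penalty >= 0.0
```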
- GAN with adaptive discriminator augmentation and kernel inception distance: a GAN with adaptive discriminator augmentation (ADA) to prevent the discriminator from overfitting, and kernel inception distance (KID) to measure the quality of the generated images.
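Kernel inception distance is an unbiased MMD² estimate between real and generated Inception features, using a cubic polynomial kernel. A self-contained NumPy sketch on random vectors standing in for Inception features (real KID extracts features with InceptionV3 and usually averages over subsets):

```python
import numpy as np

def poly_kernel(X, Y):
    # Cubic polynomial kernel used by KID: k(x, y) = (x.y / d + 1)^3,
    # where d is the feature dimension.
    d = X.shape[1]
    return (X @ Y.T / d + 1.0) ** 3

def kid(real_feats, fake_feats):
    # Unbiased MMD^2 estimate between the two feature sets.
    m, n = len(real_feats), len(fake_feats)
    k_rr = poly_kernel(real_feats, real_feats)
    k_ff = poly_kernel(fake_feats, fake_feats)
    k_rf = poly_kernel(real_feats, fake_feats)
    # Exclude diagonal terms for the unbiased within-set averages.
    term_rr = (k_rr.sum() - np.trace(k_rr)) / (m * (m - 1))
    term_ff = (k_ff.sum() - np.trace(k_ff)) / (n * (n - 1))
    return term_rr + term_ff - 2.0 * k_rf.mean()

rng = np.random.default_rng(0)
a = rng.normal(size=(100, 8))   # stand-in for real-image features
b = rng.normal(size=(100, 8))   # same distribution -> KID near 0
# A shifted distribution scores much higher than a matching one.
assert kid(a, a + 2.0) > kid(a, b)
```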
- Multimodal generation (text-to-image) using VQGAN+CLIP: vector-quantized GAN + contrastive language-image pretraining for generating images from text.