---
license: mit
datasets:
- Iker/GTAV-Driving-Dataset
base_model:
- Etched/oasis-500m
tags:
- video
---

# AI Generated GTA V

A deep learning project that uses Diffusion Transformers (DiT) to generate Grand Theft Auto V driving footage. This project is based on the [Open-Oasis Project](https://github.com/etched-ai/open-oasis).

Please see the GitHub repo for more info: https://github.com/ikergarcia1996/AI-Generated-GTAV

### dit.safetensors
- Trained on 4x NVIDIA A100 80GB GPUs in bfloat16
- Batch size: 64
- Learning rate: 1e-4 with a constant scheduler and 5% warmup
- 1,610,000 steps
- ddim_noise_steps: 50
- ctx noise increased from 0 to 40 during the first 50% of training, then held at 40 for the remaining steps
- No action conditioning

### dit_action.safetensors
- Continued training of `dit.safetensors` with `action_conditioning` for 210,000 steps
- Trained on 4x NVIDIA A100 80GB GPUs in bfloat16
- Batch size: 64
- Learning rate: 1e-4 with a cosine scheduler decaying to 1e-5 and 5% warmup
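
The schedules listed above can be sketched in a few lines of plain Python. This is only an illustration, not the training code from the repository: the function names are hypothetical, and the linear shape of both the warmup and the ctx noise ramp is an assumption, since the notes above give only the start values, end values, and percentages.

```python
import math


def lr_constant_with_warmup(step, total_steps, base_lr=1e-4, warmup_frac=0.05):
    """Constant learning rate after a warmup (linear warmup is an assumption)."""
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr


def lr_cosine_with_warmup(step, total_steps, base_lr=1e-4, min_lr=1e-5,
                          warmup_frac=0.05):
    """Cosine decay from base_lr to min_lr after a warmup (linear warmup assumed)."""
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def ctx_noise_level(step, total_steps, max_noise=40, ramp_frac=0.5):
    """Ramp ctx noise from 0 to max_noise over the first ramp_frac of training,
    then hold it at max_noise. The linear ramp and integer levels are assumptions."""
    ramp_steps = int(total_steps * ramp_frac)
    if step >= ramp_steps:
        return max_noise
    return int(max_noise * step / max(1, ramp_steps))


if __name__ == "__main__":
    # Step counts taken from the two training runs described above.
    base_run, action_run = 1_610_000, 210_000
    for frac in (0.0, 0.25, 0.5, 1.0):
        step = int(base_run * frac)
        print(step, lr_constant_with_warmup(step, base_run),
              ctx_noise_level(step, base_run))
    for frac in (0.0, 0.05, 0.5, 1.0):
        step = int(action_run * frac)
        print(step, lr_cosine_with_warmup(step, action_run))
```

Under these assumptions, the `dit.safetensors` run holds the learning rate at 1e-4 once warmup ends, while the `dit_action.safetensors` run decays it to 1e-5 by the final step.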