Request
Great LoRA.
The results look super interesting.
I have a request. Could you make the following public?
- Data prep scripts and instructions
- Captioning script
- Training script, GPU and VRAM requirements, and how long training takes to complete
It would be amazing if you could push your scripts or folder to GitHub with some instructions.
I would love to try with maybe 1000 videos of different cinematic shots.
I basically followed all the instructions in the cogvideox-factory GitHub repo for data prep and training: https://github.com/a-r-r-o-w/cogvideox-factory
They also have their own captioning method on there that I used to caption the videos.
I trained this on a 3090 (24 GB VRAM). It takes about 20 hours for 4000 steps.
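As a rough back-of-the-envelope check on those numbers, 20 hours over 4000 steps works out to about 18 seconds per step. The snippet below is only a sketch: it assumes step time stays constant, which may not hold for a different dataset, resolution, or GPU, and the helper name `estimate_hours` is hypothetical.

```python
# Figures reported above: about 20 hours for 4000 steps on a 3090.
REPORTED_HOURS = 20
REPORTED_STEPS = 4000

seconds_per_step = REPORTED_HOURS * 3600 / REPORTED_STEPS


def estimate_hours(steps: int) -> float:
    """Estimate training time in hours.

    Assumes time scales linearly with step count
    (an assumption, not a measurement).
    """
    return steps * seconds_per_step / 3600


print(f"{seconds_per_step:.0f} s/step")
print(f"{estimate_hours(1000):.1f} h for 1000 steps")
```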
@hashu786
Thank you for the response.
I had a few more questions:
How long were the video clips used for training?
@hashu786 What exactly did you use for captioning? CogVLM Llama 3?
Clips were 49 frames each, and yes, I used Llama 3 for captioning.
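For anyone preparing their own clips: one way to cut longer footage into 49-frame pieces is to use fixed, non-overlapping windows. This is only a sketch of the idea and not necessarily how the clips above were made; `clip_ranges` and the 240-frame example are hypothetical.

```python
def clip_ranges(total_frames: int, clip_len: int = 49) -> list[tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive.

    Only full-length clips are kept; leftover frames at the end are dropped.
    """
    return [
        (start, start + clip_len)
        for start in range(0, total_frames - clip_len + 1, clip_len)
    ]


# A hypothetical 240-frame video yields four full clips; the last 44 frames are dropped.
print(clip_ranges(240))
```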
Great work! Can you provide the training data you used? I am having difficulty collecting a dataset. It would be helpful if you could provide your dataset directly.