Request

#1
by GeeveGeorge - opened

Great LoRA.
The results look super interesting.

I have a request. Could you make the following public?

  1. Data prep scripts and instructions
  2. Captioning script
  3. Training script, along with the GPU requirement, VRAM requirement, and how long training takes to complete

It would be amazing if you could push your scripts or folder to GitHub with some instructions.

I would love to try with maybe 1000 videos of different cinematic shots.

I basically followed all the instructions in the cogvideox-factory GitHub repo for data prep and training: https://github.com/a-r-r-o-w/cogvideox-factory

They also have their own captioning method there, which I used to caption the videos.

I trained this on a 3090 with 24 GB of VRAM. It takes about 20 hours for 4000 steps.
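For anyone planning a similar run, the figures above work out to roughly 18 seconds per step. Here is a minimal sketch of that arithmetic in Python. The helper names are hypothetical, and it assumes the per-step time stays constant, which may not hold on a different GPU, dataset, or batch size.

```python
def seconds_per_step(total_hours: float, steps: int) -> float:
    """Average wall-clock seconds per training step."""
    return total_hours * 3600 / steps


def estimated_hours(steps: int, sec_per_step: float) -> float:
    """Projected training time in hours, assuming a constant step time."""
    return steps * sec_per_step / 3600


# Figures reported in this thread: about 20 hours for 4000 steps on a 3090.
rate = seconds_per_step(20, 4000)
print(f"{rate:.1f} s/step")

# Hypothetical shorter run of 1000 steps at the same rate.
print(f"{estimated_hours(1000, rate):.1f} hours")
```

This is only a rough planning estimate, not a benchmark.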

@hashu786 thank you for the response.
I have a few more questions:

How long were the video clips used for training?

@hashu786 what exactly did you use for captioning? CogVLM Llama 3?

Owner

Clips were 49 frames each, and yes, I used Llama 3 for captioning.

Great work! Can you provide the training data you used? I am having difficulty collecting a dataset. It would be helpful if you could provide your dataset directly.
