Request

by GeeveGeorge - opened Oct 24

Discussion

GeeveGeorge

Oct 24

Great Lora.
The results look super interesting.

I have a request:

Could you make the data prep scripts and instructions public?
Captioning script
Training script, GPU requirement, VRAM requirement, and how long training takes to complete

It would be amazing if you could push your scripts or folder to GitHub with some instructions.

I would love to try with maybe 1000 videos of different cinematic shots.

hashu786

Owner Oct 24

I basically followed all the instructions in the cogvideox-factory GitHub repo for data prep and training: https://github.com/a-r-r-o-w/cogvideox-factory

They also have their own captioning method on there that I used to caption the videos.

I trained this on a 3090 with 24GB of VRAM. It takes about 20 hours for 4000 steps.
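As a rough back-of-the-envelope check on those numbers, 20 hours for 4000 steps works out to about 18 seconds per step. The sketch below just does that arithmetic; the function names are hypothetical, and the extrapolation assumes step time stays constant, which may not hold on a different GPU, dataset, or resolution.

```python
def seconds_per_step(total_hours: float, total_steps: int) -> float:
    """Average wall-clock seconds per training step."""
    return total_hours * 3600 / total_steps


def estimated_hours(steps: int, sec_per_step: float) -> float:
    """Projected training time in hours, assuming a constant step time."""
    return steps * sec_per_step / 3600


# Figures reported in this thread: about 20 hours for 4000 steps on a 3090.
rate = seconds_per_step(20, 4000)

# Hypothetical shorter run of 1000 steps at the same rate.
short_run_hours = estimated_hours(1000, rate)

print(f"{rate:.1f} s/step, 1000 steps in about {short_run_hours:.1f} h")
```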

GeeveGeorge

Oct 26

@hashu786 thank you for the response.
I had a few more questions:

How long were the video clips used for training?

GeeveGeorge

Oct 26

@hashu786 what exactly did you use for captioning? CogVLM Llama 3?

hashu786

Owner Nov 8

Clips were 49 frames each, and yes, I used Llama 3 for captioning.
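For anyone preparing their own clips, here is a minimal sketch of how a longer video could be cut into non-overlapping 49-frame clips. It is not the script used for this LoRA. The helper names are hypothetical, and the 8 fps figure is an assumption, since the frame rate of the training clips is not stated in this thread.

```python
def clip_ranges(total_frames: int, clip_len: int = 49) -> list[tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, for full-length clips.

    Leftover frames that do not fill a whole clip are dropped.
    """
    return [
        (start, start + clip_len)
        for start in range(0, total_frames - clip_len + 1, clip_len)
    ]


def clip_duration_seconds(num_frames: int, fps: float) -> float:
    """Length of a clip in seconds at the given frame rate."""
    return num_frames / fps


# A hypothetical 100-frame source video yields two 49-frame clips.
ranges = clip_ranges(100)

# Assumed frame rate of 8 fps (not confirmed in this thread).
duration = clip_duration_seconds(49, 8)

print(ranges, duration)
```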

maze

27 days ago

clips were 49 frames each and yes I used llama 3 for captioning

Great work! Can you provide the training data you used? I am having difficulty collecting a dataset. It would be helpful if you could provide your dataset directly.
