motexture/caT-text-to-video



	---
	license: apache-2.0
	datasets:
	- TempoFunk/webvid-10M
	language:
	- en
	tags:
	- text-to-video
	base_model:
	- ali-vilab/text-to-video-ms-1.7b
	---
	# caT text to video

	A conditionally augmented text-to-video model. It starts from the pre-trained weights of the ModelScope text-to-video model and adds temporal conditioning transformers, which extend generated clips and create smooth transitions between them.
	It also supports prompt interpolation, so the scene can change during clip extensions.
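
To make the idea of prompt interpolation concrete, here is a minimal sketch of one common way to do it: linearly blending two prompt embeddings so that each clip extension is conditioned on a point between the starting and ending prompt. This is an illustration only and is not taken from the repository. The function names (`lerp`, `interpolation_schedule`) and the toy 3-dimensional vectors are hypothetical stand-ins for real text-encoder outputs, and the model's actual interpolation scheme may differ.

```python
from typing import List, Sequence


def lerp(a: Sequence[float], b: Sequence[float], t: float) -> List[float]:
    """Linearly blend two equal-length vectors; t=0 returns a, t=1 returns b."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def interpolation_schedule(
    start: Sequence[float], end: Sequence[float], num_clips: int
) -> List[List[float]]:
    """Return one blended embedding per clip, moving evenly from start to end."""
    if num_clips < 1:
        raise ValueError("num_clips must be at least 1")
    if num_clips == 1:
        return [list(start)]
    return [lerp(start, end, i / (num_clips - 1)) for i in range(num_clips)]


# Toy vectors standing in for the embeddings of two prompts (assumption:
# a real text encoder would produce much larger tensors).
forest = [1.0, 0.0, 0.0]
beach = [0.0, 1.0, 0.0]

# Three clips: the first uses the "forest" prompt, the last uses "beach",
# and the middle clip is conditioned on an even mix of the two.
schedule = interpolation_schedule(forest, beach, num_clips=3)
for step, embedding in enumerate(schedule):
    print(step, embedding)
```

In a setup like this, each clip extension would receive the next embedding in `schedule`, so the scene drifts gradually from one prompt to the other instead of switching abruptly.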

	This model was trained at home as a hobby.

	Do not expect high-quality samples.

	## Installation

	### Clone the Repository

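	```bash
	# Clone the repository and enter it
	git clone https://github.com/motexture/caT-text-to-video.git
	cd caT-text-to-video

	# Create and activate a virtual environment
	python3 -m venv venv
	source venv/bin/activate # On Windows use `venv\Scripts\activate`

	# Install dependencies and launch the interface
	pip install -r requirements.txt
	python3 run.py
	```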

	Once `run.py` is running, open the URL it provides in your browser to use the interface and start generating videos.

	Note: Make sure you are on the latest commit of the repository, because the positional encodings have changed since the initial models were released.

	<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/64a86f7d03835e13f95c3687/qr-NXxvmkquF_mMlx_5P-.mp4"></video>
	<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/64a86f7d03835e13f95c3687/32B1RPHAmieomeXWp2XvC.mp4"></video>
	<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/64a86f7d03835e13f95c3687/40KrBvzMf8DmPO8VvATfC.mp4"></video>
	<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/64a86f7d03835e13f95c3687/SEtFOILcwwNT4M8mXMNWt.mp4"></video>