Duplicated from zmbfeng/text_to_speech_sync_video
Temporary files created during inference or testing are saved here. You can ignore them.