desuAnon committed on
Commit
92c295d
·
verified ·
1 Parent(s): 2d6682d
Files changed (1)
  1. README.md +24 -20
README.md CHANGED
@@ -1,41 +1,45 @@
1
  ---
2
  license: cc0-1.0
3
  ---
 
4
 
5
- On 2024-11-26, temporary access to OpenAI's video generation model Sora (turbo) was granted via this Hugging Face repository:
 
 
6
 
7
- https://huggingface.co/spaces/PR-Puppets/PR-Puppet-Sora
 
 
 
8
 
9
- After a few hours, OpenAI invalidated the API key used by the repo and removed access to the generated videos.
10
- This release consists of 87 videos (~702 MB) and 83 corresponding prompts that were archived, from the publicly displayed generations, in anticipation of that event.
11
- Not all videos generated were able to be archived, due to HF load issues. All videos are of MIME type video/mp4 and have a framerate of 30 FPS.
12
- The generation parameters may be found in the app.py of the original repo [here](https://huggingface.co/spaces/PR-Puppets/PR-Puppet-Sora/blob/main/app.py); an archive of this file may be found [here](https://archive.is/r70Ao).
13
-
14
- The Sora backend that was used for generation was the following:
15
- `https://sora.openai.com/backend/video_gen`
16
-
17
- Please note that user prompts are often "augmented" (changed by some LLM) before generating videos, so the prompts listed may not be the exact one used by the model.
18
- The prompt used for four videos are not known, and these are denoted as [unknown_n].
19
 
20
  ---
21
  ### Archive versions
22
 
 
 
 
 
23
 
24
  **sora-turbo-vids.zip**
25
- This was the original upload, and had some encoding/compatibility issues for some users.
26
- The "short" video filenames are the full original prompts used for the API request for each video.
27
- A "long" prompt limit was based off the filename length limit (around 255 B) for Windows/macOS/Linux.
28
- All short prompts are used as filenames in the "short_prompts" directory.
29
- The ten longer prompts in "full_long_prompts.txt" were used for the videos in the "long_prompts" directory.
30
 
31
-
32
- **videos_only.zip** and **videos_only.7z**
33
  These identical archives (in different compression formats) contain only the original videos, with names such as `video_24.mp4`.
34
  The `video_24` part is the video ID, and the prompt used for a specific video ID is listed in the separate CSV and JSONL files (video_id, prompt).
35
- You should be able to easily view both those files in a text editor, and they are easy to import and process in various programming languages.
36
 
37
  ---
38
 
 
 
39
  ~ desuAnon
40
 
41
  https://rentry.org/desuAnon
 
1
  ---
2
  license: cc0-1.0
3
  ---
4
+ ### Release Information
5
 
6
+ On November 26th, 2024, temporary access to OpenAI's video generation model Sora (turbo) was provided by the HF repo [PR-Puppet-Sora](https://huggingface.co/spaces/PR-Puppets/PR-Puppet-Sora).
7
+ After a few hours, OpenAI revoked the API key used by the repo and removed access to the generated videos.
8
+ In anticipation of that event, the publicly displayed videos and their prompts were archived.
9
 
10
+ This release contains 87 archived videos (~702 MB) and 83 of their prompts, and is dedicated to the public domain (CC0 1.0 Universal).
11
+ The generation parameters may be found in the app.py of the original repo [here](https://huggingface.co/spaces/PR-Puppets/PR-Puppet-Sora/blob/main/app.py). An archive of this script is available [here](https://archive.is/r70Ao).
12
+ User prompts are often "augmented" (changed by some LLM) before generating videos, and this may be true for these videos as well.
13
+ The Sora backend that was used for generation was `https://sora.openai.com/backend/video_gen`
14
 
15
+ Contrary to claims online, the generations were *not* uncensored. User prompts, as well as the generated videos, passed through OpenAI's content moderation normally.
16
+ This is partly the reason why none of the videos in this archive are NSFW, or similar, despite a few *brave attempts* in the prompts.
17
+ It is also incorrect that "Sora leaked", since the model itself (its model parameters) had not been acquired by outsiders.
18
+ The only thing that "leaked" was previewer/beta tester access to Sora video generation, via a single HF repo that kept its API keys secret.
 
 
 
 
 
 
19
 
20
  ---
21
  ### Archive versions
22
 
23
+ All videos are `.mp4` files of varying resolutions, with a framerate of 30 FPS.
24
+ Not all of the generated videos could be archived, due to HF server load issues.
25
+ The prompts used for four videos are not known, and these are denoted as [unknown_n].
26
+ Hugging Face performs *File Security Scans* of uploaded files, and you can click on the icon next to each file to see the result of this.
27
 
28
  **sora-turbo-vids.zip**
29
+ This is the original archive containing both videos and their prompts, and some users experienced encoding/compatibility issues with it.
30
+ Consider using the more recent "separated" uploads if you encounter similar issues.
31
+ The filenames in the `short_prompts` directory are the full prompts used for each video generation request.
32
+ The filenames in the `long_prompts` directory are shortened versions of the long prompts (above 256 chars), and their full versions are found in `full_long_prompts.txt`.
 
33
 
34
+ **videos_only.zip** & **videos_only.7z**
 
35
  These identical archives (in different compression formats) contain only the original videos, with names such as `video_24.mp4`.
36
  The `video_24` part is the video ID, and the prompt used for a specific video ID is listed in the separate CSV and JSONL files (video_id, prompt).
37
+ You may easily view both those files in a text editor, and they are easy to import and process in various programming languages.
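
As a minimal sketch of importing the prompt files in Python, the helpers below build a `video_id` to prompt mapping. This assumes the CSV has a header row with `video_id` and `prompt` columns and that each JSONL line is an object with the same two keys; the README only states the field pair (video_id, prompt), so check the actual files first. The function names are hypothetical and chosen for illustration.

```python
import csv
import json
from pathlib import Path


def load_prompts_csv(path):
    """Return {video_id: prompt} from a CSV file.

    Assumption: the file has a header row naming the columns
    'video_id' and 'prompt'.
    """
    with Path(path).open(newline="", encoding="utf-8") as f:
        return {row["video_id"]: row["prompt"] for row in csv.DictReader(f)}


def load_prompts_jsonl(path):
    """Return {video_id: prompt} from a JSONL file.

    Assumption: each non-empty line is a JSON object with the keys
    'video_id' and 'prompt'.
    """
    prompts = {}
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            prompts[record["video_id"]] = record["prompt"]
    return prompts


def prompt_for(video_file, prompts):
    """Look up the prompt for a file such as 'video_24.mp4'.

    The video ID is the filename without its extension ('video_24').
    Returns None if the ID is not in the mapping.
    """
    return prompts.get(Path(video_file).stem)
```

With a mapping loaded from either file, `prompt_for("video_24.mp4", prompts)` would return the prompt recorded for `video_24`, or `None` if that ID is not listed.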
38
 
39
  ---
40
 
41
+ Even though this is a *dataset* upload, I went with a *model* repo because a) the URL is shorter, and b) the original upload wasn't compatible with the HF dataset viewer.
42
+
43
  ~ desuAnon
44
 
45
  https://rentry.org/desuAnon