chansung committed
Commit a1ca2de · 1 Parent(s): d36279f
LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2023 coding-pot
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
LICENSE-CreativeML ADDED
@@ -0,0 +1,82 @@
1
+ Copyright (c) 2022 Robin Rombach and Patrick Esser and contributors
2
+
3
+ CreativeML Open RAIL-M
4
+ dated August 22, 2022
5
+
6
+ Section I: PREAMBLE
7
+
8
+ Multimodal generative models are being widely adopted and used, and have the potential to transform the way artists, among other individuals, conceive and benefit from AI or ML technologies as a tool for content creation.
9
+
10
+ Notwithstanding the current and potential benefits that these artifacts can bring to society at large, there are also concerns about potential misuses of them, either due to their technical limitations or ethical considerations.
11
+
12
+ In short, this license strives for both the open and responsible downstream use of the accompanying model. When it comes to the open character, we took inspiration from open source permissive licenses regarding the grant of IP rights. Referring to the downstream responsible use, we added use-based restrictions not permitting the use of the Model in very specific scenarios, in order for the licensor to be able to enforce the license in case potential misuses of the Model may occur. At the same time, we strive to promote open and responsible research on generative models for art and content generation.
13
+
14
+ Even though downstream derivative versions of the model could be released under different licensing terms, the latter will always have to include - at minimum - the same use-based restrictions as the ones in the original license (this license). We believe in the intersection between open and responsible AI development; thus, this License aims to strike a balance between both in order to enable responsible open-science in the field of AI.
15
+
16
+ This License governs the use of the model (and its derivatives) and is informed by the model card associated with the model.
17
+
18
+ NOW THEREFORE, You and Licensor agree as follows:
19
+
20
+ 1. Definitions
21
+
22
+ - "License" means the terms and conditions for use, reproduction, and Distribution as defined in this document.
23
+ - "Data" means a collection of information and/or content extracted from the dataset used with the Model, including to train, pretrain, or otherwise evaluate the Model. The Data is not licensed under this License.
24
+ - "Output" means the results of operating a Model as embodied in informational content resulting therefrom.
25
+ - "Model" means any accompanying machine-learning based assemblies (including checkpoints), consisting of learnt weights, parameters (including optimizer states), corresponding to the model architecture as embodied in the Complementary Material, that have been trained or tuned, in whole or in part on the Data, using the Complementary Material.
26
+ - "Derivatives of the Model" means all modifications to the Model, works based on the Model, or any other model which is created or initialized by transfer of patterns of the weights, parameters, activations or output of the Model, to the other model, in order to cause the other model to perform similarly to the Model, including - but not limited to - distillation methods entailing the use of intermediate data representations or methods based on the generation of synthetic data by the Model for training the other model.
27
+ - "Complementary Material" means the accompanying source code and scripts used to define, run, load, benchmark or evaluate the Model, and used to prepare data for training or evaluation, if any. This includes any accompanying documentation, tutorials, examples, etc, if any.
28
+ - "Distribution" means any transmission, reproduction, publication or other sharing of the Model or Derivatives of the Model to a third party, including providing the Model as a hosted service made available by electronic or other remote means - e.g. API-based or web access.
29
+ - "Licensor" means the copyright owner or entity authorized by the copyright owner that is granting the License, including the persons or entities that may have rights in the Model and/or distributing the Model.
30
+ - "You" (or "Your") means an individual or Legal Entity exercising permissions granted by this License and/or making use of the Model for whichever purpose and in any field of use, including usage of the Model in an end-use application - e.g. chatbot, translator, image generator.
31
+ - "Third Parties" means individuals or legal entities that are not under common control with Licensor or You.
32
+ - "Contribution" means any work of authorship, including the original version of the Model and any modifications or additions to that Model or Derivatives of the Model thereof, that is intentionally submitted to Licensor for inclusion in the Model by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Model, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution."
33
+ - "Contributor" means Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Model.
34
+
35
+ Section II: INTELLECTUAL PROPERTY RIGHTS
36
+
37
+ Both copyright and patent grants apply to the Model, Derivatives of the Model and Complementary Material. The Model and Derivatives of the Model are subject to additional terms as described in Section III.
38
+
39
+ 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare, publicly display, publicly perform, sublicense, and distribute the Complementary Material, the Model, and Derivatives of the Model.
40
+ 3. Grant of Patent License. Subject to the terms and conditions of this License and where and as applicable, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this paragraph) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Model and the Complementary Material, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Model to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Model and/or Complementary Material or a Contribution incorporated within the Model and/or Complementary Material constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for the Model and/or Work shall terminate as of the date such litigation is asserted or filed.
41
+
42
+ Section III: CONDITIONS OF USAGE, DISTRIBUTION AND REDISTRIBUTION
43
+
44
+ 4. Distribution and Redistribution. You may host for Third Party remote access purposes (e.g. software-as-a-service), reproduce and distribute copies of the Model or Derivatives of the Model thereof in any medium, with or without modifications, provided that You meet the following conditions:
45
+ Use-based restrictions as referenced in paragraph 5 MUST be included as an enforceable provision by You in any type of legal agreement (e.g. a license) governing the use and/or distribution of the Model or Derivatives of the Model, and You shall give notice to subsequent users You Distribute to, that the Model or Derivatives of the Model are subject to paragraph 5. This provision does not apply to the use of Complementary Material.
46
+ You must give any Third Party recipients of the Model or Derivatives of the Model a copy of this License;
47
+ You must cause any modified files to carry prominent notices stating that You changed the files;
48
+ You must retain all copyright, patent, trademark, and attribution notices excluding those notices that do not pertain to any part of the Model, Derivatives of the Model.
49
+ You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions - respecting paragraph 4.a. - for use, reproduction, or Distribution of Your modifications, or for any such Derivatives of the Model as a whole, provided Your use, reproduction, and Distribution of the Model otherwise complies with the conditions stated in this License.
50
+ 5. Use-based restrictions. The restrictions set forth in Attachment A are considered Use-based restrictions. Therefore You cannot use the Model and the Derivatives of the Model for the specified restricted uses. You may use the Model subject to this License, including only for lawful purposes and in accordance with the License. Use may include creating any content with, finetuning, updating, running, training, evaluating and/or reparametrizing the Model. You shall require all of Your users who use the Model or a Derivative of the Model to comply with the terms of this paragraph (paragraph 5).
51
+ 6. The Output You Generate. Except as set forth herein, Licensor claims no rights in the Output You generate using the Model. You are accountable for the Output you generate and its subsequent uses. No use of the output can contravene any provision as stated in the License.
52
+
53
+ Section IV: OTHER PROVISIONS
54
+
55
+ 7. Updates and Runtime Restrictions. To the maximum extent permitted by law, Licensor reserves the right to restrict (remotely or otherwise) usage of the Model in violation of this License, update the Model through electronic means, or modify the Output of the Model based on updates. You shall undertake reasonable efforts to use the latest version of the Model.
56
+ 8. Trademarks and related. Nothing in this License permits You to make use of Licensors’ trademarks, trade names, logos or to otherwise suggest endorsement or misrepresent the relationship between the parties; and any rights not expressly granted herein are reserved by the Licensors.
57
+ 9. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Model and the Complementary Material (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Model, Derivatives of the Model, and the Complementary Material and assume any risks associated with Your exercise of permissions under this License.
58
+ 10. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Model and the Complementary Material (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.
59
+ 11. Accepting Warranty or Additional Liability. While redistributing the Model, Derivatives of the Model and the Complementary Material thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.
60
+ 12. If any provision of this License is held to be invalid, illegal or unenforceable, the remaining provisions shall be unaffected thereby and remain valid as if such provision had not been set forth herein.
61
+
62
+ END OF TERMS AND CONDITIONS
63
+
64
+
65
+
66
+
67
+ Attachment A
68
+
69
+ Use Restrictions
70
+
71
+ You agree not to use the Model or Derivatives of the Model:
72
+ - In any way that violates any applicable national, federal, state, local or international law or regulation;
73
+ - For the purpose of exploiting, harming or attempting to exploit or harm minors in any way;
74
+ - To generate or disseminate verifiably false information and/or content with the purpose of harming others;
75
+ - To generate or disseminate personal identifiable information that can be used to harm an individual;
76
+ - To defame, disparage or otherwise harass others;
77
+ - For fully automated decision making that adversely impacts an individual’s legal rights or otherwise creates or modifies a binding, enforceable obligation;
78
+ - For any use intended to or which has the effect of discriminating against or harming individuals or groups based on online or offline social behavior or known or predicted personal or personality characteristics;
79
+ - To exploit any of the vulnerabilities of a specific group of persons based on their age, social, physical or mental characteristics, in order to materially distort the behavior of a person pertaining to that group in a manner that causes or is likely to cause that person or another person physical or psychological harm;
80
+ - For any use intended to or which has the effect of discriminating against individuals or groups based on legally protected characteristics or categories;
81
+ - To provide medical advice and medical results interpretation;
82
+ - To generate or disseminate information for the purpose to be used for administration of justice, law enforcement, immigration or asylum processes, such as predicting an individual will commit fraud/crime commitment (e.g. by text profiling, drawing causal relationships between assertions made in documents, indiscriminate and arbitrarily-targeted use).
LICENSE-OFL ADDED
@@ -0,0 +1,93 @@
1
+ Copyright 2023 The Lugrasimo Project Authors (https://github.com/docrepair-fonts/lugrasimo-fonts).
2
+
3
+ This Font Software is licensed under the SIL Open Font License, Version 1.1.
4
+ This license is copied below, and is also available with a FAQ at:
5
+ http://scripts.sil.org/OFL
6
+
7
+
8
+ -----------------------------------------------------------
9
+ SIL OPEN FONT LICENSE Version 1.1 - 26 February 2007
10
+ -----------------------------------------------------------
11
+
12
+ PREAMBLE
13
+ The goals of the Open Font License (OFL) are to stimulate worldwide
14
+ development of collaborative font projects, to support the font creation
15
+ efforts of academic and linguistic communities, and to provide a free and
16
+ open framework in which fonts may be shared and improved in partnership
17
+ with others.
18
+
19
+ The OFL allows the licensed fonts to be used, studied, modified and
20
+ redistributed freely as long as they are not sold by themselves. The
21
+ fonts, including any derivative works, can be bundled, embedded,
22
+ redistributed and/or sold with any software provided that any reserved
23
+ names are not used by derivative works. The fonts and derivatives,
24
+ however, cannot be released under any other type of license. The
25
+ requirement for fonts to remain under this license does not apply
26
+ to any document created using the fonts or their derivatives.
27
+
28
+ DEFINITIONS
29
+ "Font Software" refers to the set of files released by the Copyright
30
+ Holder(s) under this license and clearly marked as such. This may
31
+ include source files, build scripts and documentation.
32
+
33
+ "Reserved Font Name" refers to any names specified as such after the
34
+ copyright statement(s).
35
+
36
+ "Original Version" refers to the collection of Font Software components as
37
+ distributed by the Copyright Holder(s).
38
+
39
+ "Modified Version" refers to any derivative made by adding to, deleting,
40
+ or substituting -- in part or in whole -- any of the components of the
41
+ Original Version, by changing formats or by porting the Font Software to a
42
+ new environment.
43
+
44
+ "Author" refers to any designer, engineer, programmer, technical
45
+ writer or other person who contributed to the Font Software.
46
+
47
+ PERMISSION & CONDITIONS
48
+ Permission is hereby granted, free of charge, to any person obtaining
49
+ a copy of the Font Software, to use, study, copy, merge, embed, modify,
50
+ redistribute, and sell modified and unmodified copies of the Font
51
+ Software, subject to the following conditions:
52
+
53
+ 1) Neither the Font Software nor any of its individual components,
54
+ in Original or Modified Versions, may be sold by itself.
55
+
56
+ 2) Original or Modified Versions of the Font Software may be bundled,
57
+ redistributed and/or sold with any software, provided that each copy
58
+ contains the above copyright notice and this license. These can be
59
+ included either as stand-alone text files, human-readable headers or
60
+ in the appropriate machine-readable metadata fields within text or
61
+ binary files as long as those fields can be easily viewed by the user.
62
+
63
+ 3) No Modified Version of the Font Software may use the Reserved Font
64
+ Name(s) unless explicit written permission is granted by the corresponding
65
+ Copyright Holder. This restriction only applies to the primary font name as
66
+ presented to the users.
67
+
68
+ 4) The name(s) of the Copyright Holder(s) or the Author(s) of the Font
69
+ Software shall not be used to promote, endorse or advertise any
70
+ Modified Version, except to acknowledge the contribution(s) of the
71
+ Copyright Holder(s) and the Author(s) or with their explicit written
72
+ permission.
73
+
74
+ 5) The Font Software, modified or unmodified, in part or in whole,
75
+ must be distributed entirely under this license, and must not be
76
+ distributed under any other license. The requirement for fonts to
77
+ remain under this license does not apply to any document created
78
+ using the Font Software.
79
+
80
+ TERMINATION
81
+ This license becomes null and void if any of the above conditions are
82
+ not met.
83
+
84
+ DISCLAIMER
85
+ THE FONT SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
86
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OF
87
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT
88
+ OF COPYRIGHT, PATENT, TRADEMARK, OR OTHER RIGHT. IN NO EVENT SHALL THE
89
+ COPYRIGHT HOLDER BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
90
+ INCLUDING ANY GENERAL, SPECIAL, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL
91
+ DAMAGES, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
92
+ FROM, OUT OF THE USE OR INABILITY TO USE THE FONT SOFTWARE OR FROM
93
+ OTHER DEALINGS IN THE FONT SOFTWARE.
README.md CHANGED
@@ -1,13 +1,66 @@
- ---
- title: Zero2story
- emoji: 💻
- colorFrom: indigo
- colorTo: green
- sdk: gradio
- sdk_version: 3.46.0
- app_file: app.py
- pinned: false
- license: mit
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Zero2Story
+
+ ![](assets/overview.png)
+
+ Zero2Story is a framework built on top of the [PaLM API](https://developers.generativeai.google), [Stable Diffusion](https://en.wikipedia.org/wiki/Stable_Diffusion), and [MusicGen](https://audiocraft.metademolab.com/musicgen.html) that lets ordinary people create their own stories. The framework consists of three phases: **background setup**, **character setup**, and **interactive story generation**.
+
+ **1. Background setup**: In this phase, users set up the genre, place, and mood of the story. Genre is the key setting that the other choices depend on.
+
+ **2. Character setup**: In this phase, users can set up as many as four characters. For each character, users decide basic information and characteristics such as name, age, MBTI, and personality. An image of each character can also be generated from this information using Stable Diffusion.
+ - PaLM API translates the given character information into a list of keywords that Stable Diffusion can understand effectively.
+ - Then, Stable Diffusion generates images using the keywords as a prompt.
+
+ **3. Interactive story generation**: In this phase, the first few paragraphs are generated solely from the information gathered in the background and character setup phases. Afterwards, users choose a direction from three options generated by PaLM API, and the story continues based on that choice. This cycle of choosing an option and generating more of the story repeats until the user decides to stop.
+ - At each step, users can also generate background images and music that describe the scene using Stable Diffusion and MusicGen.
+ - If users are not satisfied with the generated story, options, image, or music in a turn, they can ask to regenerate them.
+
+ ## Prerequisites
+
+ ### PaLM API key
+
+ This project depends heavily on the [PaLM API](https://developers.generativeai.google). To run it in your own environment, get a [PaLM API key](https://developers.generativeai.google/tutorials/setup) and paste it into a `.palm_api_key.txt` file in the root directory.
+
+ ### Packages
+
+ Make sure the following prerequisites are installed on your development machine:
+ * CUDA Toolkit 11.8 with cuDNN 8 - [Download & Install CUDA Toolkit](https://developer.nvidia.com/cuda-toolkit). Running on a GPU is highly recommended; in a CPU-only environment, it will be very slow.
+ * Poetry - [Download & Install Poetry](https://python-poetry.org/docs/#installation). Poetry is a Python packaging and dependency manager.
+ * SQLite3 v3.37.2 or higher - Required by the project's dependencies.
+ - Ubuntu 22.04 and later
+ ```shell
+ $ sudo apt install libc6 sqlite3 libsqlite3
+ ```
+ - Ubuntu 20.04
+ ```shell
+ $ sudo sh -c 'cat <<EOF >> /etc/apt/sources.list
+ deb http://archive.ubuntu.com/ubuntu/ jammy main
+ deb http://security.ubuntu.com/ubuntu/ jammy-security main
+ EOF'
+ $ sudo apt update
+ $ sudo apt install libc6 sqlite3 libsqlite3
+ ```
+ * FFmpeg (Optional) - Installing FFmpeg enables local video mixing, which produces results more quickly than [other methods](https://huggingface.co/spaces/fffiloni/animated-audio-visualizer).
+ ```shell
+ $ sudo apt install ffmpeg
+ ```
+
+ ## Run
+
+ ```shell
+ $ poetry install
+ $ poetry run python app.py
+ ```
+
+ ## Todo
+
+ - [ ] Exporting of generated stories as PDF
+
+
+ ## Stable Diffusion Model Information
+
+ ### Checkpoints
+ - For character image generation: [CIVIT.AI Model 129896](https://civitai.com/models/129896)
+ - For background image generation: [CIVIT.AI Model 93931](https://civitai.com/models/93931?modelVersionId=148652)
+
+ ### VAEs
+ - For character image generation: [CIVIT.AI Model 23906](https://civitai.com/models/23906)
+ - For background image generation: [CIVIT.AI Model 65728](https://civitai.com/models/65728)
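
Note on the README prerequisites: the README in this commit lists a PaLM API key stored in `.palm_api_key.txt`, SQLite 3.37.2 or higher, and optional FFmpeg. A minimal pre-launch check might look like the sketch below. It is not part of this commit. The function names `sqlite_ok`, `read_api_key`, and `preflight` are hypothetical, and the sketch assumes the SQLite requirement refers to the library that Python's `sqlite3` module is linked against, which can differ from the `sqlite3` command-line tool.

```python
import shutil
import sqlite3
from pathlib import Path

# Minimum SQLite version stated in the README.
MIN_SQLITE = (3, 37, 2)


def sqlite_ok(version_info=None, minimum=MIN_SQLITE):
    """Return True if the given (or linked) SQLite version meets the minimum."""
    if version_info is None:
        # Version of the SQLite library linked into this Python build.
        version_info = sqlite3.sqlite_version_info
    return tuple(version_info) >= tuple(minimum)


def read_api_key(root="."):
    """Read the key from `.palm_api_key.txt` under `root`.

    Returns None when the file is missing or contains only whitespace.
    How the project itself loads the key is not shown in this diff.
    """
    path = Path(root) / ".palm_api_key.txt"
    if not path.is_file():
        return None
    key = path.read_text(encoding="utf-8").strip()
    return key or None


def preflight(root="."):
    """Collect human-readable problems; an empty list means nothing was flagged."""
    problems = []
    if not sqlite_ok():
        problems.append(f"SQLite {sqlite3.sqlite_version} is older than 3.37.2")
    if read_api_key(root) is None:
        problems.append("missing or empty .palm_api_key.txt")
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH (optional)")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("warning:", problem)
```

Running the script from the project root before `poetry run python app.py` would print one warning per unmet prerequisite. The FFmpeg entry is informational only, since the README marks it as optional.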
app.py CHANGED
@@ -1,7 +1,688 @@
  import gradio as gr

- def greet(name):
-     return "Hello " + name + "!!"

- iface = gr.Interface(fn=greet, inputs="text", outputs="text")
- iface.launch()
+ import copy
+ import random
+
  import gradio as gr

+ from constants.css import STYLE
+ from constants.init_values import (
+     genres, places, moods, jobs, ages, mbtis, random_names, personalities, default_character_images, styles
+ )
+ from constants import desc
+
+ from interfaces import (
+     ui, chat_ui, story_gen_ui, view_change_ui
+ )
+ from modules.palmchat import GradioPaLMChatPPManager
+
+ with gr.Blocks(css=STYLE) as demo:
+     chat_mode = gr.State("plot_chat")
+
+     chat_state = gr.State({
+         "ppmanager_type": GradioPaLMChatPPManager(),
+         "plot_chat": GradioPaLMChatPPManager(),
+         "story_chat": GradioPaLMChatPPManager(),
+         "export_chat": GradioPaLMChatPPManager(),
+     })
+
+     cur_cursor = gr.State(0)
+     cursors = gr.State([])
+
+     gallery_images1 = gr.State(default_character_images)
+     gallery_images2 = gr.State(default_character_images)
+     gallery_images3 = gr.State(default_character_images)
+     gallery_images4 = gr.State(default_character_images)
+
+     with gr.Column(visible=True) as pre_phase:
+         gr.Markdown("# 📖 Zero2Story", elem_classes=["markdown-center"])
+         gr.Markdown(desc.pre_phase_description, elem_classes=["markdown-justify"])
+         pre_to_setup_btn = gr.Button("create a custom story", elem_classes=["wrap", "control-button"])
+
+     with gr.Column(visible=False) as background_setup_phase:
+         gr.Markdown("# 🌐 World setup", elem_classes=["markdown-center"])
+         gr.Markdown(desc.background_setup_phase_description, elem_classes=["markdown-justify"])
+         with gr.Row():
+             with gr.Column():
+                 genre_dd = gr.Dropdown(label="genre", choices=genres, value=genres[0], interactive=True, elem_classes=["center-label"])
+             with gr.Column():
+                 place_dd = gr.Dropdown(label="place", choices=places["Middle Ages"], value=places["Middle Ages"][0], allow_custom_value=True, interactive=True, elem_classes=["center-label"])
+             with gr.Column():
+                 mood_dd = gr.Dropdown(label="mood", choices=moods["Middle Ages"], value=moods["Middle Ages"][0], allow_custom_value=True, interactive=True, elem_classes=["center-label"])
+
+         with gr.Row():
+             back_to_pre_btn = gr.Button("← back", elem_classes=["wrap", "control-button"], scale=1)
+             world_setup_confirm_btn = gr.Button("character setup →", elem_classes=["wrap", "control-button"], scale=2)
+
+     with gr.Column(visible=False) as character_setup_phase:
+         gr.Markdown("# 👥 Character setup")
+         gr.Markdown(desc.character_setup_phase_description, elem_classes=["markdown-justify"])
+         with gr.Row():
+             with gr.Column():
+                 gr.Checkbox(label="character include/enable", value=True, interactive=False)
+                 char_gallery1 = gr.Gallery(value=default_character_images, height=256, preview=True)
+
+                 with gr.Row(elem_classes=["no-gap"]):
+                     gr.Markdown("name", elem_classes=["markdown-left"], scale=3)
+                     name_txt1 = gr.Textbox(random_names[0], elem_classes=["no-label"], scale=3)
+                     random_name_btn1 = gr.Button("🗳️", elem_classes=["wrap", "control-button-green", "left-margin"], scale=1)
+
+                 with gr.Row(elem_classes=["no-gap"]):
+                     gr.Markdown("age", elem_classes=["markdown-left"], scale=3)
+                     age_dd1 = gr.Dropdown(label=None, choices=ages, value=ages[0], elem_classes=["no-label"], scale=4)
+
+                 with gr.Row(elem_classes=["no-gap"]):
+                     gr.Markdown("mbti", elem_classes=["markdown-left"], scale=3)
+                     mbti_dd1 = gr.Dropdown(label=None, choices=mbtis, value=mbtis[0], interactive=True, elem_classes=["no-label"], scale=4)
+
+                 with gr.Row(elem_classes=["no-gap"]):
+                     gr.Markdown("nature", elem_classes=["markdown-left"], scale=3)
+                     personality_dd1 = gr.Dropdown(label=None, choices=personalities, value=personalities[0], interactive=True, elem_classes=["no-label"], scale=4)
+
+                 with gr.Row(elem_classes=["no-gap"]):
+                     gr.Markdown("job", elem_classes=["markdown-left"], scale=3)
+                     job_dd1 = gr.Dropdown(label=None, choices=jobs["Middle Ages"], value=jobs["Middle Ages"][0], allow_custom_value=True, interactive=True, elem_classes=["no-label"], scale=4)
+
+                 with gr.Row(elem_classes=["no-gap"], visible=False):
+                     gr.Markdown("style", elem_classes=["markdown-left"], scale=3)
+                     creative_dd1 = gr.Dropdown(choices=styles, value=styles[0], allow_custom_value=True, interactive=True, elem_classes=["no-label"], scale=4)
+
+                 gen_char_btn1 = gr.Button("gen character", elem_classes=["wrap", "control-button-green"])
+
+             with gr.Column():
+                 side_char_enable_ckb1 = gr.Checkbox(label="character include/enable", value=False)
+                 char_gallery2 = gr.Gallery(value=default_character_images, height=256, preview=True)
+
+                 with gr.Row(elem_classes=["no-gap"]):
+                     gr.Markdown("name", elem_classes=["markdown-left"], scale=3)
+                     name_txt2 = gr.Textbox(random_names[1], elem_classes=["no-label"], scale=3)
+                     random_name_btn2 = gr.Button("🗳️", elem_classes=["wrap", "control-button-green", "left-margin"], scale=1)
+
+                 with gr.Row(elem_classes=["no-gap"]):
+                     gr.Markdown("age", elem_classes=["markdown-left"], scale=3)
+                     age_dd2 = gr.Dropdown(label=None, choices=ages, value=ages[1], elem_classes=["no-label"], scale=4)
+
+                 with gr.Row(elem_classes=["no-gap"]):
+                     gr.Markdown("mbti", elem_classes=["markdown-left"], scale=3)
+                     mbti_dd2 = gr.Dropdown(label=None, choices=mbtis, value=mbtis[1], interactive=True, elem_classes=["no-label"], scale=4)
+
+                 with gr.Row(elem_classes=["no-gap"]):
+                     gr.Markdown("nature", elem_classes=["markdown-left"], scale=3)
+                     personality_dd2 = gr.Dropdown(label=None, choices=personalities, value=personalities[1], interactive=True, elem_classes=["no-label"], scale=4)
+
+                 with gr.Row(elem_classes=["no-gap"]):
+                     gr.Markdown("job", elem_classes=["markdown-left"], scale=3)
+                     job_dd2 = gr.Dropdown(label=None, choices=jobs["Middle Ages"], value=jobs["Middle Ages"][1], allow_custom_value=True, interactive=True, elem_classes=["no-label"], scale=4)
+
+                 with gr.Row(elem_classes=["no-gap"], visible=False):
+                     gr.Markdown("style", elem_classes=["markdown-left"], scale=3)
+                     creative_dd2 = gr.Dropdown(choices=styles, value=styles[0], allow_custom_value=True, interactive=True, elem_classes=["no-label"], scale=4)
+
+                 gen_char_btn2 = gr.Button("gen character", elem_classes=["wrap", "control-button-green"])
+
121
+ with gr.Column():
122
+ side_char_enable_ckb2 = gr.Checkbox(label="character include/enable", value=False)
123
+ char_gallery3 = gr.Gallery(value=default_character_images, height=256, preview=True)
124
+
125
+ with gr.Row(elem_classes=["no-gap"]):
126
+ gr.Markdown("name", elem_classes=["markdown-left"], scale=3)
127
+ name_txt3 = gr.Textbox(random_names[2], elem_classes=["no-label"], scale=3)
128
+ random_name_btn3 = gr.Button("🗳️", elem_classes=["wrap", "control-button-green", "left-margin"], scale=1)
129
+
130
+ with gr.Row(elem_classes=["no-gap"]):
131
+ gr.Markdown("age", elem_classes=["markdown-left"], scale=3)
132
+ age_dd3 = gr.Dropdown(label=None, choices=ages, value=ages[2], elem_classes=["no-label"], scale=4)
133
+
134
+ with gr.Row(elem_classes=["no-gap"]):
135
+ gr.Markdown("mbti", elem_classes=["markdown-left"], scale=3)
136
+ mbti_dd3 = gr.Dropdown(label=None, choices=mbtis, value=mbtis[2], interactive=True, elem_classes=["no-label"], scale=4)
137
+
138
+ with gr.Row(elem_classes=["no-gap"]):
139
+ gr.Markdown("nature", elem_classes=["markdown-left"], scale=3)
140
+ personality_dd3 = gr.Dropdown(label=None, choices=personalities, value=personalities[2], interactive=True, elem_classes=["no-label"], scale=4)
141
+
142
+ with gr.Row(elem_classes=["no-gap"]):
143
+ gr.Markdown("job", elem_classes=["markdown-left"], scale=3)
144
+ job_dd3 = gr.Dropdown(label=None, choices=jobs["Middle Ages"], value=jobs["Middle Ages"][2], allow_custom_value=True, interactive=True, elem_classes=["no-label"], scale=4)
145
+
146
+ with gr.Row(elem_classes=["no-gap"], visible=False):
147
+ gr.Markdown("style", elem_classes=["markdown-left"], scale=3)
148
+ creative_dd3 = gr.Dropdown(choices=styles, value=styles[0], allow_custom_value=True, interactive=True, elem_classes=["no-label"], scale=4)
149
+
150
+ gen_char_btn3 = gr.Button("gen character", elem_classes=["wrap", "control-button-green"])
151
+
152
+ with gr.Column():
153
+ side_char_enable_ckb3 = gr.Checkbox(label="character include/enable", value=False)
154
+ char_gallery4 = gr.Gallery(value=default_character_images, height=256, preview=True)
155
+
156
+ with gr.Row(elem_classes=["no-gap"]):
157
+ gr.Markdown("name", elem_classes=["markdown-left"], scale=3)
158
+ name_txt4 = gr.Textbox(random_names[3], elem_classes=["no-label"], scale=3)
159
+ random_name_btn4 = gr.Button("🗳️", elem_classes=["wrap", "control-button-green", "left-margin"], scale=1)
160
+
161
+ with gr.Row(elem_classes=["no-gap"]):
162
+ gr.Markdown("age", elem_classes=["markdown-left"], scale=3)
163
+ age_dd4 = gr.Dropdown(label=None, choices=ages, value=ages[3], elem_classes=["no-label"], scale=4)
164
+
165
+ with gr.Row(elem_classes=["no-gap"]):
166
+ gr.Markdown("mbti", elem_classes=["markdown-left"], scale=3)
167
+ mbti_dd4 = gr.Dropdown(label=None, choices=mbtis, value=mbtis[3], interactive=True, elem_classes=["no-label"], scale=4)
168
+
169
+ with gr.Row(elem_classes=["no-gap"]):
170
+ gr.Markdown("nature", elem_classes=["markdown-left"], scale=3)
171
+ personality_dd4 = gr.Dropdown(label=None, choices=personalities, value=personalities[3], interactive=True, elem_classes=["no-label"], scale=4)
172
+
173
+ with gr.Row(elem_classes=["no-gap"]):
174
+ gr.Markdown("job", elem_classes=["markdown-left"], scale=3)
175
+ job_dd4 = gr.Dropdown(label=None, choices=jobs["Middle Ages"], value=jobs["Middle Ages"][3], allow_custom_value=True, interactive=True, elem_classes=["no-label"], scale=4)
176
+
177
+ with gr.Row(elem_classes=["no-gap"], visible=False):
178
+ gr.Markdown("style", elem_classes=["markdown-left"], scale=3)
179
+ creative_dd4 = gr.Dropdown(choices=styles, value=styles[0], allow_custom_value=True, interactive=True, elem_classes=["no-label"], scale=4)
180
+
181
+ gen_char_btn4 = gr.Button("gen character", elem_classes=["wrap", "control-button-green"])
182
+
183
+ with gr.Row():
184
+ back_to_background_setup_btn = gr.Button("← back", elem_classes=["wrap", "control-button"], scale=1)
185
+ character_setup_confirm_btn = gr.Button("generate first stories →", elem_classes=["wrap", "control-button"], scale=2)
186
+
187
+ gr.Markdown("### 💡 Plot setup", visible=False)
188
+ with gr.Accordion("generate chapter titles and each plot", open=False, visible=False) as plot_setup_section:
189
+ title = gr.Textbox("Title Undetermined Yet", elem_classes=["no-label", "font-big"])
190
+ # plot = gr.Textbox(lines=10, elem_classes=["no-label", "small-big-textarea"])
191
+
192
+ gr.Textbox("Rising action", elem_classes=["no-label"])
193
+ with gr.Row(elem_classes=["left-margin"]):
194
+ chapter1_plot = gr.Textbox(placeholder="The plot of the first chapter will be generated here", lines=3, elem_classes=["no-label"])
195
+
196
+ gr.Textbox("Crisis", elem_classes=["no-label"])
197
+ with gr.Row(elem_classes=["left-margin"]):
198
+ chapter2_plot = gr.Textbox(placeholder="The plot of the second chapter will be generated here", lines=3, elem_classes=["no-label"])
199
+
200
+ gr.Textbox("Climax", elem_classes=["no-label"])
201
+ with gr.Row(elem_classes=["left-margin"]):
202
+ chapter3_plot = gr.Textbox(placeholder="The plot of the third chapter will be generated here", lines=3, elem_classes=["no-label"])
203
+
204
+ gr.Textbox("Falling action", elem_classes=["no-label"])
205
+ with gr.Row(elem_classes=["left-margin"]):
206
+ chapter4_plot = gr.Textbox(placeholder="The plot of the fourth chapter will be generated here", lines=3, elem_classes=["no-label"])
207
+
208
+ gr.Textbox("Denouement", elem_classes=["no-label"])
209
+ with gr.Row(elem_classes=["left-margin"]):
210
+ chapter5_plot = gr.Textbox(placeholder="The plot of the fifth chapter will be generated here", lines=3, elem_classes=["no-label"])
211
+
212
+ with gr.Row():
213
+ plot_gen_temp = gr.Slider(0.0, 2.0, 1.0, step=0.1, label="temperature")
214
+ plot_gen_btn = gr.Button("gen plot", elem_classes=["control-button"])
215
+
216
+ plot_setup_confirm_btn = gr.Button("confirm", elem_classes=["control-button"])
217
+
218
+ with gr.Column(visible=False) as writing_phase:
219
+ gr.Markdown("# ✍🏼 Story writing")
220
+ gr.Markdown(desc.story_generation_phase_description, elem_classes=["markdown-justify"])
221
+
222
+ progress_comp = gr.Textbox(label=None, elem_classes=["no-label"], interactive=False)
223
+
224
+ title_display = gr.Markdown("# Title Undetermined Yet", elem_classes=["markdown-center"], visible=False)
225
+ subtitle_display = gr.Markdown("### Title Undetermined Yet", elem_classes=["markdown-center"], visible=False)
226
+
227
+ with gr.Row():
228
+ image_gen_btn = gr.Button("🏞️ Image", interactive=False, elem_classes=["control-button-green"])
229
+ audio_gen_btn = gr.Button("🔊 Audio", interactive=False, elem_classes=["control-button-green"])
230
+ img_audio_combine_btn = gr.Button("📀 Image + Audio", interactive=False, elem_classes=["control-button-green"])
231
+
232
+ story_image = gr.Image(None, visible=False, type="filepath", interactive=False, elem_classes=["no-label-image-audio"])
233
+ story_audio = gr.Audio(None, visible=False, type="filepath", interactive=False, elem_classes=["no-label-image-audio"])
234
+ story_video = gr.Video(visible=False, interactive=False, elem_classes=["no-label-gallery"])
235
+
236
+ story_progress = gr.Slider(
237
+ 1, 2, 1, step=1, interactive=True,
238
+ label="1/2", visible=False
239
+ )
240
+
241
+ story_content = gr.Textbox(
242
+ "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Integer interdum eleifend tincidunt. Vivamus dapibus, massa ut imperdiet condimentum, quam ipsum vehicula eros, a accumsan nisl metus at nisl. Nullam tortor nibh, vehicula sed tellus at, accumsan efficitur enim. Sed mollis purus vitae nisl ornare volutpat. In vitae tortor nec neque sagittis vehicula. In vestibulum velit eu lorem pulvinar dignissim. Donec eu sapien et sapien cursus pretium elementum eu urna. Proin lacinia ipsum maximus, commodo dui tempus, convallis tortor. Nulla sodales mi libero, nec eleifend eros interdum quis. Pellentesque nulla lectus, scelerisque et consequat vitae, blandit at ante. Sed nec …….",
243
+ lines=12,
244
+ elem_classes=["no-label", "small-big-textarea"]
245
+ )
246
+
247
+ action_types = gr.Radio(
248
+ choices=[
249
+ "continue current phase", "move to the next phase"
250
+ ],
251
+ value="continue current phase",
252
+ interactive=True,
253
+ elem_classes=["no-label-radio"],
254
+ visible=False,
255
+ )
256
+
257
+ with gr.Accordion("regeneration controls", open=False):
258
+ with gr.Row():
259
+ regen_actions_btn = gr.Button("Re-suggest actions", interactive=True, elem_classes=["control-button-green"])
260
+ regen_story_btn = gr.Button("Re-suggest story and actions", interactive=True, elem_classes=["control-button-green"])
261
+
262
+ custom_prompt_txt = gr.Textbox(placeholder="Re-suggest story and actions based on your own custom request", elem_classes=["no-label", "small-big-textarea"])
263
+
264
+ with gr.Row():
265
+ action_btn1 = gr.Button("Action Choice 1", interactive=False, elem_classes=["control-button-green"])
266
+ action_btn2 = gr.Button("Action Choice 2", interactive=False, elem_classes=["control-button-green"])
267
+ action_btn3 = gr.Button("Action Choice 3", interactive=False, elem_classes=["control-button-green"])
268
+
269
+ custom_action_txt = gr.Textbox(placeholder="write your own custom action", elem_classes=["no-label", "small-big-textarea"], scale=3)
270
+
271
+ with gr.Row():
272
+ restart_from_story_generation_btn = gr.Button("← back", elem_classes=["wrap", "control-button"], scale=1)
273
+ story_writing_done_btn = gr.Button("export your story →", elem_classes=["wrap", "control-button"], scale=2)
274
+
275
+ with gr.Column(visible=False) as export_phase:
276
+ gr.Markdown("### 📤 Export output")
277
+ with gr.Accordion("generate chapter titles and each plot", open=False) as export_section:
278
+ gr.Markdown("hello")
279
+
280
+ with gr.Accordion("💬", open=False, elem_id="chat-section") as chat_section:
281
+ with gr.Column(scale=1):
282
+ chatbot = gr.Chatbot(
283
+ [],
284
+ avatar_images=("assets/user.png", "assets/ai.png"),
285
+ elem_id="chatbot",
286
+ elem_classes=["no-label-chatbot"])
287
+ chat_input_txt = gr.Textbox(placeholder="enter...", interactive=True, elem_id="chat-input", elem_classes=["no-label"])
288
+
289
+ with gr.Row(elem_id="chat-buttons"):
290
+ regen_btn = gr.Button("regen", interactive=False, elem_classes=["control-button"])
291
+ clear_btn = gr.Button("clear", elem_classes=["control-button"])
292
+
293
+ pre_to_setup_btn.click(
294
+ view_change_ui.move_to_next_view,
295
+ inputs=None,
296
+ outputs=[pre_phase, background_setup_phase]
297
+ )
298
+
299
+ back_to_pre_btn.click(
300
+ view_change_ui.back_to_previous_view,
301
+ inputs=None,
302
+ outputs=[pre_phase, background_setup_phase]
303
+ )
304
+
305
+ world_setup_confirm_btn.click(
306
+ view_change_ui.move_to_next_view,
307
+ inputs=None,
308
+ outputs=[background_setup_phase, character_setup_phase]
309
+ )
310
+
311
+ back_to_background_setup_btn.click(
312
+ view_change_ui.back_to_previous_view,
313
+ inputs=None,
314
+ outputs=[background_setup_phase, character_setup_phase]
315
+ )
316
+
317
+ restart_from_story_generation_btn.click(
318
+ view_change_ui.back_to_previous_view,
319
+ inputs=None,
320
+ outputs=[pre_phase, writing_phase]
321
+ )
322
+
323
+ character_setup_confirm_btn.click(
324
+ view_change_ui.move_to_next_view,
325
+ inputs=None,
326
+ outputs=[character_setup_phase, writing_phase]
327
+ ).then(
328
+ story_gen_ui.first_story_gen,
329
+ inputs=[
330
+ cursors,
331
+ genre_dd, place_dd, mood_dd,
332
+ name_txt1, age_dd1, mbti_dd1, personality_dd1, job_dd1,
333
+ side_char_enable_ckb1, name_txt2, age_dd2, mbti_dd2, personality_dd2, job_dd2,
334
+ side_char_enable_ckb2, name_txt3, age_dd3, mbti_dd3, personality_dd3, job_dd3,
335
+ side_char_enable_ckb3, name_txt4, age_dd4, mbti_dd4, personality_dd4, job_dd4,
336
+ ],
337
+ outputs=[
338
+ cursors, cur_cursor, story_content, story_progress, image_gen_btn, audio_gen_btn,
339
+ story_image, story_audio, story_video
340
+ ]
341
+ ).then(
342
+ story_gen_ui.actions_gen,
343
+ inputs=[
344
+ cursors,
345
+ genre_dd, place_dd, mood_dd,
346
+ name_txt1, age_dd1, mbti_dd1, personality_dd1, job_dd1,
347
+ side_char_enable_ckb1, name_txt2, age_dd2, mbti_dd2, personality_dd2, job_dd2,
348
+ side_char_enable_ckb2, name_txt3, age_dd3, mbti_dd3, personality_dd3, job_dd3,
349
+ side_char_enable_ckb3, name_txt4, age_dd4, mbti_dd4, personality_dd4, job_dd4,
350
+ ],
351
+ outputs=[
352
+ action_btn1, action_btn2, action_btn3, progress_comp
353
+ ]
354
+ )
355
+
356
+ regen_actions_btn.click(
357
+ story_gen_ui.actions_gen,
358
+ inputs=[
359
+ cursors,
360
+ genre_dd, place_dd, mood_dd,
361
+ name_txt1, age_dd1, mbti_dd1, personality_dd1, job_dd1,
362
+ side_char_enable_ckb1, name_txt2, age_dd2, mbti_dd2, personality_dd2, job_dd2,
363
+ side_char_enable_ckb2, name_txt3, age_dd3, mbti_dd3, personality_dd3, job_dd3,
364
+ side_char_enable_ckb3, name_txt4, age_dd4, mbti_dd4, personality_dd4, job_dd4,
365
+ ],
366
+ outputs=[
367
+ action_btn1, action_btn2, action_btn3, progress_comp
368
+ ]
369
+ )
370
+
371
+ regen_story_btn.click(
372
+ story_gen_ui.update_story_gen,
373
+ inputs=[
374
+ cursors, cur_cursor,
375
+ genre_dd, place_dd, mood_dd,
376
+ name_txt1, age_dd1, mbti_dd1, personality_dd1, job_dd1,
377
+ side_char_enable_ckb1, name_txt2, age_dd2, mbti_dd2, personality_dd2, job_dd2,
378
+ side_char_enable_ckb2, name_txt3, age_dd3, mbti_dd3, personality_dd3, job_dd3,
379
+ side_char_enable_ckb3, name_txt4, age_dd4, mbti_dd4, personality_dd4, job_dd4,
380
+ ],
381
+ outputs=[
382
+ cursors, cur_cursor, story_content, story_progress, image_gen_btn, audio_gen_btn
383
+ ]
384
+ ).then(
385
+ story_gen_ui.actions_gen,
386
+ inputs=[
387
+ cursors,
388
+ genre_dd, place_dd, mood_dd,
389
+ name_txt1, age_dd1, mbti_dd1, personality_dd1, job_dd1,
390
+ side_char_enable_ckb1, name_txt2, age_dd2, mbti_dd2, personality_dd2, job_dd2,
391
+ side_char_enable_ckb2, name_txt3, age_dd3, mbti_dd3, personality_dd3, job_dd3,
392
+ side_char_enable_ckb3, name_txt4, age_dd4, mbti_dd4, personality_dd4, job_dd4,
393
+ ],
394
+ outputs=[
395
+ action_btn1, action_btn2, action_btn3, progress_comp
396
+ ]
397
+ )
398
+
399
+ #### Setups
400
+
401
+ genre_dd.select(
402
+ ui.update_on_age,
403
+ outputs=[place_dd, mood_dd, job_dd1, job_dd2, job_dd3, job_dd4]
404
+ )
405
+
406
+ gen_char_btn1.click(
407
+ ui.gen_character_image,
408
+ inputs=[
409
+ gallery_images1, name_txt1, age_dd1, mbti_dd1, personality_dd1, job_dd1, genre_dd, place_dd, mood_dd, creative_dd1],
410
+ outputs=[char_gallery1, gallery_images1]
411
+ )
412
+
413
+ gen_char_btn2.click(
414
+ ui.gen_character_image,
415
+ inputs=[gallery_images2, name_txt2, age_dd2, mbti_dd2, personality_dd2, job_dd2, genre_dd, place_dd, mood_dd, creative_dd2],
416
+ outputs=[char_gallery2, gallery_images2]
417
+ )
418
+
419
+ gen_char_btn3.click(
420
+ ui.gen_character_image,
421
+ inputs=[gallery_images3, name_txt3, age_dd3, mbti_dd3, personality_dd3, job_dd3, genre_dd, place_dd, mood_dd, creative_dd3],
422
+ outputs=[char_gallery3, gallery_images3]
423
+ )
424
+
425
+ gen_char_btn4.click(
426
+ ui.gen_character_image,
427
+ inputs=[gallery_images4, name_txt4, age_dd4, mbti_dd4, personality_dd4, job_dd4, genre_dd, place_dd, mood_dd, creative_dd4],
428
+ outputs=[char_gallery4, gallery_images4]
429
+ )
430
+
431
+ random_name_btn1.click(
432
+ ui.get_random_name,
433
+ inputs=[name_txt1, name_txt2, name_txt3, name_txt4],
434
+ outputs=[name_txt1],
435
+ )
436
+
437
+ random_name_btn2.click(
438
+ ui.get_random_name,
439
+ inputs=[name_txt2, name_txt1, name_txt3, name_txt4],
440
+ outputs=[name_txt2],
441
+ )
442
+
443
+ random_name_btn3.click(
444
+ ui.get_random_name,
445
+ inputs=[name_txt3, name_txt1, name_txt2, name_txt4],
446
+ outputs=[name_txt3],
447
+ )
448
+
449
+ random_name_btn4.click(
450
+ ui.get_random_name,
451
+ inputs=[name_txt4, name_txt1, name_txt2, name_txt3],
452
+ outputs=[name_txt4],
453
+ )
454
+
455
+ ### Story generation
456
+ story_content.input(
457
+ story_gen_ui.update_story_content,
458
+ inputs=[story_content, cursors, cur_cursor],
459
+ outputs=[cursors],
460
+ )
461
+
462
+ image_gen_btn.click(
463
+ story_gen_ui.image_gen,
464
+ inputs=[
465
+ genre_dd, place_dd, mood_dd, title, story_content, cursors, cur_cursor, story_audio
466
+ ],
467
+ outputs=[
468
+ story_image, img_audio_combine_btn, cursors, progress_comp,
469
+ ]
470
+ )
471
+
472
+ audio_gen_btn.click(
473
+ story_gen_ui.audio_gen,
474
+ inputs=[
475
+ genre_dd, place_dd, mood_dd, title, story_content, cursors, cur_cursor, story_image
476
+ ],
477
+ outputs=[story_audio, img_audio_combine_btn, cursors, progress_comp]
478
+ )
479
+
480
+ img_audio_combine_btn.click(
481
+ story_gen_ui.video_gen,
482
+ inputs=[
483
+ story_image, story_audio, story_content, cursors, cur_cursor
484
+ ],
485
+ outputs=[
486
+ story_image, story_audio, story_video, cursors, progress_comp
487
+ ],
488
+ )
489
+
490
+ story_progress.input(
491
+ story_gen_ui.move_story_cursor,
492
+ inputs=[
493
+ story_progress, cursors
494
+ ],
495
+ outputs=[
496
+ cur_cursor,
497
+ story_progress,
498
+ story_content,
499
+ story_image, story_audio, story_video,
500
+ action_btn1, action_btn2, action_btn3,
501
+ ]
502
+ )
503
+
504
+ action_btn1.click(
505
+ lambda: (gr.update(interactive=False), gr.update(interactive=False), gr.update(interactive=False)),
506
+ inputs=None,
507
+ outputs=[
508
+ image_gen_btn, audio_gen_btn, img_audio_combine_btn
509
+ ]
510
+ ).then(
511
+ story_gen_ui.next_story_gen,
512
+ inputs=[
513
+ cursors,
514
+ action_btn1,
515
+ genre_dd, place_dd, mood_dd,
516
+ name_txt1, age_dd1, mbti_dd1, personality_dd1, job_dd1,
517
+ side_char_enable_ckb1, name_txt2, age_dd2, mbti_dd2, personality_dd2, job_dd2,
518
+ side_char_enable_ckb2, name_txt3, age_dd3, mbti_dd3, personality_dd3, job_dd3,
519
+ side_char_enable_ckb3, name_txt4, age_dd4, mbti_dd4, personality_dd4, job_dd4,
520
+ ],
521
+ outputs=[
522
+ cursors, cur_cursor,
523
+ story_content, story_progress,
524
+ image_gen_btn, audio_gen_btn,
525
+ story_image, story_audio, story_video
526
+ ]
527
+ ).then(
528
+ story_gen_ui.actions_gen,
529
+ inputs=[
530
+ cursors,
531
+ genre_dd, place_dd, mood_dd,
532
+ name_txt1, age_dd1, mbti_dd1, personality_dd1, job_dd1,
533
+ side_char_enable_ckb1, name_txt2, age_dd2, mbti_dd2, personality_dd2, job_dd2,
534
+ side_char_enable_ckb2, name_txt3, age_dd3, mbti_dd3, personality_dd3, job_dd3,
535
+ side_char_enable_ckb3, name_txt4, age_dd4, mbti_dd4, personality_dd4, job_dd4,
536
+ ],
537
+ outputs=[
538
+ action_btn1, action_btn2, action_btn3, progress_comp
539
+ ]
540
+ )
541
+
542
+ action_btn2.click(
543
+ lambda: (gr.update(interactive=False), gr.update(interactive=False), gr.update(interactive=False)),
544
+ inputs=None,
545
+ outputs=[
546
+ image_gen_btn, audio_gen_btn, img_audio_combine_btn
547
+ ]
548
+ ).then(
549
+ story_gen_ui.next_story_gen,
550
+ inputs=[
551
+ cursors,
552
+ action_btn2,
553
+ genre_dd, place_dd, mood_dd,
554
+ name_txt1, age_dd1, mbti_dd1, personality_dd1, job_dd1,
555
+ side_char_enable_ckb1, name_txt2, age_dd2, mbti_dd2, personality_dd2, job_dd2,
556
+ side_char_enable_ckb2, name_txt3, age_dd3, mbti_dd3, personality_dd3, job_dd3,
557
+ side_char_enable_ckb3, name_txt4, age_dd4, mbti_dd4, personality_dd4, job_dd4,
558
+ ],
559
+ outputs=[
560
+ cursors, cur_cursor,
561
+ story_content, story_progress,
562
+ image_gen_btn, audio_gen_btn,
563
+ story_image, story_audio, story_video
564
+ ]
565
+ ).then(
566
+ story_gen_ui.actions_gen,
567
+ inputs=[
568
+ cursors,
569
+ genre_dd, place_dd, mood_dd,
570
+ name_txt1, age_dd1, mbti_dd1, personality_dd1, job_dd1,
571
+ side_char_enable_ckb1, name_txt2, age_dd2, mbti_dd2, personality_dd2, job_dd2,
572
+ side_char_enable_ckb2, name_txt3, age_dd3, mbti_dd3, personality_dd3, job_dd3,
573
+ side_char_enable_ckb3, name_txt4, age_dd4, mbti_dd4, personality_dd4, job_dd4,
574
+ ],
575
+ outputs=[
576
+ action_btn1, action_btn2, action_btn3, progress_comp
577
+ ]
578
+ )
579
+
580
+ action_btn3.click(
581
+ lambda: (gr.update(interactive=False), gr.update(interactive=False), gr.update(interactive=False)),
582
+ inputs=None,
583
+ outputs=[
584
+ image_gen_btn, audio_gen_btn, img_audio_combine_btn
585
+ ]
586
+ ).then(
587
+ story_gen_ui.next_story_gen,
588
+ inputs=[
589
+ cursors,
590
+ action_btn3,
591
+ genre_dd, place_dd, mood_dd,
592
+ name_txt1, age_dd1, mbti_dd1, personality_dd1, job_dd1,
593
+ side_char_enable_ckb1, name_txt2, age_dd2, mbti_dd2, personality_dd2, job_dd2,
594
+ side_char_enable_ckb2, name_txt3, age_dd3, mbti_dd3, personality_dd3, job_dd3,
595
+ side_char_enable_ckb3, name_txt4, age_dd4, mbti_dd4, personality_dd4, job_dd4,
596
+ ],
597
+ outputs=[
598
+ cursors, cur_cursor,
599
+ story_content, story_progress,
600
+ image_gen_btn, audio_gen_btn,
601
+ story_image, story_audio, story_video
602
+ ]
603
+ ).then(
604
+ story_gen_ui.actions_gen,
605
+ inputs=[
606
+ cursors,
607
+ genre_dd, place_dd, mood_dd,
608
+ name_txt1, age_dd1, mbti_dd1, personality_dd1, job_dd1,
609
+ side_char_enable_ckb1, name_txt2, age_dd2, mbti_dd2, personality_dd2, job_dd2,
610
+ side_char_enable_ckb2, name_txt3, age_dd3, mbti_dd3, personality_dd3, job_dd3,
611
+ side_char_enable_ckb3, name_txt4, age_dd4, mbti_dd4, personality_dd4, job_dd4,
612
+ ],
613
+ outputs=[
614
+ action_btn1, action_btn2, action_btn3, progress_comp
615
+ ]
616
+ )
617
+
618
+ custom_action_txt.submit(
619
+ lambda: (gr.update(interactive=False), gr.update(interactive=False), gr.update(interactive=False)),
620
+ inputs=None,
621
+ outputs=[
622
+ image_gen_btn, audio_gen_btn, img_audio_combine_btn
623
+ ]
624
+ ).then(
625
+ story_gen_ui.next_story_gen,
626
+ inputs=[
627
+ cursors,
628
+ custom_action_txt,
629
+ genre_dd, place_dd, mood_dd,
630
+ name_txt1, age_dd1, mbti_dd1, personality_dd1, job_dd1,
631
+ side_char_enable_ckb1, name_txt2, age_dd2, mbti_dd2, personality_dd2, job_dd2,
632
+ side_char_enable_ckb2, name_txt3, age_dd3, mbti_dd3, personality_dd3, job_dd3,
633
+ side_char_enable_ckb3, name_txt4, age_dd4, mbti_dd4, personality_dd4, job_dd4,
634
+ ],
635
+ outputs=[
636
+ cursors, cur_cursor,
637
+ story_content, story_progress,
638
+ image_gen_btn, audio_gen_btn,
639
+ story_image, story_audio, story_video
640
+ ]
641
+ ).then(
642
+ story_gen_ui.actions_gen,
643
+ inputs=[
644
+ cursors,
645
+ genre_dd, place_dd, mood_dd,
646
+ name_txt1, age_dd1, mbti_dd1, personality_dd1, job_dd1,
647
+ side_char_enable_ckb1, name_txt2, age_dd2, mbti_dd2, personality_dd2, job_dd2,
648
+ side_char_enable_ckb2, name_txt3, age_dd3, mbti_dd3, personality_dd3, job_dd3,
649
+ side_char_enable_ckb3, name_txt4, age_dd4, mbti_dd4, personality_dd4, job_dd4,
650
+ ],
651
+ outputs=[
652
+ action_btn1, action_btn2, action_btn3, progress_comp
653
+ ]
654
+ )
655
+
656
+ ### Chatbot
657
+
658
+ # chat_input_txt.submit(
659
+ # chat_ui.chat,
660
+ # inputs=[
661
+ # chat_input_txt, chat_mode, chat_state,
662
+ # genre_dd, place_dd, mood_dd,
663
+ # name_txt1, age_dd1, mbti_dd1, personality_dd1, job_dd1,
664
+ # name_txt2, age_dd2, mbti_dd2, personality_dd2, job_dd2,
665
+ # name_txt3, age_dd3, mbti_dd3, personality_dd3, job_dd3,
666
+ # name_txt4, age_dd4, mbti_dd4, personality_dd4, job_dd4,
667
+ # chapter1_title, chapter2_title, chapter3_title, chapter4_title,
668
+ # chapter1_plot, chapter2_plot, chapter3_plot, chapter4_plot
669
+ # ],
670
+ # outputs=[chat_input_txt, chat_state, chatbot, regen_btn]
671
+ # )
672
+
673
+ regen_btn.click(
674
+ chat_ui.rollback_last_ui,
675
+ inputs=[chatbot], outputs=[chatbot]
676
+ ).then(
677
+ chat_ui.chat_regen,
678
+ inputs=[chat_mode, chat_state],
679
+ outputs=[chat_state, chatbot]
680
+ )
681
+
682
+ clear_btn.click(
683
+ chat_ui.chat_reset,
684
+ inputs=[chat_mode, chat_state],
685
+ outputs=[chat_input_txt, chat_state, chatbot, regen_btn]
686
+ )
687
 
688
+ demo.queue().launch(share=True)
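Note on the event wiring in this file: `first_story_gen`, `actions_gen`, `update_story_gen`, and `next_story_gen` are each bound with the same long run of setup and character components, retyped in every `.click(...)` and `.then(...)` call. A minimal sketch of one way to build that list once follows. `story_input_names`, `story_inputs`, and the `components` mapping are hypothetical names, not part of this commit, and plain strings stand in for Gradio components so the sketch runs on its own.

```python
from typing import Any, List, Mapping, Sequence


def story_input_names(leading: Sequence[str] = ("cursors",)) -> List[str]:
    """Return component names in the order the story handlers receive them."""
    names = list(leading) + ["genre_dd", "place_dd", "mood_dd"]
    for i in range(1, 5):
        if i > 1:
            # Side characters 2..4 are preceded by side_char_enable_ckb1..3.
            names.append(f"side_char_enable_ckb{i - 1}")
        names += [f"name_txt{i}", f"age_dd{i}", f"mbti_dd{i}",
                  f"personality_dd{i}", f"job_dd{i}"]
    return names


def story_inputs(components: Mapping[str, Any],
                 leading: Sequence[str] = ("cursors",)) -> List[Any]:
    """Look up each name in `components` (hypothetical name -> component map)."""
    return [components[name] for name in story_input_names(leading)]


if __name__ == "__main__":
    # Stand-in mapping: in the app this would hold the real gr.* objects.
    fake = {name: name for name in story_input_names(("cursors", "action_btn1"))}
    print(len(story_inputs(fake)))                              # 27
    print(len(story_inputs(fake, ("cursors", "action_btn1"))))  # 28
```

Under these assumptions, a call such as `regen_actions_btn.click(story_gen_ui.actions_gen, inputs=story_inputs(components), ...)` could replace each hand-written list, with `leading=("cursors", "action_btn1")` covering the `next_story_gen` bindings.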
 
assets/.gitattributes ADDED
@@ -0,0 +1,7 @@
1
+ image.png filter=lfs diff=lfs merge=lfs -text
2
+ nsfw_warning.png filter=lfs diff=lfs merge=lfs -text
3
+ nsfw_warning_wide.png filter=lfs diff=lfs merge=lfs -text
4
+ overview.png filter=lfs diff=lfs merge=lfs -text
5
+ user.png filter=lfs diff=lfs merge=lfs -text
6
+ ai.png filter=lfs diff=lfs merge=lfs -text
7
+ background.png filter=lfs diff=lfs merge=lfs -text
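The seven entries above route each listed PNG through Git LFS. As a rough illustration of how such a file can be checked, the sketch below lists the patterns carrying `filter=lfs`. `lfs_patterns` is a hypothetical helper, and the parsing is simplified: it splits on whitespace and does not handle quoted patterns.

```python
from typing import List


def lfs_patterns(gitattributes_text: str) -> List[str]:
    """Return the path patterns whose attributes include filter=lfs."""
    patterns = []
    for line in gitattributes_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pattern, *attrs = line.split()
        if "filter=lfs" in attrs:
            patterns.append(pattern)
    return patterns


if __name__ == "__main__":
    sample = "\n".join([
        "image.png filter=lfs diff=lfs merge=lfs -text",
        "user.png filter=lfs diff=lfs merge=lfs -text",
        "# fonts are stored as regular blobs",
    ])
    print(lfs_patterns(sample))
```

Files not matched by any pattern, such as `Lugrasimo-Regular.ttf` in this commit, are stored as ordinary Git blobs.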
assets/Lugrasimo-Regular.ttf ADDED
Binary file (32.5 kB).
 
assets/ai.png ADDED

Git LFS Details

  • SHA256: c2ecc5c89b6b211ddb5e74aa4f786bb443ba4ac75a920169ea13e13ac98a2a42
  • Pointer size: 130 Bytes
  • Size of remote file: 25.8 kB
assets/background.png ADDED

Git LFS Details

  • SHA256: d02c95b8346de8a198fed9242adc36368d1d812dfd7ee5eea5606e088218fe13
  • Pointer size: 131 Bytes
  • Size of remote file: 998 kB
assets/image.png ADDED

Git LFS Details

  • SHA256: 0ea20c8cb475714feef87ae8f6f24f475646cfb1ab921aa3e69220a3966bea63
  • Pointer size: 131 Bytes
  • Size of remote file: 399 kB
assets/nsfw_warning.png ADDED

Git LFS Details

  • SHA256: 7367d878fc32346293d3b6f50ced1481aeb29a438a318a3821ac26ae99eca284
  • Pointer size: 131 Bytes
  • Size of remote file: 455 kB
assets/nsfw_warning_wide.png ADDED

Git LFS Details

  • SHA256: f9aecfbca691b816bfa3ee0e80e38e3b6157ed9d513b821bb40d196833ec1be1
  • Pointer size: 131 Bytes
  • Size of remote file: 438 kB
assets/overview.png ADDED

Git LFS Details

  • SHA256: 15d5acbd4593eee43653544e2365cc81ac6ada2ab250c9d30660a95bee70687d
  • Pointer size: 132 Bytes
  • Size of remote file: 2.67 MB
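A note on `assets/palm_prompts.toml`, the next file in this commit: its `[image_gen.character]` prompt writes the JSON template with doubled braces (`{{` and `}}`) and asks the model to reply with only a JSON object holding `primary_sentence` and `descriptors`. The doubled braces suggest the prompt is rendered with Python's `str.format`, though the commit excerpt does not confirm this. The sketch below assumes it, and shows how a reply could be parsed and flattened into an image prompt. `TEMPLATE`, the `{name}` placeholder, `render`, `parse_reply`, and `to_image_prompt` are all hypothetical.

```python
import json
from typing import List, Tuple

# Hypothetical, shortened stand-in for the TOML prompt; {name} is an assumed placeholder.
TEMPLATE = (
    "The output template is as follows: "
    '{{"primary_sentence":"...","descriptors":["..."]}}. '
    "The character's name is {name}."
)


def render(name: str) -> str:
    """str.format collapses '{{' and '}}' to single braces and fills {name}."""
    return TEMPLATE.format(name=name)


def parse_reply(reply: str) -> Tuple[str, List[str]]:
    """Extract the two fields the prompt asks the model to return."""
    data = json.loads(reply)
    sentence = data["primary_sentence"].strip()
    descriptors = [d.strip() for d in data["descriptors"] if d.strip()]
    return sentence, descriptors


def to_image_prompt(sentence: str, descriptors: List[str]) -> str:
    """Join the sentence and descriptors into one comma-separated prompt."""
    return ", ".join([sentence, *descriptors])


if __name__ == "__main__":
    print(render("Liam"))
    reply = '{"primary_sentence":"An elegant dancer","descriptors":["1girl","teen"]}'
    print(to_image_prompt(*parse_reply(reply)))
```

`json.loads` raises `json.JSONDecodeError` when the model adds text around the object, so a caller would likely want to catch that and retry.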
assets/palm_prompts.toml ADDED
@@ -0,0 +1,154 @@
1
+ [image_gen]
2
+ neg_prompt="nsfw, worst quality, low quality, lowres, bad anatomy, bad hands, text, watermark, signature, error, missing fingers, extra digit, fewer digits, cropped, worst quality, normal quality, blurry, username, extra limbs, twins, boring, jpeg artifacts"
3
+
4
+ [image_gen.character]
5
+ gen_prompt = """Based on my brief descriptions of the character, suggest a "primary descriptive sentence" and "concise descriptors" to visualize them. Ensure you consider elements like the character's gender, age, appearance, occupation, clothing, posture, facial expression, mood, among others.
6
+ Once complete, please output only a single "primary descriptive sentence" and the "concise descriptors" in a syntactically valid JSON format.
7
+ The output template is as follows: {{"primary_sentence":"primary descriptive sentence","descriptors":["concise descriptor 1","concise descriptor 2","concise descriptor 3"]}}.
8
+ To enhance the quality of your character's description or expression, you might consider drawing from the following categories:
9
+ - Emotions and Expressions: "ecstatic", "melancholic", "furious", "startled", "bewildered", "pensive", "overjoyed", "crushed", "elated", "panicked", "satisfied", "cynical", "apathetic", "delighted", "terrified", "desperate", "triumphant", "mortified", "envious", "appreciative", "blissful", "heartbroken", "livid", "astounded", "baffled", "smiling", "frowning", "grinning", "crying", "pouting", "glaring", "blinking", "winking", "smirking", "whistling".
10
+ - Physical Features: "upper body", "very long hair", "looking at viewer", "looking to the side", "looking at another", "thick lips", "skin spots", "acnes", "skin blemishes", "age spot", "perfect eyes", "detailed eyes", "realistic eyes", "dynamic standing", "beautiful face", "necklace", "high detailed skin", "hair ornament", "blush", "shiny skin", "long sleeves", "cleavage", "rubber suit", "slim", "plump", "muscular", "pale skin", "tan skin", "dark skin", "blonde hair", "brunette hair", "black hair", "blue eyes", "green eyes", "brown eyes", "curly hair", "short hair", "wavy hair".
11
+ - Visual Enhancements: "masterpiece", "cinematic lighting", "detailed lighting", "tyndall effect", "soft lighting", "volumetric lighting", "close up", "wide shot", "glossy", "beautiful lighting", "warm lighting", "extreme", "ultimate", "best", "supreme", "ultra", "intense", "powerful", "exceptional", "remarkable", "strong", "vigorous", "dynamic angle", "front view person", "bangs", "waist up", "bokeh".
12
+ - Age and Gender: "1boy", "1man", "1male", "1girl", "1woman", "1female", "teen", "teenage", "twenties", "thirties", "forties", "fifties", "middle-age".
13
+ Do note that this list isn't exhaustive, and you're encouraged to suggest similar terms not included here.
14
+ Exclude words from the suggestion that are redundant or have conflicting meanings.
15
+ In particular, exclude words that conflict with the meaning of "primary_sentence".
16
+ Do not output anything other than JSON values.
17
+ Do not provide any additional explanation of the following.
18
+ Only JSON is allowed.
19
+ ===
20
+ Here are some examples.
21
+ Q:
22
+ The character's name is Liam, their job is as the Secret Agent, and they are in their 50s. And the keywords that help in associating with the character are "Thriller, Underground Warehouse, Darkness, ESTP, Ambitious, Generous".
23
+ Print out no more than 45 words in syntactically valid JSON format.
24
+ A:
25
+ {{"primary_sentence":"Middle-aged man pointing a gun in an underground warehouse","descriptors":["1man","solo","masterpiece","best quality","upper body","black suit","pistol in hand","dramatic lighting","muscular physique","intense brown eyes","raven-black hair","stylish cut","determined gaze","looking at viewer","stealthy demeanor","cunning strategist","advanced techwear","sleek","night operative","shadowy figure","night atmosphere","mysterious aura","highly detailed","film grain","detailed eyes and face"]}}
26
+
27
+ Q:
28
+ The character's name is Catherine, their job is as the Traveler, and they are in their 10s. And the keywords that help in associating with the character are "Romance, Starlit Bridge, Dreamy, ENTJ, Ambitious".
29
+ Print out no more than 45 words in syntactically valid JSON format.
30
+ A:
31
+ {{"primary_sentence":"A dreamy teenage girl standing on a starlit bridge with romantic ambitions","descriptors":["1girl","solo","masterpiece","best quality","upper body","flowing skirt","sun hat","bright-eyed","map in hand","ethereal beauty","wanderlust","scarf","whimsical","graceful poise","celestial allure","close-up","warm soft lighting","luminescent glow","gentle aura","mystic charm","smirk","dreamy landscape","poetic demeanor","cinematic lighting","extremely detailed","film grain","detailed eyes and face"]}}
32
+
33
+ Q:
34
+ The character's name is Claire, their job is as the Technological Advancement, and they are in their 20s. And the keywords that help in associating with the character are "Science Fiction, Space Station, INFP, Ambitious, Generous".
35
+ Print out no more than 45 words in syntactically valid JSON format.
36
+ A:
37
+ {{"primary_sentence":"A young ambitious woman tech expert aboard a futuristic space station","descriptors":["1girl","solo","masterpiece","best quality","upper body","sleek silver jumpsuit","futuristic heels","contemplative","editorial portrait","dynamic angle","sci-fi","techno-savvy","sharp focus","bokeh","beautiful lighting","intricate circuitry","robotic grace","rich colors","vivid contrasts","dramatic lighting","futuristic flair","avant-garde","high-tech allure","innovative mind","mechanical sophistication","film grain","detailed eyes and face"]}}
38
+
39
+ Q:
40
+ The character's name is Sophie, their job is as a Ballet Dancer, and they are in their 10s. And the keywords that help in associating with the character are "Grace, Dance Studio, Elegance, ISFJ, Gentle, Passionate"
41
+ Print out no more than 45 words in syntactically valid JSON format.
42
+ A:
43
+ {{"primary_sentence":"An elegant dancer poses gracefully in a mirrored studio","descriptors":["1girl","teen","solo","masterpiece","best quality","upper body","beautiful face","shiny skin","wavy hair","ballet attire","tiptoe stance","flowing skirt","focused gaze","soft ambiance","soft lighting","film grain","detailed eyes and face"]}}
44
+ ===
45
+ This is my request.
46
+ Q:
47
+ {input}
48
+ A:
49
+ """
50
+ query = """
51
+ The character's name is {character_name}, their job is as the {job}, and they are in their {age}. And the keywords that help in associating with the character are "{keywords}".
52
+ Print out no more than 45 words in syntactically valid JSON format.
53
+ """
54
+
55
+ [image_gen.background]
56
+ gen_prompt = """Based on my brief descriptions of the scene, suggest a "primary descriptive sentence" and "concise descriptors" to visualize it. Ensure you consider elements like the setting's time of day, atmosphere, prominent objects, mood, location, natural phenomena, architecture, among others.
57
+ Once complete, please output only a single "primary descriptive sentence" and the "concise descriptors" in a syntactically valid JSON format.
58
+ The output template is as follows: {{"primary_sentence":"primary descriptive sentence","descriptors":["concise descriptor 1","concise descriptor 2","concise descriptor 3"]}}.
59
+ To enhance the quality of your scene's description or expression, you might consider drawing from the following categories:
60
+ - Atmosphere and Time: "dawn", "dusk", "midday", "midnight", "sunset", "sunrise", "foggy", "misty", "stormy", "calm", "clear night", "starlit", "moonlit", "golden hour".
61
+ - Natural Phenomena: "rainbow", "thunderstorm", "snowfall", "aurora borealis", "shooting star", "rain shower", "windy", "sunny".
62
+ - Location and Architecture: "urban", "rural", "mountainous", "oceanfront", "forest", "desert", "island", "modern city", "ancient ruins", "castle", "village", "meadow", "cave", "bridge".
63
+ - Prominent Objects: "giant tree", "waterfall", "stream", "rock formation", "ancient artifact", "bonfire", "tent", "vehicle", "statue", "fountain".
64
+ - Visual Enhancements: "masterpiece", "cinematic lighting", "detailed lighting", "soft lighting", "volumetric lighting", "tyndall effect", "warm lighting", "close up", "wide shot", "beautiful perspective", "bokeh".
65
+ Do note that this list isn't exhaustive, and you're encouraged to suggest similar terms not included here.
66
+ Exclude words from the suggestion that are redundant or have conflicting meanings.
67
+ In particular, exclude words that conflict with the meaning of "main_sentence".
68
+ Do not output anything other than JSON values.
69
+ Do not provide any additional explanation of the following.
70
+ Only JSON is allowed.
71
+ ===
72
+ Here are some examples.
73
+ Q:
74
+ The genre is "Fantasy", the place is "Enchanted Forest", the mood is "Mystical", the title of the novel is "Whispering Leaves", and the chapter plot revolves around "A hidden glade where elves sing under the moonlight".
75
+ Print out no more than 45 words in syntactically valid JSON format.
76
+ A:
77
+ {{"main_sentence":"a mystical glade in an enchanted forest where elves sing beneath the moonlight","descriptors":["no humans","masterpiece","fantasy","enchanted forest","moonlit glade","mystical atmosphere","singing elves","luminous fireflies","ancient trees","shimmering leaves","whispering winds","hidden secrets","elven magic","masterpiece","soft lighting","silver glow","detailed shadows","enchanted mood","highly detailed","film grain"]}}
78
+
79
+ Q:
80
+ The genre is "Science Fiction", the place is "Galactic Space Station", the mood is "Tense", the title of the novel is "Stars Unbound", and the chapter plot revolves around "Ambassadors from different galaxies discussing a new treaty".
81
+ Print out no more than 45 words in syntactically valid JSON format.
82
+ A:
83
+ {{"main_sentence":"a tense gathering in a galactic space station where interstellar ambassadors negotiate","descriptors":["no humans","masterpiece","science fiction","galactic space station","star-studded backdrop","advanced technology","diverse aliens","hovering spacecrafts","futuristic architecture","tense discussions","interstellar politics","neon lights","holographic displays","masterpiece","detailed lighting","cinematic mood","highly detailed","film grain"]}}
84
+
85
+ Q:
86
+ The genre is "Romance", the place is "Beach", the mood is "Heartfelt", the title of the novel is "Waves of Passion", and the chapter plot revolves around "Two lovers reconciling their differences by the shore".
87
+ Print out no more than 45 words in syntactically valid JSON format.
88
+ A:
89
+ {{"main_sentence":"a heartfelt scene on a beach during sunset where two lovers reconcile","descriptors":["no humans","masterpiece","romance","beach","sunset horizon","golden sands","lapping waves","embrace","teary-eyed confessions","seashells","reflective waters","warm hues","silhouette of lovers","soft breeze","beautiful perspective","detailed shadows","emotional atmosphere","highly detailed","film grain"]}}
90
+
91
+ Q:
92
+ The genre is "Middle Ages", the place is "Royal Palace", the mood is "Epic Adventure", the title of the novel is "Throne of Fates", and the chapter plot revolves around "A brave knight receiving a quest from the king".
93
+ Print out no more than 45 words in syntactically valid JSON format.
94
+ A:
95
+ {{"main_sentence":"an epic scene in a royal palace where a knight is tasked with a quest by the king","descriptors":["no humans","masterpiece","middle ages","royal palace","castle","grand throne room","golden hour","armored knight","majestic king","tapestries","stone walls","torches","glistening armor","banner flags","medieval atmosphere","heroic demeanor","detailed architecture","golden crowns","highly detailed","film grain"]}}
96
+ ===
97
+ This is my request.
98
+ Q:
99
+ {input}
100
+ A:
101
+ """
102
+ query = """
103
+ The genre is "{genre}", the place is "{place}", the mood is "{mood}", the title of the novel is "{title}", and the chapter plot revolves around "{chapter_plot}".
104
+ Print out no more than 45 words in syntactically valid JSON format.
105
+ """
106
+
107
+ [music_gen]
108
+ gen_prompt = """Based on my brief descriptions of the novel's mood, theme, or setting, suggest a "primary descriptive sentence" to conceptualize the musical piece. Ensure you consider elements like the music's genre, BPM, primary instruments, emotions evoked, era (if applicable), and other relevant musical characteristics.
109
+ Once complete, please output only a single "primary descriptive sentence" in a syntactically valid JSON format.
110
+ The output template is as follows:
111
+ {{"primary_sentence":"primary descriptive sentence"}}.
112
+ To enhance the quality of your music's description or expression, you might consider drawing from the following categories:
113
+ - Musical Genre and Era: "80s", "90s", "classical", "jazz", "EDM", "rock", "folk", "baroque", "bebop", "grunge", "funk", "hip-hop", "blues", "country".
114
+ - BPM and Rhythm: "slow-paced", "mid-tempo", "upbeat", "rhythmic", "syncopated", "steady beat", "dynamic tempo".
115
+ - Primary Instruments and Sound: "guitar", "synth", "piano", "saxophone", "drums", "violin", "flute", "bassy", "treble-heavy", "distorted", "acoustic", "electric", "ambient sounds".
116
+ - Emotions and Atmosphere: "nostalgic", "energetic", "melancholic", "uplifting", "dark", "light-hearted", "intense", "relaxing", "haunting", "joyful", "sombre", "celebratory", "mystical".
117
+ - Musical Techniques and Enhancements: "harmonious", "dissonant", "layered", "minimalistic", "rich textures", "simple melody", "complex rhythms", "vocal harmonies", "instrumental solo".
118
+ Do note that this list isn't exhaustive, and you're encouraged to suggest similar terms not included here.
119
+ Exclude words from the suggestion that are redundant or have conflicting meanings.
120
+ In particular, exclude words that conflict with the meaning of "primary_sentence".
121
+ Do not output anything other than JSON values.
122
+ Do not provide any additional explanation of the following.
123
+ Only JSON is allowed.
124
+ ===
125
+ Here are some examples.
126
+ Q:
127
+ The genre is "Fantasy", the place is "Enchanted Forest", the mood is "Mystical", the title of the novel is "Whispering Leaves", and the chapter plot revolves around "A hidden glade where elves sing under the moonlight".
128
+ A:
129
+ {{"main_sentence":"a gentle folk melody filled with whimsical flutes, echoing harps, and distant ethereal vocals, capturing the enchantment of a moonlit forest and the mystique of singing elves"}}
130
+
131
+ Q:
132
+ The genre is "Science Fiction", the place is "Galactic Space Station", the mood is "Tense", the title of the novel is "Stars Unbound", and the chapter plot revolves around "Ambassadors from different galaxies discussing a new treaty".
133
+ A:
134
+ {{"main_sentence":"an ambient electronic track, with pulsating synths, spacey reverberations, and occasional digital glitches, reflecting the vastness of space and the tension of intergalactic diplomacy"}}
135
+
136
+ Q:
137
+ The genre is "Romance", the place is "Beach", the mood is "Heartfelt", the title of the novel is "Waves of Passion", and the chapter plot revolves around "Two lovers reconciling their differences by the shore".
138
+ A:
139
+ {{"main_sentence":"a soft acoustic ballad featuring soulful guitars, delicate percussion, and heartfelt vocals, evoking feelings of love, reconciliation, and the gentle ebb and flow of the ocean waves"}}
140
+
141
+ Q:
142
+ The genre is "Middle Ages", the place is "Royal Palace", the mood is "Epic Adventure", the title of the novel is "Throne of Fates", and the chapter plot revolves around "A brave knight receiving a quest from the king".
143
+ A:
144
+ {{"main_sentence":"a grand orchestral piece, dominated by powerful brass, rhythmic drums, and soaring strings, portraying the valor of knights, the majesty of royalty, and the anticipation of an epic quest"}}
145
+ ===
146
+ This is my request.
147
+ Q:
148
+ {input}
149
+ A:
150
+ """
151
+ query = """
152
+ The genre is "{genre}", the place is "{place}", the mood is "{mood}", the title of the novel is "{title}", and the chapter plot revolves around "{chapter_plot}".
153
+ Print out only one main_sentence in syntactically valid JSON format.
154
+ """
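The templates above double the braces around their JSON examples (`{{`, `}}`) and leave `{input}`, `{genre}`, and the other placeholders single, which is the convention Python's `str.format` expects. The following is a minimal sketch of how such a pair of templates could be filled, assuming `str.format` is used (the actual loader and call sites are not part of this section). The template text is shortened, and `build_prompt` is a hypothetical helper name.

```python
import json

# Shortened stand-ins for the gen_prompt / query templates above.
# Assumption: the real templates are filled with str.format.
GEN_PROMPT = """Only JSON is allowed.
===
Q:
The genre is "Fantasy".
A:
{{"primary_sentence":"a gentle folk melody"}}
===
Q:
{input}
A:
"""

QUERY = """
The genre is "{genre}", the place is "{place}", the mood is "{mood}", the title of the novel is "{title}", and the chapter plot revolves around "{chapter_plot}".
"""


def build_prompt(genre, place, mood, title, chapter_plot):
    """Hypothetical helper: fill the query first, then embed it in the prompt."""
    query = QUERY.format(
        genre=genre, place=place, mood=mood,
        title=title, chapter_plot=chapter_plot,
    )
    # The doubled braces in GEN_PROMPT collapse to literal { and } here.
    return GEN_PROMPT.format(input=query.strip())


prompt = build_prompt(
    "Romance", "Beach", "Heartfelt", "Waves of Passion", "Two lovers reconcile"
)

# The few-shot example line is valid JSON once the braces are un-doubled.
example_line = next(line for line in prompt.splitlines() if line.startswith("{"))
example = json.loads(example_line)
```

Because the substituted value is not re-parsed by `str.format`, braces that appear inside the user's own text do not need escaping; only braces written directly in the template do.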
assets/recording.mp4 ADDED
Binary file (141 kB). View file
 
assets/user.png ADDED

Git LFS Details

  • SHA256: 4bb4ec64c526d55185780954aeebe68bc902cba0b58db29b318a128122206c72
  • Pointer size: 130 Bytes
  • Size of remote file: 37.4 kB
constants/__init__.py ADDED
File without changes
constants/css.py ADDED
@@ -0,0 +1,186 @@
1
+ STYLE = """
2
+ .main {
3
+ width: 75% !important;
4
+ margin: auto;
5
+ }
6
+
7
+ .ninty-five-width {
8
+ width: 95% !important;
9
+ margin: auto;
10
+ }
11
+
12
+ .center-label > label > span {
13
+ display: block !important;
14
+ text-align: center;
15
+ }
16
+
17
+ .no-label {
18
+ padding: 0px !important;
19
+ }
20
+
21
+ .no-label > label > span {
22
+ display: none;
23
+ }
24
+
25
+ .wrap {
26
+ min-width: 0px !important;
27
+ }
28
+
29
+ .markdown-center {
30
+ text-align: center;
31
+ }
32
+
33
+ .markdown-justify {
34
+ text-align: justify !important;
35
+ }
36
+
37
+ .markdown-left {
38
+ text-align: left;
39
+ }
40
+
41
+ .markdown-left > div:nth-child(2) {
42
+ padding-top: 10px !important;
43
+ }
44
+
45
+ .markdown-center > div:nth-child(2) {
46
+ padding-top: 10px;
47
+ }
48
+
49
+ .no-gap {
50
+ flex-wrap: initial !important;
51
+ gap: initial !important;
52
+ }
53
+
54
+ .no-width {
55
+ min-width: 0px !important;
56
+ }
57
+
58
+ .icon-buttons {
59
+ display: none !important;
60
+ }
61
+
62
+ .title-width {
63
+ display: content !important;
64
+ }
65
+
66
+ .left-margin {
67
+ padding-left: 50px;
68
+ background-color: transparent;
69
+ border: none;
70
+ }
71
+
72
+ .no-border > div:nth-child(1){
73
+ border: none;
74
+ background: transparent;
75
+ }
76
+
77
+ textarea {
78
+ border: none !important;
79
+ border-radius: 0px !important;
80
+ --block-background-fill: transparent !important;
81
+ }
82
+
83
+ #chatbot {
84
+ height: 800px !important;
85
+ box-shadow: 6px 5px 10px 1px rgba(255, 221, 71, 0.15);
86
+ border-color: beige;
87
+ border-width: 2px;
88
+ }
89
+
90
+ #chatbot .wrapper {
91
+ height: 660px;
92
+ }
93
+
94
+ .small-big-textarea > label > textarea {
95
+ font-size: 12pt !important;
96
+ }
97
+
98
+ .control-button {
99
+ background: none !important;
100
+ border-color: #69ade2 !important;
101
+ border-width: 2px !important;
102
+ color: #69ade2 !important;
103
+ }
104
+
105
+ .control-button-green {
106
+ background: none !important;
107
+ border-color: #51ad00 !important;
108
+ border-width: 2px !important;
109
+ color: #51ad00 !important;
110
+ }
111
+
112
+ .small-big {
113
+ font-size: 15pt !important;
114
+ }
115
+
116
+ .no-label-chatbot > div > div:nth-child(1) {
117
+ display: none;
118
+ }
119
+
120
+ #chat-section {
121
+ position: fixed;
122
+ align-self: end;
123
+ width: 65%;
124
+ z-index: 10000;
125
+ border: none !important;
126
+ background: none;
127
+ padding-left: 0px;
128
+ padding-right: 0px;
129
+ }
130
+
131
+ #chat-section > div:nth-child(3) {
132
+ # background: white;
133
+ }
134
+
135
+ #chat-section .form {
136
+ position: relative !important;
137
+ bottom: 130px;
138
+ width: 90%;
139
+ margin: auto;
140
+ border-radius: 20px;
141
+ }
142
+
143
+ #chat-section .icon {
144
+ display: none;
145
+ }
146
+
147
+ #chat-section .label-wrap {
148
+ text-align: right;
149
+ display: block;
150
+ }
151
+
152
+ #chat-section .label-wrap span {
153
+ font-size: 30px;
154
+ }
155
+
156
+ #chat-buttons {
157
+ position: relative !important;
158
+ bottom: 130px;
159
+ width: 90%;
160
+ margin: auto;
161
+ }
162
+
163
+ @media only screen and (max-width: 500px) {
164
+ .main {
165
+ width: 100% !important;
166
+ margin: auto;
167
+ }
168
+
169
+ #chat-section {
170
+ width: 95%;
171
+ }
172
+ }
173
+
174
+ .font-big textarea {
175
+ font-size: 19pt !important;
176
+ text-align: center;
177
+ }
178
+
179
+ .no-label-image-audio > div:nth-child(2) {
180
+ display: none;
181
+ }
182
+
183
+ .no-label-radio > span {
184
+ display: none;
185
+ }
186
+ """
constants/desc.py ADDED
@@ -0,0 +1,17 @@
1
+ pre_phase_description = """
2
+ Zero2Story is a framework built on top of the [PaLM API](https://developers.generativeai.google), [Stable Diffusion](https://en.wikipedia.org/wiki/Stable_Diffusion), and [MusicGen](https://audiocraft.metademolab.com/musicgen.html) that lets ordinary people create their own stories. The framework consists of the **background setup**, **character setup**, and **interactive story generation** phases.
3
+ """
4
+
5
+ background_setup_phase_description = """
6
+ In this phase, users can set up the genre, place, and mood of the story. The genre is the key choice, because the other options depend on it.
7
+ """
8
+ character_setup_phase_description = """
9
+ In this phase, users can set up as many as four characters. For each character, users can choose basic information and characteristics such as name, age, MBTI, and personality. An image of each character can also be generated from this information using Stable Diffusion.
10
+
11
+ The PaLM API translates the given character information into a list of keywords that Stable Diffusion can interpret effectively. Stable Diffusion then generates images using those keywords as a prompt.
12
+ """
13
+ story_generation_phase_description = """
14
+ In this phase, the first few paragraphs are generated solely from the information given in the background and character setup phases. Afterwards, users choose a direction from three options generated by the PaLM API, and the story continues based on that choice. This cycle of choosing an option and generating more of the story repeats until the user decides to stop.
15
+
16
+ In each story generation turn, users can also generate background images and music that describe the scene using Stable Diffusion and MusicGen. If users are not satisfied with the generated story, options, image, or music in a turn, they can ask to regenerate them.
17
+ """
constants/init_values.py ADDED
@@ -0,0 +1,49 @@
1
+ genres = ["Middle Ages", "Cyberpunk", "Science Fiction", "Horror", "Romance", "Mystery", "Thriller", "Survival", "Post-apocalyptic", "Historical Fiction"]
2
+
3
+ places = {
4
+ "Middle Ages": ["Royal Palace", "Small Village", "Enchanted Forest", "Church", "City Walls and Beyond", "Wizard's Tower", "Inn", "Battlefield", "Grand Library", "Royal Gardens"],
5
+ "Cyberpunk": ["Neon-lit City Streets", "Underground Bar", "Rave Club", "Tech Market", "Hacker Lounge", "Metropolis Central", "Virtual Reality Hub", "Flying Car Docking Station", "Illegal Cybernetic Clinic", "Information Trade Point"],
6
+ "Science Fiction": ["Space Station", "Futuristic City", "Alien Planet", "Hidden Moon Base", "Cybernetic Hub", "Galactic Headquarters", "Robotics Factory", "Intergalactic Trading Post", "Alien Cultural Center", "Virtual Reality Realm"],
7
+ "Horror": ["Abandoned House", "Cemetery", "Mental Hospital", "Cathedral", "Forest", "Museum", "Basement", "Abandoned Theme Park", "Abandoned School", "Dark Alley"],
8
+ "Romance": ["Beach", "Library", "Starlit Bridge", "Lake", "Flower Shop", "Candlelit Restaurant", "Garden", "Cobblestone Alley", "Windy Road", "Ocean View Deck"],
9
+ "Mystery": ["Haunted House", "Ancient Castle", "Secret Lab", "Dark City Alleyways", "Underground Laboratory", "Historic Art Museum", "Antique Library", "Mythical Ruins", "Modern City Skyscraper", "Deserted Island"],
10
+ "Thriller": ["Labyrinth", "Abandoned Hospital", "Downtown Alleyway", "Locked Room", "Basement", "Cabin in the Woods", "Abandoned Amusement Park", "Police Station", "Underground Warehouse", "Secret Research Lab"],
11
+ "Survival": ["Desert", "Forest", "Glacier", "Urban Ruins", "Underwater", "Island", "Mountain Range", "Stormy Ocean", "Wasteland", "Jungle"],
12
+ "Post-apocalyptic": ["Abandoned City", "Underground Bunker", "Desert Wastelands", "Radioactive Zones", "Ruined Metropolis", "Overgrown Countryside", "Fortified Community", "Lost Library", "Strategic Bridge", "Ghost Town"],
13
+ "Historical Fiction": ["Castle", "Ancient City", "Countryside", "Temple", "Town Square", "Expedition Base", "Fortress", "Royal Court", "Medieval Market", "Training Ground"]
14
+ }
15
+
16
+ moods = {
17
+ "Middle Ages": ["Epic Adventure", "Deep Romance", "Intense Tension", "Mystical and Magical", "Honor and Principle", "Pain and Despair", "Danger and Peril", "Grand Feast and Court Life", "Hope in Darkness", "Traditional National and Cultural"],
18
+ "Cyberpunk": ["Neon Nights", "Rain-soaked Ambiance", "Electric Energy", "Holographic Illusions", "Cyber Rhythm", "Dark Alley Mysteries", "High-speed Chase", "Augmented Reality Fashion", "Tech-induced Uncertainty", "Tranquility amidst Chaos"],
19
+ "Science Fiction": ["Technological Advancement", "First Contact", "Galactic Warfare", "Deep Space Exploration", "Intergalactic Romance", "Survival in Space", "Political Intrigue", "Covert Operations", "Interstellar Festival", "Technological Dystopia"],
20
+ "Horror": ["Ominous", "Mysterious", "Brutal", "Supernatural", "Intense", "Unexpected", "Silent Horror", "Confusing", "Insanity", "Atmospheric Horror"],
21
+ "Romance": ["Poetic", "Dreamy", "Heartfelt", "Cheerful", "Melancholic", "Innocent", "Exhilarating", "Sweet", "Cozy", "Sunlit"],
22
+ "Mystery": ["Dark and Gritty", "Silent Suspense", "Time-sensitive Thrill", "Unpredictable Twist", "Momentary Peace", "Unknown Anxiety", "Suspicion and Uncertainty", "Unsettling Atmosphere", "Shocking Revelation", "Loneliness and Isolation"],
23
+ "Thriller": ["Uneasiness", "Suspicion", "Tension", "Anxiety", "Chase", "Mystery", "Darkness", "Escape", "Secrecy", "Danger"],
24
+ "Survival": ["Desperate", "Tense", "Adventurous", "Dangerous", "Frightening", "Desolate", "Primitive", "Stealthy", "Stagnant", "Clinical"],
25
+ "Post-apocalyptic": ["Struggle for Survival", "Beacon of Hope", "Mistrust and Suspicion", "Constant Danger", "Sole Survivor", "Gradual Recovery", "Rebellion Against Oppression", "Pockets of Serenity", "Nature's Emptiness", "Desperate Solidarity"],
26
+ "Historical Fiction": ["Anticipation", "Awe", "Tranquility", "Tension", "Festive", "Mysterious", "Unexpected", "Focused", "Dichotomy"]
27
+ }
28
+
29
+ jobs = {
30
+ "Middle Ages": ["Knight", "Archer", "Wizard/Mage", "Ruler", "Cleric/Priest", "Merchant", "Blacksmith", "Bard", "Barbarian", "Alchemist"],
31
+ "Cyberpunk": ["Hacker", "Bounty Hunter", "Corporate Executive", "Rebel", "Data Courier", "Cyborg", "Street Mercenary", "Investigative Journalist", "VR Designer", "Virtual Artist"],
32
+ "Science Fiction": ["Astronaut", "Space Engineer", "Exoplanet Researcher", "Xenobiologist", "Space Bounty Hunter", "Starship Explorer", "AI Developer", "Intergalactic Trader", "Galactic Diplomat", "Virtual Reality Game Developer"],
33
+ "Horror": ["Doctor", "Detective", "Artist", "Nurse", "Astrologer", "Shaman", "Exorcist", "Journalist", "Scientist", "Gravekeeper"],
34
+ "Romance": ["Novelist", "Florist", "Barista", "Violinist", "Actor", "Photographer", "Diary Keeper", "Fashion Designer", "Chef", "Traveler"],
35
+ "Mystery": ["Detective", "Investigative Journalist", "Crime Scene Investigator", "Mystery Novelist", "Defense Attorney", "Psychologist", "Archaeologist", "Secret Agent", "Hacker", "Museum Curator"],
36
+ "Thriller": ["Detective", "Journalist", "Forensic Scientist", "Hacker", "Police Officer", "Profiler", "Secret Agent", "Security Specialist", "Fraud Investigator", "Criminal Psychologist"],
37
+ "Survival": ["Explorer", "Marine", "Jungle Guide", "Rescue Worker", "Survivalist", "Mountaineer", "Diver", "Pilot", "Extreme Weather Researcher", "Hunter"],
38
+ "Post-apocalyptic": ["Scout", "Survivalist", "Archaeologist", "Trader", "Mechanic", "Medical Aid", "Militia Leader", "Craftsman", "Farmer", "Builder"],
39
+ "Historical Fiction": ["Knight", "Explorer", "Diplomat", "Historian", "General", "Monarch", "Merchant", "Archer", "Landlord", "Priest"]
40
+ }
41
+
42
+ ages = ["10s", "20s", "30s", "40s", "50s"]
43
+ mbtis = ["ESTJ", "ENTJ", "ESFJ", "ENFJ", "ISTJ", "ISFJ", "INTJ", "INFJ", "ESTP", "ESFP", "ENTP", "ENFP", "ISTP", "ISFP", "INTP", "INFP"]
44
+ random_names = ["Aaron", "Abigail", "Adam", "Adrian", "Alan", "Alexandra", "Alyssa", "Amanda", "Amber", "Amy", "Andrea", "Andrew", "Angela", "Angelina", "Anthony", "Antonio", "Ashley", "Austin", "Benjamin", "Brandon", "Brian", "Brittany", "Brooke", "Bruce", "Bryan", "Caleb", "Cameron", "Carol", "Caroline", "Catherine", "Charles", "Charlotte", "Chase", "Chelsea", "Christopher", "Cody", "Colin", "Connor", "Cooper", "Corey", "Cristian", "Daniel", "David", "Deborah", "Denise", "Dennis", "Derek", "Diana", "Dorothy", "Douglas", "Dylan", "Edward", "Elizabeth", "Emily", "Emma", "Eric", "Ethan", "Evan", "Gabriel", "Gavin", "George", "Gina", "Grace", "Gregory", "Hannah", "Harrison", "Hayden", "Heather", "Helen", "Henry", "Holly", "Hope", "Hunter", "Ian", "Isaac", "Isabella", "Jack", "Jacob", "James", "Jason", "Jeffrey", "Jenna", "Jennifer", "Jessica", "Jesse", "Joan", "John", "Jonathan", "Joseph", "Joshua", "Justin", "Kayla", "Kevin", "Kimberly", "Kyle", "Laura", "Lauren", "Lawrence", "Leah", "Leo", "Leslie", "Levi", "Lewis", "Liam", "Logan", "Lucas", "Lucy", "Luis", "Luke", "Madison", "Maegan", "Maria", "Mark", "Matthew", "Megan", "Michael", "Michelle", "Molly", "Morgan", "Nathan", "Nathaniel", "Nicholas", "Nicole", "Noah", "Olivia", "Owen", "Paige", "Parker", "Patrick", "Paul", "Peter", "Philip", "Phoebe", "Rachel", "Randy", "Rebecca", "Richard", "Robert", "Roger", "Ronald", "Rose", "Russell", "Ryan", "Samantha", "Samuel", "Sandra", "Sarah", "Scott", "Sean", "Sebastian", "Seth", "Shannon", "Shawn", "Shelby", "Sierra", "Simon", "Sophia", "Stephanie", "Stephen", "Steven", "Sue", "Susan", "Sydney", "Taylor", "Teresa", "Thomas", "Tiffany", "Timothy", "Todd", "Tom", "Tommy", "Tracy", "Travis", "Tyler", "Victoria", "Vincent", "Violet", "Warren", "William", "Zach", "Zachary", "Zoe"]
45
+ personalities = ['Optimistic', 'Kind', 'Resilient', 'Generous', 'Humorous', 'Creative', 'Empathetic', 'Ambitious', 'Adventurous']
46
+
47
+ default_character_images = ["assets/image.png"]
48
+
49
+ styles = ["sd character", "cartoon", "realistic"]
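The `places`, `moods`, and `jobs` tables above are all keyed by genre, so picking a genre determines which choices are valid for everything else. A minimal sketch of that lookup follows, using trimmed copies of the tables. `options_for` and `random_setup` are hypothetical helper names and are not taken from the repository.

```python
import random

# Trimmed copies of the tables above (two genres only) to keep the sketch short.
genres = ["Middle Ages", "Romance"]
places = {
    "Middle Ages": ["Royal Palace", "Small Village"],
    "Romance": ["Beach", "Starlit Bridge"],
}
moods = {
    "Middle Ages": ["Epic Adventure", "Deep Romance"],
    "Romance": ["Poetic", "Dreamy"],
}
jobs = {
    "Middle Ages": ["Knight", "Archer"],
    "Romance": ["Novelist", "Florist"],
}


def options_for(genre):
    """Hypothetical helper: return the choices that are valid for one genre."""
    if genre not in places:
        raise KeyError(f"unknown genre: {genre!r}")
    return {"places": places[genre], "moods": moods[genre], "jobs": jobs[genre]}


def random_setup(rng):
    """Hypothetical helper: pick a genre, then pick from that genre's options."""
    genre = rng.choice(genres)
    opts = options_for(genre)
    return {
        "genre": genre,
        "place": rng.choice(opts["places"]),
        "mood": rng.choice(opts["moods"]),
        "job": rng.choice(opts["jobs"]),
    }


# A seeded generator keeps the example reproducible.
setup = random_setup(random.Random(0))
```

Keeping the genre as the single dictionary key means a UI only has to refresh the dependent dropdowns when the genre changes.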
interfaces/chat_ui.py ADDED
@@ -0,0 +1,135 @@
1
+ import gradio as gr
2
+
3
+ from interfaces import utils
4
+ from modules import palmchat
5
+
6
+ from pingpong import PingPong
7
+
8
+ def rollback_last_ui(history):
9
+ return history[:-1]
10
+
11
+ async def chat(
12
+ user_input, chat_mode, chat_state,
13
+ genre, place, mood,
14
+ name1, age1, mbti1, personality1, job1,
15
+ name2, age2, mbti2, personality2, job2,
16
+ name3, age3, mbti3, personality3, job3,
17
+ name4, age4, mbti4, personality4, job4,
18
+ chapter1_title, chapter2_title, chapter3_title, chapter4_title,
19
+ chapter1_plot, chapter2_plot, chapter3_plot, chapter4_plot
20
+ ):
21
+ chapter_title_ctx = ""
22
+ if chapter1_title != "":
23
+ chapter_title_ctx = f"""
24
+ chapter1 {{
25
+ title: {chapter1_title},
26
+ plot: {chapter1_plot}
27
+ }}
28
+
29
+ chapter2 {{
30
+ title: {chapter2_title},
31
+ plot: {chapter2_plot}
32
+ }}
33
+
34
+ chapter3 {{
35
+ title: {chapter3_title},
36
+ plot: {chapter3_plot}
37
+ }}
38
+
39
+ chapter4 {{
40
+ title: {chapter4_title},
41
+ plot: {chapter4_plot}
42
+ }}
43
+ """
44
+
45
+ ctx = f"""You are a professional writing advisor, especially specialized in developing ideas on plotting stories and creating characters. I provide genre, where, and mood along with the rough description of one main character and three side characters.
46
+
47
+ Give creative but not too long responses based on the following information.
48
+
49
+ genre: {genre}
50
+ where: {place}
51
+ mood: {mood}
52
+
53
+ main character: {{
54
+ name: {name1},
55
+ job: {job1},
56
+ age: {age1},
57
+ mbti: {mbti1},
58
+ personality: {personality1}
59
+ }}
60
+
61
+ side character1: {{
62
+ name: {name2},
63
+ job: {job2},
64
+ age: {age2},
65
+ mbti: {mbti2},
66
+ personality: {personality2}
67
+ }}
68
+
69
+ side character2: {{
70
+ name: {name3},
71
+ job: {job3},
72
+ age: {age3},
73
+ mbti: {mbti3},
74
+ personality: {personality3}
75
+ }}
76
+
77
+ side character3: {{
78
+ name: {name4},
79
+ job: {job4},
80
+ age: {age4},
81
+ mbti: {mbti4},
82
+ personality: {personality4}
83
+ }}
84
+
85
+ {chapter_title_ctx}
86
+ """
87
+
88
+ ppm = chat_state[chat_mode]
89
+ ppm.ctx = ctx
90
+ ppm.add_pingpong(
91
+ PingPong(user_input, '')
92
+ )
93
+ prompt = utils.build_prompts(ppm)
94
+
95
+ response_txt = await utils.get_chat_response(prompt, ctx=ctx)
96
+ ppm.replace_last_pong(response_txt)
97
+
98
+ chat_state[chat_mode] = ppm
99
+
100
+ return (
101
+ "",
102
+ chat_state,
103
+ ppm.build_uis(),
104
+ gr.update(interactive=True)
105
+ )
106
+
107
+ async def chat_regen(chat_mode, chat_state):
108
+ ppm = chat_state[chat_mode]
109
+
110
+ user_input = ppm.pingpongs[-1].ping
111
+ ppm.pingpongs = ppm.pingpongs[:-1]
112
+ ppm.add_pingpong(
113
+ PingPong(user_input, '')
114
+ )
115
+ prompt = utils.build_prompts(ppm)
116
+
117
+ response_txt = await utils.get_chat_response(prompt, ctx=ppm.ctx)
118
+ ppm.replace_last_pong(response_txt)
119
+
120
+ chat_state[chat_mode] = ppm
121
+
122
+ return (
123
+ chat_state,
124
+ ppm.build_uis()
125
+ )
126
+
127
+ def chat_reset(chat_mode, chat_state):
128
+ chat_state[chat_mode] = palmchat.GradioPaLMChatPPManager()
129
+
130
+ return (
131
+ "",
132
+ chat_state,
133
+ [],
134
+ gr.update(interactive=False)
135
+ )
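`chat` and `chat_regen` above share one pattern: append an exchange with an empty reply, ask the model, then fill the reply in. Regeneration drops the last exchange and replays the same user input. The sketch below illustrates that flow with small stand-in classes rather than the `pingpong` package. `Exchange`, `History`, `fake_model`, `send`, and `regenerate` are all hypothetical names, and `fake_model` replaces the real PaLM call.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Exchange:
    ping: str        # user input
    pong: str = ""   # model reply, filled in later


@dataclass
class History:
    exchanges: list = field(default_factory=list)

    def add(self, exchange):
        self.exchanges.append(exchange)

    def replace_last_pong(self, text):
        self.exchanges[-1].pong = text


async def fake_model(prompt, attempt):
    """Stand-in for the remote chat call; returns a deterministic reply."""
    await asyncio.sleep(0)
    return f"reply #{attempt} to: {prompt}"


async def send(history, user_input, attempt=1):
    # Add the exchange with an empty reply first, then fill the reply in.
    history.add(Exchange(user_input))
    reply = await fake_model(user_input, attempt)
    history.replace_last_pong(reply)


async def regenerate(history, attempt=2):
    # Drop the last exchange and replay the same user input.
    last_input = history.exchanges[-1].ping
    history.exchanges = history.exchanges[:-1]
    await send(history, last_input, attempt)


history = History()
asyncio.run(send(history, "Suggest a plot twist"))
first_reply = history.exchanges[-1].pong
asyncio.run(regenerate(history))
second_reply = history.exchanges[-1].pong
```

Replaying through the same `send` path keeps the history length unchanged after a regeneration, which matches how `chat_regen` slices off the last entry before adding it back.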
interfaces/plot_gen_ui.py ADDED
@@ -0,0 +1,227 @@
1
+ import re
2
+ import gradio as gr
3
+ from interfaces import utils
4
+ from modules import palmchat
5
+
6
+ def _add_side_character(
7
+ enable, prompt, cur_side_chars,
8
+ name, job, age, mbti, personality
9
+ ):
10
+ if enable:
11
+ prompt = prompt + f"""
12
+ side character #{cur_side_chars}
13
+ - name: {name},
14
+ - job: {job},
15
+ - age: {age},
16
+ - mbti: {mbti},
17
+ - personality: {personality}
18
+
19
+ """
20
+ cur_side_chars = cur_side_chars + 1
21
+
22
+ return prompt, cur_side_chars
23
+
24
+
25
+ async def plot_gen(
26
+ temperature,
27
+ genre, place, mood,
28
+ side_char_enable1, side_char_enable2, side_char_enable3,
29
+ name1, age1, mbti1, personality1, job1,
30
+ name2, age2, mbti2, personality2, job2,
31
+ name3, age3, mbti3, personality3, job3,
32
+ name4, age4, mbti4, personality4, job4,
33
+ ):
34
+ cur_side_chars = 1
35
+ prompt = f"""Write a title and an outline of a novel based on the background information below, following Ronald Tobias's plot theory. The outline should follow the "rising action", "crisis", "climax", "falling action", and "denouement" plot types. Each should be filled with at least two VERY detailed and descriptive paragraphs of string. Randomly choose whether the story turns out optimistic or tragic.
36
+
37
+ background information:
38
+ - genre: string
39
+ - where: string
40
+ - mood: string
41
+
42
+ main character
43
+ - name: string
44
+ - job: string
45
+ - age: string
46
+ - mbti: string
47
+ - personality: string
48
+
49
+ JSON output:
50
+ {{
51
+ "title": "string",
52
+ "outline": {{
53
+ "rising action": "paragraphs of string",
54
+ "crisis": "paragraphs of string",
55
+ "climax": "paragraphs of string",
56
+ "falling action": "paragraphs of string",
57
+ "denouement": "paragraphs of string"
58
+ }}
59
+ }}
60
+
61
+ background information:
62
+ - genre: {genre}
63
+ - where: {place}
64
+ - mood: {mood}
65
+
66
+ main character
67
+ - name: {name1}
68
+ - job: {job1}
69
+ - age: {age1}
70
+ - mbti: {mbti1}
71
+ - personality: {personality1}
72
+
73
+ """
74
+
75
+ prompt, cur_side_chars = _add_side_character(
76
+ side_char_enable1, prompt, cur_side_chars,
77
+ name2, job2, age2, mbti2, personality2
78
+ )
79
+ prompt, cur_side_chars = _add_side_character(
80
+ side_char_enable2, prompt, cur_side_chars,
81
+ name3, job3, age3, mbti3, personality3
82
+ )
83
+ prompt, cur_side_chars = _add_side_character(
84
+ side_char_enable3, prompt, cur_side_chars,
85
+ name4, job4, age4, mbti4, personality4
86
+ )
87
+
88
+ prompt = prompt + "JSON output:\n"
89
+
90
+ print(f"generated prompt:\n{prompt}")
91
+ parameters = {
92
+ 'model': 'models/text-bison-001',
93
+ 'candidate_count': 1,
94
+ 'temperature': temperature,
95
+ 'top_k': 40,
96
+ 'top_p': 1,
97
+ 'max_output_tokens': 4096,
98
+ }
99
+ response_json = await utils.retry_until_valid_json(prompt, parameters=parameters)
100
+
101
+ return (
102
+ response_json['title'],
103
+ f"## {response_json['title']}",
104
+ response_json['outline']['rising action'],
105
+ response_json['outline']['crisis'],
106
+ response_json['outline']['climax'],
107
+ response_json['outline']['falling action'],
108
+ response_json['outline']['denouement'],
109
+ )
110
+
111
+
112
+ async def first_story_gen(
113
+ title,
114
+ rising_action, crisis, climax, falling_action, denouement,
115
+ genre, place, mood,
116
+ side_char_enable1, side_char_enable2, side_char_enable3,
117
+ name1, age1, mbti1, personality1, job1,
118
+ name2, age2, mbti2, personality2, job2,
119
+ name3, age3, mbti3, personality3, job3,
120
+ name4, age4, mbti4, personality4, job4,
121
+ cursors, cur_cursor
122
+ ):
123
+ cur_side_chars = 1
124
+
125
+ prompt = f"""Write the chapter title and the first few paragraphs of the "rising action" plot based on the background information below in Ronald Tobias's plot theory. Also, suggest three choosable actions to drive current story in different directions. The first few paragraphs should be filled with a VERY MUCH detailed and descriptive at least two paragraphs of string.
126
+
127
+ REMEMBER the first few paragraphs should not end the whole story and allow leeway for the next paragraphs to come.
128
+ The whole story SHOULD stick to the "rising action -> crisis -> climax -> falling action -> denouement" flow, so REMEMBER not to write anything mentioned from the next plots of crisis, climax, falling action, and denouement yet.
129
+
130
+ background information:
131
+ - genre: string
132
+ - where: string
133
+ - mood: string
134
+
135
+ main character
136
+ - name: string
137
+ - job: string
138
+ - age: string
139
+ - mbti: string
140
+ - personality: string
141
+
142
+ overall outline
143
+ - title: string
144
+ - rising action: string
145
+ - crisis: string
146
+ - climax: string
147
+ - falling action: string
148
+ - denouement: string
149
+
150
+ JSON output:
151
+ {{
152
+ "chapter_title": "string",
153
+ "paragraphs": ["string", "string", ...],
154
+ "actions": ["string", "string", "string"]
155
+ }}
156
+
157
+ background information:
158
+ - genre: {genre}
159
+ - where: {place}
160
+ - mood: {mood}
161
+
162
+ main character
163
+ - name: {name1}
164
+ - job: {job1},
165
+ - age: {age1},
166
+ - mbti: {mbti1},
167
+ - personality: {personality1}
168
+
169
+ """
170
+
171
+ prompt, cur_side_chars = _add_side_character(
172
+ side_char_enable1, prompt, cur_side_chars,
173
+ name2, job2, age2, mbti2, personality2
174
+ )
175
+ prompt, cur_side_chars = _add_side_character(
176
+ side_char_enable2, prompt, cur_side_chars,
177
+ name3, job3, age3, mbti3, personality3
178
+ )
179
+ prompt, cur_side_chars = _add_side_character(
180
+ side_char_enable3, prompt, cur_side_chars,
181
+ name4, job4, age4, mbti4, personality4
182
+ )
183
+
184
+ prompt = prompt + f"""
185
+ overall outline
186
+ - title: {title}
187
+ - rising action: {rising_action}
188
+ - crisis: {crisis}
189
+ - climax: {climax}
190
+ - falling action: {falling_action}
191
+ - denouement: {denouement}
192
+
193
+ JSON output:
194
+ """
195
+
196
+ print(f"generated prompt:\n{prompt}")
197
+ parameters = {
198
+ 'model': 'models/text-bison-001',
199
+ 'candidate_count': 1,
200
+ 'temperature': 1,
201
+ 'top_k': 40,
202
+ 'top_p': 1,
203
+ 'max_output_tokens': 4096,
204
+ }
205
+ response_json = await utils.retry_until_valid_json(prompt, parameters=parameters)
206
+
207
+ chapter_title = response_json["chapter_title"]
208
+ pattern = r"Chapter\s+\d+\s*[:.]"
209
+ chapter_title = re.sub(pattern, "", chapter_title)
210
+
211
+ cursors.append({
212
+ "title": chapter_title,
213
+ "plot_type": "rising action",
214
+ "story": "\n\n".join(response_json["paragraphs"])
215
+ })
216
+
217
+ return (
218
+ f"### {chapter_title} (\"rising action\")",
219
+ "\n\n".join(response_json["paragraphs"]),
220
+ cursors,
221
+ cur_cursor,
222
+ gr.update(interactive=True),
223
+ gr.update(interactive=True),
224
+ gr.update(value=response_json["actions"][0], interactive=True),
225
+ gr.update(value=response_json["actions"][1], interactive=True),
226
+ gr.update(value=response_json["actions"][2], interactive=True),
227
+ )
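The prompt builders above pass character fields positionally, so the order at each call site has to match the helper's signature exactly. A minimal sketch of the same side-character block built from named fields follows; the `Character` dataclass and the `format_character` and `add_side_characters` helpers are hypothetical names introduced for illustration and are not part of this commit.

```python
from dataclasses import dataclass


@dataclass
class Character:
    # Hypothetical container; the commit passes these fields as loose arguments.
    name: str
    job: str
    age: str
    mbti: str
    personality: str


def format_character(header, char):
    """Render one character block in the layout the prompts above use."""
    return (
        f"{header}\n"
        f"- name: {char.name}\n"
        f"- job: {char.job}\n"
        f"- age: {char.age}\n"
        f"- mbti: {char.mbti}\n"
        f"- personality: {char.personality}\n"
    )


def add_side_characters(prompt, side_chars):
    """Append every enabled side character, numbering them from 1.

    side_chars is a list of (enabled, Character) pairs.
    """
    enabled = [char for flag, char in side_chars if flag]
    for idx, char in enumerate(enabled, start=1):
        prompt += "\n" + format_character(f"side character #{idx}", char)
    return prompt
```

Because every field is accessed by name, swapping two arguments at a call site can no longer silently put a job into the age slot.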
interfaces/story_gen_ui.py ADDED
@@ -0,0 +1,476 @@
1
+ import re
2
+ import copy
3
+ import random
4
+ import gradio as gr
5
+ from gradio_client import Client
6
+ from pathlib import Path
7
+
8
+ from modules import (
9
+ ImageMaker, MusicMaker, palmchat, merge_video
10
+ )
11
+ from interfaces import utils
12
+
13
+ from pingpong import PingPong
14
+ from pingpong.context import CtxLastWindowStrategy
15
+
16
+ # TODO: Replace the checkpoint filename with a Hugging Face URL
17
+ img_maker = ImageMaker('landscapeAnimePro_v20Inspiration.safetensors', vae="cute20vae.safetensors")
18
+ #img_maker = ImageMaker('fantasyworldFp16.safetensors', vae="cute20vae.safetensors")
19
+ #img_maker = ImageMaker('forgesagalandscapemi.safetensors', vae="anythingFp16.safetensors")
20
+ bgm_maker = MusicMaker(model_size='large', output_format='mp3')
21
+
22
+ video_gen_client_url = "https://0447df3cf5f7c49c46.gradio.live"
23
+
24
+ async def update_story_gen(
25
+ cursors, cur_cursor_idx,
26
+ genre, place, mood,
27
+ main_char_name, main_char_age, main_char_mbti, main_char_personality, main_char_job,
28
+ side_char_enable1, side_char_name1, side_char_age1, side_char_mbti1, side_char_personality1, side_char_job1,
29
+ side_char_enable2, side_char_name2, side_char_age2, side_char_mbti2, side_char_personality2, side_char_job2,
30
+ side_char_enable3, side_char_name3, side_char_age3, side_char_mbti3, side_char_personality3, side_char_job3,
31
+ ):
32
+ if len(cursors) == 1:
33
+ return await first_story_gen(
34
+ cursors,
35
+ genre, place, mood,
36
+ main_char_name, main_char_age, main_char_mbti, main_char_personality, main_char_job,
37
+ side_char_enable1, side_char_name1, side_char_age1, side_char_mbti1, side_char_personality1, side_char_job1,
38
+ side_char_enable2, side_char_name2, side_char_age2, side_char_mbti2, side_char_personality2, side_char_job2,
39
+ side_char_enable3, side_char_name3, side_char_age3, side_char_mbti3, side_char_personality3, side_char_job3,
40
+ cur_cursor_idx=cur_cursor_idx
41
+ )
42
+ else:
43
+ return await next_story_gen(
44
+ cursors,
45
+ None,
46
+ genre, place, mood,
47
+ main_char_name, main_char_age, main_char_mbti, main_char_personality, main_char_job,
48
+ side_char_enable1, side_char_name1, side_char_age1, side_char_mbti1, side_char_personality1, side_char_job1,
49
+ side_char_enable2, side_char_name2, side_char_age2, side_char_mbti2, side_char_personality2, side_char_job2,
50
+ side_char_enable3, side_char_name3, side_char_age3, side_char_mbti3, side_char_personality3, side_char_job3,
51
+ cur_cursor_idx=cur_cursor_idx
52
+ )
53
+
54
+ async def next_story_gen(
55
+ cursors,
56
+ action,
57
+ genre, place, mood,
58
+ main_char_name, main_char_age, main_char_mbti, main_char_personality, main_char_job,
59
+ side_char_enable1, side_char_name1, side_char_age1, side_char_mbti1, side_char_personality1, side_char_job1,
60
+ side_char_enable2, side_char_name2, side_char_age2, side_char_mbti2, side_char_personality2, side_char_job2,
61
+ side_char_enable3, side_char_name3, side_char_age3, side_char_mbti3, side_char_personality3, side_char_job3,
62
+ cur_cursor_idx=None
63
+ ):
64
+ stories = ""
65
+ cur_side_chars = 1
66
+
67
+ action = cursors[cur_cursor_idx]["action"] if cur_cursor_idx is not None else action
68
+ end_idx = len(cursors) if cur_cursor_idx is None else len(cursors)-1
69
+
70
+ for cursor in cursors[:end_idx]:
71
+ stories = stories + cursor["story"]
72
+
73
+ prompt = f"""Write the next paragraphs. The next paragraphs should be determined by an option and well connected to the current stories.
74
+
75
+ background information:
76
+ - genre: {genre}
77
+ - where: {place}
78
+ - mood: {mood}
79
+
80
+ main character
81
+ - name: {main_char_name}
82
+ - job: {main_char_job}
83
+ - age: {main_char_age}
84
+ - mbti: {main_char_mbti}
85
+ - personality: {main_char_personality}
86
+ """
87
+
88
+ prompt, cur_side_chars = utils.add_side_character(
89
+ side_char_enable1, prompt, cur_side_chars,
90
+ side_char_name1, side_char_job1, side_char_age1, side_char_mbti1, side_char_personality1
91
+ )
92
+ prompt, cur_side_chars = utils.add_side_character(
93
+ side_char_enable2, prompt, cur_side_chars,
94
+ side_char_name2, side_char_job2, side_char_age2, side_char_mbti2, side_char_personality2
95
+ )
96
+ prompt, cur_side_chars = utils.add_side_character(
97
+ side_char_enable3, prompt, cur_side_chars,
98
+ side_char_name3, side_char_job3, side_char_age3, side_char_mbti3, side_char_personality3
99
+ )
100
+
101
+ prompt = prompt + f"""
102
+ stories
103
+ {stories}
104
+
105
+ option to the next stories: {action}
106
+
107
+ Fill in the following JSON output format:
108
+ {{
109
+ "paragraphs": "string"
110
+ }}
111
+
112
+ """
113
+
114
+ print(f"generated prompt:\n{prompt}")
115
+ parameters = {
116
+ 'model': 'models/text-bison-001',
117
+ 'candidate_count': 1,
118
+ 'temperature': 1.0,
119
+ 'top_k': 40,
120
+ 'top_p': 1,
121
+ 'max_output_tokens': 4096,
122
+ }
123
+ response_json = await utils.retry_until_valid_json(prompt, parameters=parameters)
124
+
125
+ story = response_json["paragraphs"]
126
+ if isinstance(story, list):
127
+ story = "\n\n".join(story)
128
+
129
+ if cur_cursor_idx is None:
130
+ cursors.append({
131
+ "title": "",
132
+ "story": story,
133
+ "action": action
134
+ })
135
+ else:
136
+ cursors[cur_cursor_idx]["story"] = story
137
+ cursors[cur_cursor_idx]["action"] = action
138
+
139
+ return (
140
+ cursors, len(cursors)-1,
141
+ story,
142
+ gr.update(
143
+ maximum=len(cursors), value=len(cursors),
144
+ label=f"{len(cursors)} out of {len(cursors)} stories",
145
+ visible=True, interactive=True
146
+ ),
147
+ gr.update(interactive=True),
148
+ gr.update(interactive=True),
149
+ gr.update(value=None, visible=False, interactive=True),
150
+ gr.update(value=None, visible=False, interactive=True),
151
+ gr.update(value=None, visible=False, interactive=True),
152
+ )
153
+
154
+ async def actions_gen(
155
+ cursors,
156
+ genre, place, mood,
157
+ main_char_name, main_char_age, main_char_mbti, main_char_personality, main_char_job,
158
+ side_char_enable1, side_char_name1, side_char_age1, side_char_mbti1, side_char_personality1, side_char_job1,
159
+ side_char_enable2, side_char_name2, side_char_age2, side_char_mbti2, side_char_personality2, side_char_job2,
160
+ side_char_enable3, side_char_name3, side_char_age3, side_char_mbti3, side_char_personality3, side_char_job3,
161
+ cur_cursor_idx=None
162
+ ):
163
+ stories = ""
164
+ cur_side_chars = 1
165
+ end_idx = len(cursors) if cur_cursor_idx is None else len(cursors)-1
166
+
167
+ for cursor in cursors[:end_idx]:
168
+ stories = stories + cursor["story"]
169
+
170
+ summary_prompt = f"""Summarize the text below
171
+
172
+ {stories}
173
+
174
+ """
175
+ print(f"generated prompt:\n{summary_prompt}")
176
+ parameters = {
177
+ 'model': 'models/text-bison-001',
178
+ 'candidate_count': 1,
179
+ 'temperature': 1.0,
180
+ 'top_k': 40,
181
+ 'top_p': 1,
182
+ 'max_output_tokens': 4096,
183
+ }
184
+ _, summary = await palmchat.gen_text(summary_prompt, mode="text", parameters=parameters)
185
+
186
+ prompt = f"""Suggest the 30 options to drive the stories to the next based on the information below.
187
+
188
+ background information:
189
+ - genre: {genre}
190
+ - where: {place}
191
+ - mood: {mood}
192
+
193
+ main character
194
+ - name: {main_char_name}
195
+ - job: {main_char_job}
196
+ - age: {main_char_age}
197
+ - mbti: {main_char_mbti}
198
+ - personality: {main_char_personality}
199
+ """
200
+ prompt, cur_side_chars = utils.add_side_character(
201
+ side_char_enable1, prompt, cur_side_chars,
202
+ side_char_name1, side_char_job1, side_char_age1, side_char_mbti1, side_char_personality1
203
+ )
204
+ prompt, cur_side_chars = utils.add_side_character(
205
+ side_char_enable2, prompt, cur_side_chars,
206
+ side_char_name2, side_char_job2, side_char_age2, side_char_mbti2, side_char_personality2
207
+ )
208
+ prompt, cur_side_chars = utils.add_side_character(
209
+ side_char_enable3, prompt, cur_side_chars,
210
+ side_char_name3, side_char_job3, side_char_age3, side_char_mbti3, side_char_personality3
211
+ )
212
+
213
+ prompt = prompt + f"""
214
+ summary of the story
215
+ {summary}
216
+
217
+ Fill in the following JSON output format:
218
+ {{
219
+ "options": ["string", "string", "string", ...]
220
+ }}
221
+
222
+ """
223
+
224
+ print(f"generated prompt:\n{prompt}")
225
+ parameters = {
226
+ 'model': 'models/text-bison-001',
227
+ 'candidate_count': 1,
228
+ 'temperature': 1.0,
229
+ 'top_k': 40,
230
+ 'top_p': 1,
231
+ 'max_output_tokens': 4096,
232
+ }
233
+ response_json = await utils.retry_until_valid_json(prompt, parameters=parameters)
234
+ actions = response_json["options"]
235
+
236
+ random_actions = random.sample(actions, 3)
237
+
238
+ return (
239
+ gr.update(value=random_actions[0], interactive=True),
240
+ gr.update(value=random_actions[1], interactive=True),
241
+ gr.update(value=random_actions[2], interactive=True),
242
+ " "
243
+ )
244
+
245
+ async def first_story_gen(
246
+ cursors,
247
+ genre, place, mood,
248
+ main_char_name, main_char_age, main_char_mbti, main_char_personality, main_char_job,
249
+ side_char_enable1, side_char_name1, side_char_age1, side_char_mbti1, side_char_personality1, side_char_job1,
250
+ side_char_enable2, side_char_name2, side_char_age2, side_char_mbti2, side_char_personality2, side_char_job2,
251
+ side_char_enable3, side_char_name3, side_char_age3, side_char_mbti3, side_char_personality3, side_char_job3,
252
+ cur_cursor_idx=None
253
+ ):
254
+ cur_side_chars = 1
255
+
256
+ prompt = f"""Write the first three paragraphs of a novel as much detailed as possible. They should be based on the background information. Blend 5W1H principle into the stories as a plain text. Don't let the paragraphs end the whole story.
257
+
258
+ background information:
259
+ - genre: {genre}
260
+ - where: {place}
261
+ - mood: {mood}
262
+
263
+ main character
264
+ - name: {main_char_name}
265
+ - job: {main_char_job}
266
+ - age: {main_char_age}
267
+ - mbti: {main_char_mbti}
268
+ - personality: {main_char_personality}
269
+ """
270
+
271
+ prompt, cur_side_chars = utils.add_side_character(
272
+ side_char_enable1, prompt, cur_side_chars,
273
+ side_char_name1, side_char_job1, side_char_age1, side_char_mbti1, side_char_personality1
274
+ )
275
+ prompt, cur_side_chars = utils.add_side_character(
276
+ side_char_enable2, prompt, cur_side_chars,
277
+ side_char_name2, side_char_job2, side_char_age2, side_char_mbti2, side_char_personality2
278
+ )
279
+ prompt, cur_side_chars = utils.add_side_character(
280
+ side_char_enable3, prompt, cur_side_chars,
281
+ side_char_name3, side_char_job3, side_char_age3, side_char_mbti3, side_char_personality3
282
+ )
283
+
284
+ prompt = prompt + f"""
285
+ Fill in the following JSON output format:
286
+ {{
287
+ "paragraphs": "string"
288
+ }}
289
+
290
+ """
291
+
292
+ print(f"generated prompt:\n{prompt}")
293
+ parameters = {
294
+ 'model': 'models/text-bison-001',
295
+ 'candidate_count': 1,
296
+ 'temperature': 1.0,
297
+ 'top_k': 40,
298
+ 'top_p': 1,
299
+ 'max_output_tokens': 4096,
300
+ }
301
+ response_json = await utils.retry_until_valid_json(prompt, parameters=parameters)
302
+
303
+ story = response_json["paragraphs"]
304
+ if isinstance(story, list):
305
+ story = "\n\n".join(story)
306
+
307
+ if cur_cursor_idx is None:
308
+ cursors.append({
309
+ "title": "",
310
+ "story": story
311
+ })
312
+ else:
313
+ cursors[cur_cursor_idx]["story"] = story
314
+
315
+ return (
316
+ cursors, len(cursors)-1,
317
+ story,
318
+ gr.update(
319
+ maximum=len(cursors), value=len(cursors),
320
+ label=f"{len(cursors)} out of {len(cursors)} stories",
321
+ visible=len(cursors) != 1, interactive=True
322
+ ),
323
+ gr.update(interactive=True),
324
+ gr.update(interactive=True),
325
+ gr.update(value=None, visible=False, interactive=True),
326
+ gr.update(value=None, visible=False, interactive=True),
327
+ gr.update(value=None, visible=False, interactive=True),
328
+ )
329
+
330
+ def video_gen(
331
+ image, audio, title, cursors, cur_cursor, use_ffmpeg=True
332
+ ):
333
+ if use_ffmpeg:
334
+ output_filename = merge_video(image, audio, story_title="")
335
+
336
+ if not use_ffmpeg or not output_filename:
337
+ client = Client(video_gen_client_url)
338
+ result = client.predict(
339
+ "",
340
+ audio,
341
+ image,
342
+ f"{utils.id_generator()}.mp4",
343
+ api_name="/predict"
344
+ )
345
+ output_filename = result[0]
346
+
347
+ cursors[cur_cursor]["video"] = output_filename
348
+
349
+ return (
350
+ gr.update(visible=False),
351
+ gr.update(visible=False),
352
+ gr.update(visible=True, value=output_filename),
353
+ cursors,
354
+ " "
355
+ )
356
+
357
+
358
+ def image_gen(
359
+ genre, place, mood, title, story_content, cursors, cur_cursor, story_audio
360
+ ):
361
+ # generate prompts for background image with PaLM
+ prompt, neg_prompt = None, None  # stay None if all three attempts fail
362
+ for _ in range(3):
363
+ try:
364
+ prompt, neg_prompt = img_maker.generate_background_prompts(genre, place, mood, title, "", story_content)
366
+ print(f"Image Prompt: {prompt}")
367
+ print(f"Negative Prompt: {neg_prompt}")
368
+ break
369
+ except Exception as e:
370
+ print(e)
371
+
372
+ if not prompt:
373
+ raise ValueError("Failed to generate prompts for background image.")
374
+
375
+ # generate image
376
+ try:
377
+ img_filename = img_maker.text2image(prompt, neg_prompt=neg_prompt, ratio='16:9', cfg=6.5)
378
+ except ValueError as e:
379
+ print(e)
380
+ img_filename = str(Path('.') / 'assets' / 'nsfw_warning_wide.png')
381
+
382
+ cursors[cur_cursor]["img"] = img_filename
383
+
384
+ video_gen_btn_state = gr.update(interactive=False)
385
+ if story_audio is not None:
386
+ video_gen_btn_state = gr.update(interactive=True)
387
+
388
+ return (
389
+ gr.update(visible=True, value=img_filename),
390
+ video_gen_btn_state,
391
+ cursors,
392
+ " "
393
+ )
394
+
395
+
396
+ def audio_gen(
397
+ genre, place, mood, title, story_content, cursors, cur_cursor, story_image
398
+ ):
399
+ # generate prompt for background music with PaLM
+ prompt = None  # stays None if all three attempts fail
400
+ for _ in range(3):
401
+ try:
402
+ prompt = bgm_maker.generate_prompt(genre, place, mood, title, "", story_content)
403
+ print(f"Music Prompt: {prompt}")
404
+ break
405
+ except Exception as e:
406
+ print(e)
407
+
408
+ if not prompt:
409
+ raise ValueError("Failed to generate prompt for background music.")
410
+
411
+ # generate music
412
+ bgm_filename = bgm_maker.text2music(prompt, length=60)
413
+ cursors[cur_cursor]["audio"] = bgm_filename
414
+
415
+ video_gen_btn_state = gr.update(interactive=False)
416
+ if story_image is not None:
417
+ video_gen_btn_state = gr.update(interactive=True)
418
+
419
+ return (
420
+ gr.update(visible=True, value=bgm_filename),
421
+ video_gen_btn_state,
422
+ cursors,
423
+ " "
424
+ )
425
+
426
+ def move_story_cursor(moved_cursor, cursors):
427
+ cursor_content = cursors[moved_cursor-1]
428
+ max_cursor = len(cursors)
429
+
430
+ action_btn = (
431
+ gr.update(interactive=False),
432
+ gr.update(interactive=False),
433
+ gr.update(interactive=False)
434
+ )
435
+
436
+ if moved_cursor == max_cursor:
437
+ action_btn = (
438
+ gr.update(interactive=True),
439
+ gr.update(interactive=True),
440
+ gr.update(interactive=True)
441
+ )
442
+
443
+ if "video" in cursor_content:
444
+ outputs = (
445
+ moved_cursor-1,
446
+ gr.update(label=f"{moved_cursor} out of {len(cursors)} stories"),
447
+ cursor_content["story"],
448
+ gr.update(value=None, visible=False),
449
+ gr.update(value=None, visible=False),
450
+ gr.update(value=cursor_content["video"], visible=True),
451
+ )
452
+
453
+ else:
454
+ image_container = gr.update(value=None, visible=False)
455
+ audio_container = gr.update(value=None, visible=False)
456
+
457
+ if "img" in cursor_content:
458
+ image_container = gr.update(value=cursor_content["img"], visible=True)
459
+
460
+ if "audio" in cursor_content:
461
+ audio_container = gr.update(value=cursor_content["audio"], visible=True)
462
+
463
+ outputs = (
464
+ moved_cursor-1,
465
+ gr.update(label=f"{moved_cursor} out of {len(cursors)} stories"),
466
+ cursor_content["story"],
467
+ image_container,
468
+ audio_container,
469
+ gr.update(value=None, visible=False),
470
+ )
471
+
472
+ return outputs + action_btn
473
+
474
+ def update_story_content(story_content, cursors, cur_cursor):
475
+ cursors[cur_cursor]["story"] = story_content
476
+ return cursors
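`actions_gen` above asks the model for 30 options and then draws three with `random.sample`, which raises `ValueError` if the model returns fewer than three. A minimal sketch of a more forgiving draw follows; `pick_actions` and its `rng` parameter are hypothetical names, not part of this commit.

```python
import random


def pick_actions(options, k=3, rng=random):
    """Return up to k distinct, non-empty options in random order."""
    # Drop blanks and duplicates while keeping first-seen order.
    cleaned = (opt.strip() for opt in options if opt and opt.strip())
    unique = list(dict.fromkeys(cleaned))
    # random.sample raises ValueError when k exceeds the population size,
    # so cap k at the number of usable options.
    return rng.sample(unique, min(k, len(unique)))
```

The caller would still need to decide what to show when fewer than three options come back, for example by hiding the unused buttons.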
interfaces/ui.py ADDED
@@ -0,0 +1,93 @@
1
+ import copy
2
+ import random
3
+ import gradio as gr
4
+
5
+ import numpy
6
+ import PIL
7
+ from pathlib import Path
8
+
9
+ from constants.init_values import (
10
+ places, moods, jobs, random_names, default_character_images
11
+ )
12
+
13
+ from modules import (
14
+ ImageMaker, palmchat
15
+ )
16
+
17
+ from interfaces import utils
18
+
19
+ # TODO: Replace the checkpoint filename with a Hugging Face URL
20
+ #img_maker = ImageMaker('hellonijicute25d_V10b.safetensors', vae="kl-f8-anime2.vae.safetensors")
21
+ img_maker = ImageMaker('hellonijicute25d_V10b.safetensors') # without_VAE
22
+
23
+ ############
24
+ # for plotting
25
+
26
+ def get_random_name(cur_char_name, char_name1, char_name2, char_name3):
27
+ # Filter instead of list.remove(), which raises ValueError for custom or repeated names
28
+ used_names = {cur_char_name, char_name1, char_name2, char_name3}
29
+ tmp_random_names = [
30
+ name for name in random_names if name not in used_names
31
+ ]
32
+ return random.choice(tmp_random_names)
33
+
34
+
35
+ def gen_character_image(
36
+ gallery_images,
37
+ name, age, mbti, personality, job,
38
+ genre, place, mood, creative_mode
39
+ ):
40
+ # generate prompts for character image with PaLM
+ prompt, neg_prompt = None, None  # stay None if all three attempts fail
41
+ for _ in range(3):
42
+ try:
43
+ prompt, neg_prompt = img_maker.generate_character_prompts(name, age, job, keywords=[mbti, personality, genre, place, mood], creative_mode=creative_mode)
44
+ print(f"Image Prompt: {prompt}")
45
+ print(f"Negative Prompt: {neg_prompt}")
46
+ break
47
+ except Exception as e:
48
+ print(e)
49
+
50
+ if not prompt:
51
+ raise ValueError("Failed to generate prompts for character image.")
52
+
53
+ # generate image
54
+ try:
55
+ img_filename = img_maker.text2image(prompt, neg_prompt=neg_prompt, ratio='3:4', cfg=4.5)
56
+ except ValueError as e:
57
+ print(e)
58
+ img_filename = str(Path('.') / 'assets' / 'nsfw_warning.png')
59
+
60
+ # update gallery
61
+ gen_image = numpy.asarray(PIL.Image.open(img_filename))
62
+ gallery_images.insert(0, gen_image)
63
+
64
+ return gr.update(value=gallery_images), gallery_images
65
+
66
+
67
+ def update_on_age(evt: gr.SelectData):
68
+ job_list = jobs[evt.value]
69
+
70
+ return (
71
+ gr.update(value=places[evt.value][0], choices=places[evt.value]),
72
+ gr.update(value=moods[evt.value][0], choices=moods[evt.value]),
73
+ gr.update(value=job_list[0], choices=job_list),
74
+ gr.update(value=job_list[0], choices=job_list),
75
+ gr.update(value=job_list[0], choices=job_list),
76
+ gr.update(value=job_list[0], choices=job_list)
77
+ )
78
+
79
+ ############
80
+ # for tabbing
81
+
82
+ def update_on_main_tabs(chat_state, evt: gr.SelectData):
83
+ chat_mode = "plot_chat"
84
+
85
+ if evt.value.lower() == "background setup":
86
+ chat_mode = "plot_chat"
87
+ elif evt.value.lower() == "story generation":
88
+ chat_mode = "story_chat"
89
+ else: # export
90
+ chat_mode = "export_chat"
91
+
92
+ ppm = chat_state[chat_mode]
93
+ return chat_mode, ppm.build_uis()
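The image and music handlers in this commit repeat the same "try up to three times, then give up" loop around prompt generation. A minimal sketch of that pattern as one reusable helper follows; `call_with_retries` and its parameters are hypothetical and do not exist in the repository.

```python
def call_with_retries(fn, attempts=3, on_error=print):
    """Call fn() up to `attempts` times and return its first successful result."""
    last_error = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as exc:  # broad on purpose, mirroring the loops above
            last_error = exc
            on_error(exc)
    # Every attempt failed: raise explicitly and keep the last error as the cause.
    raise ValueError(f"Failed after {attempts} attempts.") from last_error
```

Returning from inside the loop means no result variable is ever read before it has been assigned, and the final `raise` keeps the original exception attached as `__cause__`.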
interfaces/utils.py ADDED
@@ -0,0 +1,80 @@
1
+ import copy
2
+ import json
3
+ import string
4
+ import random
5
+
6
+ from modules import palmchat
7
+ from pingpong.context import CtxLastWindowStrategy
8
+
9
+ def add_side_character(
10
+ enable, prompt, cur_side_chars,
11
+ name, job, age, mbti, personality
12
+ ):
13
+ if enable:
14
+ prompt = prompt + f"""
15
+ side character #{cur_side_chars}
16
+ - name: {name},
17
+ - job: {job},
18
+ - age: {age},
19
+ - mbti: {mbti},
20
+ - personality: {personality}
21
+
22
+ """
23
+ cur_side_chars = cur_side_chars + 1
24
+
25
+ return prompt, cur_side_chars
26
+
27
+ def id_generator(size=6, chars=string.ascii_uppercase + string.digits):
28
+ return ''.join(random.choice(chars) for _ in range(size))
29
+
30
+ def parse_first_json_code_snippet(code_snippet):
31
+ json_parsed_string = None
32
+
33
+ try:
34
+ json_parsed_string = json.loads(code_snippet, strict=False)
35
+ except:
36
+ json_start_index = code_snippet.find('```json')
37
+ json_end_index = code_snippet.find('```', json_start_index + 6)
38
+
39
+ if json_start_index < 0 or json_end_index < 0:
40
+ raise ValueError('No JSON code snippet found in string.')
41
+
42
+ json_code_snippet = code_snippet[json_start_index + 7:json_end_index]
43
+ json_parsed_string = json.loads(json_code_snippet, strict=False)
44
+ finally:
45
+ return json_parsed_string
46
+
47
+ async def retry_until_valid_json(prompt, parameters=None):
48
+ response_json = None
49
+ while response_json is None:
50
+ _, response_txt = await palmchat.gen_text(prompt, mode="text", parameters=parameters)
51
+ print(response_txt)
52
+
53
+ try:
54
+ response_json = parse_first_json_code_snippet(response_txt)
55
+ except:
56
+ pass
57
+
58
+ return response_json
59
+
60
+ def build_prompts(ppm, win_size=3):
61
+ dummy_ppm = copy.deepcopy(ppm)
62
+ lws = CtxLastWindowStrategy(win_size)
63
+ return lws(dummy_ppm)
64
+
65
+ async def get_chat_response(prompt, ctx=None):
66
+ parameters = {
67
+ 'model': 'models/chat-bison-001',
68
+ 'candidate_count': 1,
69
+ 'context': "" if ctx is None else ctx,
70
+ 'temperature': 1.0,
71
+ 'top_k': 50,
72
+ 'top_p': 0.9,
73
+ }
74
+
75
+ _, response_txt = await palmchat.gen_text(
76
+ prompt,
77
+ parameters=parameters
78
+ )
79
+
80
+ return response_txt
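`parse_first_json_code_snippet` above first tries the whole reply as JSON and then falls back to the first fenced `json` block. A minimal sketch of the same two-step idea using a regular expression follows. `extract_json` is a hypothetical name, and unlike the original it also accepts a fence without the `json` tag and lets errors propagate instead of returning `None`.

```python
import json
import re

# Build the fence marker programmatically to keep this example easy to embed.
_FENCE = "`" * 3
_FENCE_RE = re.compile(_FENCE + r"(?:json)?\s*(.*?)" + _FENCE, re.DOTALL)


def extract_json(text):
    """Parse text as JSON, falling back to the first fenced code block."""
    try:
        return json.loads(text, strict=False)
    except json.JSONDecodeError:
        pass  # not bare JSON; look for a fenced snippet instead

    match = _FENCE_RE.search(text)
    if match is None:
        raise ValueError("No JSON code snippet found in string.")
    # May still raise json.JSONDecodeError (a ValueError subclass) if the
    # fenced content is malformed.
    return json.loads(match.group(1), strict=False)
```

A caller that retries on failure can then catch `ValueError` alone, since `json.JSONDecodeError` is a subclass of it.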
interfaces/view_change_ui.py ADDED
@@ -0,0 +1,13 @@
1
+ import gradio as gr
2
+
3
+ def move_to_next_view():
4
+ return (
5
+ gr.update(visible=False),
6
+ gr.update(visible=True),
7
+ )
8
+
9
+ def back_to_previous_view():
10
+ return (
11
+ gr.update(visible=True),
12
+ gr.update(visible=False),
13
+ )
modules/__init__.py ADDED
@@ -0,0 +1,10 @@
1
+ from .image_maker import ImageMaker
2
+ from .music_maker import MusicMaker
3
+ from .palmchat import (
4
+ PaLMChatPromptFmt,
5
+ PaLMChatPPManager,
6
+ GradioPaLMChatPPManager,
7
+ )
8
+ from .utils import (
9
+ merge_video,
10
+ )
modules/image_maker.py ADDED
@@ -0,0 +1,356 @@
1
+ from typing import Literal
2
+ from pathlib import Path
3
+
4
+ import uuid
5
+ import json
6
+ import re
7
+ import asyncio
8
+ import toml
9
+
10
+ import torch
11
+ from compel import Compel
12
+
13
+ from diffusers import (
14
+ DiffusionPipeline,
15
+ StableDiffusionPipeline,
16
+ AutoencoderKL,
17
+ DPMSolverMultistepScheduler,
18
+ DDPMScheduler,
19
+ DPMSolverSinglestepScheduler,
20
+ DPMSolverSDEScheduler,
21
+ DEISMultistepScheduler,
22
+ )
23
+
24
+ from .utils import (
25
+ set_all_seeds,
26
+ )
27
+ from .palmchat import (
28
+ palm_prompts,
29
+ gen_text,
30
+ )
31
+
32
+ _gpus = 0
33
+
34
+ class ImageMaker:
35
+ # TODO: DocString...
36
+ """Class for generating images from prompts."""
37
+
38
+ __ratio = {'3:2': [768, 512],
39
+ '4:3': [680, 512],
40
+ '16:9': [912, 512],
41
+ '1:1': [512, 512],
42
+ '9:16': [512, 912],
43
+ '3:4': [512, 680],
44
+ '2:3': [512, 768]}
45
+ __allocated = False
46
+
47
+ def __init__(self, model_base: str,
48
+ clip_skip: int = 2,
49
+ sampling: Literal['sde-dpmsolver++'] = 'sde-dpmsolver++',
50
+ vae: str = None,
51
+ safety: bool = True,
52
+ neg_prompt: str = None,
53
+ device: str = None) -> None:
54
+ """Initialize the ImageMaker class.
55
+
56
+ Args:
57
+ model_base (str): Filename of the model base.
58
+ clip_skip (int, optional): Number of layers to skip in the clip model. Defaults to 2.
59
+ sampling (Literal['sde-dpmsolver++'], optional): Sampling method. Defaults to 'sde-dpmsolver++'.
60
+ vae (str, optional): Filename of the VAE model. Defaults to None.
61
+ safety (bool, optional): Whether to use the safety checker. Defaults to True.
+ neg_prompt (str, optional): Negative prompt stored as `self.neg_prompt`. Defaults to None.
62
+ device (str, optional): Device to use for the model. Defaults to None.
63
+ """
64
+
65
+ self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') if not device else device
66
+ self.__model_base = model_base
67
+ self.__clip_skip = clip_skip
68
+ self.__sampling = sampling
69
+ self.__vae = vae
70
+ self.__safety = safety
71
+ self.neg_prompt = neg_prompt
72
+
73
+ print("Loading the Stable Diffusion model into memory...")
74
+ self.__sd_model = StableDiffusionPipeline.from_single_file(self.model_base,
75
+ #torch_dtype=torch.float16,
76
+ use_safetensors=True)
77
+
78
+ # Clip Skip
79
+ self.__sd_model.text_encoder.text_model.encoder.layers = self.__sd_model.text_encoder.text_model.encoder.layers[:12 - (self.clip_skip - 1)]
80
+
81
+ # Sampling method
82
+ if True: # TODO: Sampling method :: self.sampling == 'sde-dpmsolver++'
83
+ scheduler = DPMSolverMultistepScheduler.from_config(self.__sd_model.scheduler.config)
84
+ scheduler.config.algorithm_type = 'sde-dpmsolver++'
85
+ self.__sd_model.scheduler = scheduler
86
+
87
+ # TODO: Use LoRA
88
+
89
+ # VAE
90
+ if self.vae:
91
+ vae_model = AutoencoderKL.from_single_file(self.vae)
92
+ self.__sd_model.vae = vae_model
93
+
94
+ if not self.safety:
95
+ self.__sd_model.safety_checker = None
96
+ self.__sd_model.requires_safety_checker = False
97
+
98
+ print(f"Loaded model to {self.device}")
99
+ self.__sd_model = self.__sd_model.to(self.device)
100
+
101
+ # Text Encoder using Compel
102
+ self.__compel_proc = Compel(tokenizer=self.__sd_model.tokenizer, text_encoder=self.__sd_model.text_encoder, truncate_long_prompts=False)
103
+
104
+ output_dir = Path('.') / 'outputs'
105
+ if not output_dir.exists():
106
+ output_dir.mkdir(parents=True, exist_ok=True)
107
+ elif output_dir.is_file():
108
+ raise FileExistsError(f"A file with the same name as the desired directory ('{str(output_dir)}') already exists.")
109
+
110
+
111
+ def text2image(self,
112
+ prompt: str, neg_prompt: str = None,
113
+ ratio: Literal['3:2', '4:3', '16:9', '1:1', '9:16', '3:4', '2:3'] = '1:1',
114
+ step: int = 28,
115
+ cfg: float = 4.5,
116
+ seed: int = None) -> str:
117
+ """Generate an image from the prompt.
118
+
119
+ Args:
120
+ prompt (str): Prompt for the image generation.
121
+ neg_prompt (str, optional): Negative prompt for the image generation. Defaults to None.
122
+ ratio (Literal['3:2', '4:3', '16:9', '1:1', '9:16', '3:4', '2:3'], optional): Ratio of the generated image. Defaults to '1:1'.
123
+ step (int, optional): Number of inference steps for the diffusion. Defaults to 28.
124
+ cfg (float, optional): Classifier-free guidance scale for the diffusion. Defaults to 4.5.
125
+ seed (int, optional): Seed for the random number generator. Defaults to None.
126
+
127
+ Returns:
128
+ str: Path to the generated image.
129
+ """
130
+
131
+ output_filename = Path('.') / 'outputs' / str(uuid.uuid4())
132
+
133
+ if not seed or seed == -1:
134
+ seed = torch.randint(0, 2**32 - 1, (1,)).item()
135
+ set_all_seeds(seed)
136
+
137
+ width, height = self.__ratio[ratio]
138
+
139
+ prompt_embeds, negative_prompt_embeds = self.__get_pipeline_embeds(prompt, neg_prompt or self.neg_prompt)
140
+
141
+ # Generate the image
142
+ result = self.__sd_model(prompt_embeds=prompt_embeds,
143
+ negative_prompt_embeds=negative_prompt_embeds,
144
+ guidance_scale=cfg,
145
+ num_inference_steps=step,
146
+ width=width,
147
+ height=height,
148
+ )
149
+ if self.__safety and result.nsfw_content_detected[0]:
150
+ print("=== NSFW Content Detected ===")
151
+ raise ValueError("Potential NSFW content was detected in one or more images.")
152
+
153
+ img = result.images[0]
154
+ img.save(str(output_filename.with_suffix('.png')))
155
+
156
+ return str(output_filename.with_suffix('.png'))
157
+
158
+
159
+ def generate_character_prompts(self, character_name: str, age: str, job: str,
160
+ keywords: list[str] = None,
161
+ creative_mode: Literal['sd character', 'cartoon', 'realistic'] = 'cartoon') -> tuple[str, str]:
162
+ """Generate positive and negative prompts for a character based on given attributes.
163
+
164
+ Args:
165
+ character_name (str): Character's name.
166
+ age (str): Age of the character.
167
+ job (str): The profession or job of the character.
168
+ keywords (list[str]): List of descriptive words for the character.
169
+
170
+ Returns:
171
+ tuple[str, str]: A tuple of positive and negative prompts.
172
+ """
173
+
174
+ positive = "" # add static prompt for character if needed (e.g. "chibi, cute, anime")
175
+ negative = palm_prompts['image_gen']['neg_prompt']
176
+
177
+ # Generate prompts with PaLM
178
+ t = palm_prompts['image_gen']['character']['gen_prompt']
179
+ q = palm_prompts['image_gen']['character']['query']
180
+ query_string = t.format(input=q.format(character_name=character_name,
181
+ job=job,
182
+ age=age,
183
+ keywords=', '.join(keywords) if keywords else 'Nothing'))
184
+ try:
185
+ response, response_txt = asyncio.run(asyncio.wait_for(
186
+ gen_text(query_string, mode="text", use_filter=False),
187
+ timeout=10)
188
+ )
189
+ except asyncio.TimeoutError:
190
+ raise TimeoutError("The response time for PaLM API exceeded the limit.")
191
+
192
+ try:
193
+ res_json = json.loads(response_txt)
194
+ positive = (res_json['primary_sentence'] if not positive else f"{positive}, {res_json['primary_sentence']}") + ", "
195
+ gender_keywords = ['1man', '1woman', '1boy', '1girl', '1male', '1female', '1gentleman', '1lady']
196
+ positive += ', '.join([w if w not in gender_keywords else w + '+++' for w in res_json['descriptors']])
197
+ positive = f'{job.lower()}+'.join(positive.split(job.lower()))
198
+ except Exception:
199
+ print("=== PaLM Response ===")
200
+ print(response.filters)
201
+ print(response_txt)
202
+ print("=== PaLM Response ===")
203
+ raise ValueError("The response from PaLM API is not in the expected format.")
204
+
205
+ return (positive.lower(), negative.lower())
206
+
207
+
208
+ def generate_background_prompts(self, genre:str, place:str, mood:str,
209
+ title:str, chapter_title:str, chapter_plot:str) -> tuple[str, str]:
210
+ """Generate positive and negative prompts for a background image based on given attributes.
211
+
212
+ Args:
213
+ genre (str): Genre of the story.
214
+ place (str): Place of the story.
215
+ mood (str): Mood of the story.
216
+ title (str): Title of the story.
217
+ chapter_title (str): Title of the chapter.
218
+ chapter_plot (str): Plot of the chapter.
219
+
220
+ Returns:
221
+ tuple[str, str]: A tuple of positive and negative prompts.
222
+ """
223
+
224
+ positive = "painting+++, anime+, cartoon, watercolor, wallpaper, text---" # add static prompt for background if needed (e.g. "chibi, cute, anime")
225
+ negative = "realistic, human, character, people, photograph, 3d render, blurry, grayscale, oversaturated, " + palm_prompts['image_gen']['neg_prompt']
226
+
227
+ # Generate prompts with PaLM
228
+ t = palm_prompts['image_gen']['background']['gen_prompt']
229
+ q = palm_prompts['image_gen']['background']['query']
230
+ query_string = t.format(input=q.format(genre=genre,
231
+ place=place,
232
+ mood=mood,
233
+ title=title,
234
+ chapter_title=chapter_title,
235
+ chapter_plot=chapter_plot))
236
+ try:
237
+ response, response_txt = asyncio.run(asyncio.wait_for(
238
+ gen_text(query_string, mode="text", use_filter=False),
239
+ timeout=10)
240
+ )
241
+ except asyncio.TimeoutError:
242
+ raise TimeoutError("The response time for PaLM API exceeded the limit.")
243
+
244
+ try:
245
+ res_json = json.loads(response_txt)
246
+ positive = (res_json['main_sentence'] if not positive else f"{positive}, {res_json['main_sentence']}") + ", "
247
+ positive += ', '.join(res_json['descriptors'])
248
+ except Exception:
249
+ print("=== PaLM Response ===")
250
+ print(response.filters)
251
+ print(response_txt)
252
+ print("=== PaLM Response ===")
253
+ raise ValueError("The response from PaLM API is not in the expected format.")
254
+
255
+ return (positive.lower(), negative.lower())
256
+
257
+
258
+ def __get_pipeline_embeds(self, prompt:str, negative_prompt:str) -> tuple[torch.Tensor, torch.Tensor]:
259
+ """
260
+ Get pipeline embeddings for prompts longer than the pipeline's maximum token length.
261
+
262
+ Args:
263
+ prompt (str): Prompt for the image generation.
264
+ negative_prompt (str): Negative prompt for the image generation.
265
+
266
+ Returns:
267
+ tuple[torch.Tensor, torch.Tensor]: A tuple of positive and negative prompt embeds.
268
+ """
269
+ conditioning = self.__compel_proc.build_conditioning_tensor(prompt)
270
+ negative_conditioning = self.__compel_proc.build_conditioning_tensor(negative_prompt)
271
+ return self.__compel_proc.pad_conditioning_tensors_to_same_length([conditioning, negative_conditioning])
272
+
273
+
274
+ @property
275
+ def model_base(self):
276
+ """Model base
277
+
278
+ Returns:
279
+ str: The model base (read-only)
280
+ """
281
+ return self.__model_base
282
+
283
+ @property
284
+ def clip_skip(self):
285
+ """Clip Skip
286
+
287
+ Returns:
288
+ int: The number of layers to skip in the clip model (read-only)
289
+ """
290
+ return self.__clip_skip
291
+
292
+ @property
293
+ def sampling(self):
294
+ """Sampling method
295
+
296
+ Returns:
297
+ Literal['sde-dpmsolver++']: The sampling method (read-only)
298
+ """
299
+ return self.__sampling
300
+
301
+ @property
302
+ def vae(self):
303
+ """VAE
304
+
305
+ Returns:
306
+ str: The VAE (read-only)
307
+ """
308
+ return self.__vae
309
+
310
+ @property
311
+ def safety(self):
312
+ """Safety checker
313
+
314
+ Returns:
315
+ bool: Whether to use the safety checker (read-only)
316
+ """
317
+ return self.__safety
318
+
319
+ @property
320
+ def device(self):
321
+ """Device
322
+
323
+ Returns:
324
+ str: The device (read-only)
325
+ """
326
+ return self.__device
327
+
328
+ @device.setter
329
+ def device(self, value):
330
+ if self.__allocated:
331
+ raise RuntimeError("Cannot change device after the model is loaded.")
332
+
333
+ if value == 'cpu':
334
+ self.__device = value
335
+ else:
336
+ global _gpus
337
+ self.__device = f'{value}:{_gpus}'
338
+ max_gpu = torch.cuda.device_count()
339
+ _gpus = (_gpus + 1) if (_gpus + 1) < max_gpu else 0
340
+ self.__allocated = True
341
+
342
+ @property
343
+ def neg_prompt(self):
344
+ """Negative prompt
345
+
346
+ Returns:
347
+ str: The negative prompt
348
+ """
349
+ return self.__neg_prompt
350
+
351
+ @neg_prompt.setter
352
+ def neg_prompt(self, value):
353
+ if not value:
354
+ self.__neg_prompt = ""
355
+ else:
356
+ self.__neg_prompt = value
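Note on generate_character_prompts: the prompt-assembly step can be read in isolation from the PaLM call. The following is a minimal sketch, assuming a response shaped like the one the method parses (keys primary_sentence and descriptors). The function name build_character_prompt and the sample JSON are hypothetical and not part of this commit. The trailing "+" marks are the up-weighting syntax that Compel interprets.

```python
import json

# Same keyword list as in generate_character_prompts.
GENDER_KEYWORDS = ['1man', '1woman', '1boy', '1girl',
                   '1male', '1female', '1gentleman', '1lady']


def build_character_prompt(response_txt: str, job: str) -> str:
    """Hypothetical helper mirroring the prompt assembly in the diff above."""
    res_json = json.loads(response_txt)
    positive = res_json['primary_sentence'] + ", "
    # Gender descriptors get a strong up-weight ('+++').
    positive += ', '.join(w + '+++' if w in GENDER_KEYWORDS else w
                          for w in res_json['descriptors'])
    # Every occurrence of the lower-cased job name gets a mild up-weight ('+').
    positive = f'{job.lower()}+'.join(positive.split(job.lower()))
    return positive.lower()


# Made-up response text, for illustration only.
sample = json.dumps({
    "primary_sentence": "a young wizard reading a book",
    "descriptors": ["1boy", "robe", "wizard hat"],
})
result = build_character_prompt(sample, "Wizard")
print(result)
```

As in the original method, the job name is matched before the final lower-casing, so a capitalized occurrence in the PaLM response would not be up-weighted.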
modules/music_maker.py ADDED
@@ -0,0 +1,165 @@
1
+ from typing import Literal
2
+ from tempfile import NamedTemporaryFile
3
+ from pathlib import Path
4
+
5
+ import uuid
6
+ import shutil
7
+ import json
8
+ import asyncio
9
+ import toml
10
+
11
+ import torch
12
+
13
+ from audiocraft.models import MusicGen
14
+ from audiocraft.data.audio import audio_write
15
+ from pydub import AudioSegment
16
+
17
+ from .utils import (
18
+ set_all_seeds,
19
+ )
20
+ from .palmchat import (
21
+ palm_prompts,
22
+ gen_text,
23
+ )
24
+
25
+ class MusicMaker:
26
+ # TODO: DocString...
27
+ """Class for generating music from prompts."""
28
+
29
+ def __init__(self, model_size: Literal['small', 'medium', 'melody', 'large'] = 'large',
30
+ output_format: Literal['wav', 'mp3'] = 'mp3',
31
+ device: str = None) -> None:
32
+ """Initialize the MusicMaker class.
33
+
34
+ Args:
35
+ model_size (Literal['small', 'medium', 'melody', 'large'], optional): Model size. Defaults to 'large'.
36
+ output_format (Literal['wav', 'mp3'], optional): Output format. Defaults to 'mp3'.
37
+ device (str, optional): Device to use for the model. Defaults to None.
38
+ """
39
+
40
+ self.__model_size = model_size
41
+ self.__output_format = output_format
42
+ self.__device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') if not device else device
43
+
44
+ print("Loading the MusicGen model into memory...")
45
+ self.__mg_model = MusicGen.get_pretrained(self.model_size, device=self.device)
46
+ self.__mg_model.set_generation_params(use_sampling=True,
47
+ top_k=250,
48
+ top_p=0.0,
49
+ temperature=1.0,
50
+ cfg_coef=3.0
51
+ )
52
+
53
+ output_dir = Path('.') / 'outputs'
54
+ if not output_dir.exists():
55
+ output_dir.mkdir(parents=True, exist_ok=True)
56
+ elif output_dir.is_file():
57
+ raise FileExistsError(f"A file with the same name as the desired directory ('{str(output_dir)}') already exists.")
58
+
59
+
60
+ def text2music(self, prompt: str, length: int = 60, seed: int = None) -> str:
61
+ """Generate music from the prompt.
62
+
63
+ Args:
64
+ prompt (str): Prompt to generate the music from.
65
+ length (int, optional): Length of the music in seconds. Defaults to 60.
66
+ seed (int, optional): Seed to use for the generation. Defaults to None.
67
+
68
+ Returns:
69
+ str: Path to the generated music.
70
+ """
71
+
72
+ def wavToMp3(src_file: str, dest_file: str) -> None:
73
+ sound = AudioSegment.from_wav(src_file)
74
+ sound.export(dest_file, format="mp3")
75
+
76
+ output_filename = Path('.') / 'outputs' / str(uuid.uuid4())
77
+
78
+ if not seed or seed == -1:
79
+ seed = torch.randint(0, 2**32 - 1, (1,)).item()
80
+ set_all_seeds(seed)
81
+
82
+ self.__mg_model.set_generation_params(duration=length)
83
+ output = self.__mg_model.generate(descriptions=[prompt], progress=True)[0]
84
+
85
+ with NamedTemporaryFile("wb", delete=True) as temp_file:
86
+ audio_write(temp_file.name, output.cpu(), self.__mg_model.sample_rate, strategy="loudness", loudness_compressor=True)
87
+ if self.output_format == 'mp3':
88
+ wavToMp3(f'{temp_file.name}.wav', str(output_filename.with_suffix('.mp3')))
89
+ else:
90
+ shutil.copy(f'{temp_file.name}.wav', str(output_filename.with_suffix('.wav')))
91
+
92
+ return str(output_filename.with_suffix('.mp3' if self.output_format == 'mp3' else '.wav'))
93
+
94
+
95
+ def generate_prompt(self, genre:str, place:str, mood:str,
96
+ title:str, chapter_title:str, chapter_plot:str) -> str:
97
+ """Generate a prompt for background music based on given attributes.
98
+
99
+ Args:
100
+ genre (str): Genre of the story.
101
+ place (str): Place of the story.
102
+ mood (str): Mood of the story.
103
+ title (str): Title of the story.
104
+ chapter_title (str): Title of the chapter.
105
+ chapter_plot (str): Plot of the chapter.
106
+
107
+ Returns:
108
+ str: Generated prompt.
109
+ """
110
+
111
+ # Generate prompts with PaLM
112
+ t = palm_prompts['music_gen']['gen_prompt']
113
+ q = palm_prompts['music_gen']['query']
114
+ query_string = t.format(input=q.format(genre=genre,
115
+ place=place,
116
+ mood=mood,
117
+ title=title,
118
+ chapter_title=chapter_title,
119
+ chapter_plot=chapter_plot))
120
+ try:
121
+ response, response_txt = asyncio.run(asyncio.wait_for(
122
+ gen_text(query_string, mode="text", use_filter=False),
123
+ timeout=10)
124
+ )
125
+ except asyncio.TimeoutError:
126
+ raise TimeoutError("The response time for PaLM API exceeded the limit.")
127
+
128
+ try:
129
+ res_json = json.loads(response_txt)
130
+ except Exception:
131
+ print("=== PaLM Response ===")
132
+ print(response.filters)
133
+ print(response_txt)
134
+ print("=== PaLM Response ===")
135
+ raise ValueError("The response from PaLM API is not in the expected format.")
136
+
137
+ return res_json['main_sentence']
138
+
139
+
140
+ @property
141
+ def model_size(self):
142
+ """Model size
143
+
144
+ Returns:
145
+ Literal['small', 'medium', 'melody', 'large']: The model size (read-only)
146
+ """
147
+ return self.__model_size
148
+
149
+ @property
150
+ def output_format(self):
151
+ """Output format
152
+
153
+ Returns:
154
+ Literal['wav', 'mp3']: The output format (read-only)
155
+ """
156
+ return self.__output_format
157
+
158
+ @property
159
+ def device(self):
160
+ """Device
161
+
162
+ Returns:
163
+ str: The device (read-only)
164
+ """
165
+ return self.__device
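Note on the timeout pattern: generate_prompt (and both prompt generators in image_maker.py) wrap the PaLM call in asyncio.run(asyncio.wait_for(..., timeout=10)). The sketch below shows that pattern with a stand-in coroutine. fake_gen_text and call_with_timeout are hypothetical names, and the returned values are made up for illustration.

```python
import asyncio


async def fake_gen_text(prompt: str, delay: float = 0.0):
    """Hypothetical stand-in for gen_text: waits, then returns (response, text)."""
    await asyncio.sleep(delay)
    return {"filters": []}, '{"main_sentence": "calm piano"}'


def call_with_timeout(prompt: str, delay: float, timeout: float):
    """Run the coroutine to completion, or fail once `timeout` seconds pass."""
    try:
        return asyncio.run(
            asyncio.wait_for(fake_gen_text(prompt, delay), timeout=timeout)
        )
    except asyncio.TimeoutError:
        # Same re-raise as in the diff: surface a plain TimeoutError to callers.
        raise TimeoutError("The response time for PaLM API exceeded the limit.")


response, response_txt = call_with_timeout("query", delay=0.0, timeout=1.0)
print(response_txt)
```

asyncio.wait_for can only cancel a coroutine at an await point. In this commit, gen_text calls palm_api.generate_text without await in text mode, so the 10-second limit may not interrupt a slow request there. That is worth checking against the real API behaviour.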
modules/palmchat.py ADDED
@@ -0,0 +1,133 @@
1
+ import os
2
+ import toml
3
+ from pathlib import Path
4
+ import google.generativeai as palm_api
5
+
6
+ from pingpong import PingPong
7
+ from pingpong.pingpong import PPManager
8
+ from pingpong.pingpong import PromptFmt
9
+ from pingpong.pingpong import UIFmt
10
+ from pingpong.gradio import GradioChatUIFmt
11
+
12
+ from .utils import set_palm_api_key
13
+
14
+
15
+ # Set PaLM API Key
16
+ set_palm_api_key()
17
+
18
+ # Load PaLM Prompt Templates
19
+ palm_prompts = toml.load(Path('.') / 'assets' / 'palm_prompts.toml')
20
+
21
+ class PaLMChatPromptFmt(PromptFmt):
22
+ @classmethod
23
+ def ctx(cls, context):
24
+ pass
25
+
26
+ @classmethod
27
+ def prompt(cls, pingpong, truncate_size):
28
+ ping = pingpong.ping[:truncate_size]
29
+ pong = pingpong.pong
30
+
31
+ if pong is None or pong.strip() == "":
32
+ return [
33
+ {
34
+ "author": "USER",
35
+ "content": ping
36
+ },
37
+ ]
38
+ else:
39
+ pong = pong[:truncate_size]
40
+
41
+ return [
42
+ {
43
+ "author": "USER",
44
+ "content": ping
45
+ },
46
+ {
47
+ "author": "AI",
48
+ "content": pong
49
+ },
50
+ ]
51
+
52
+ class PaLMChatPPManager(PPManager):
53
+ def build_prompts(self, from_idx: int=0, to_idx: int=-1, fmt: PromptFmt=PaLMChatPromptFmt, truncate_size: int=None):
54
+ results = []
55
+
56
+ if to_idx == -1 or to_idx >= len(self.pingpongs):
57
+ to_idx = len(self.pingpongs)
58
+
59
+ for idx, pingpong in enumerate(self.pingpongs[from_idx:to_idx]):
60
+ results += fmt.prompt(pingpong, truncate_size=truncate_size)
61
+
62
+ return results
63
+
64
+ class GradioPaLMChatPPManager(PaLMChatPPManager):
65
+ def build_uis(self, from_idx: int=0, to_idx: int=-1, fmt: UIFmt=GradioChatUIFmt):
66
+ if to_idx == -1 or to_idx >= len(self.pingpongs):
67
+ to_idx = len(self.pingpongs)
68
+
69
+ results = []
70
+
71
+ for pingpong in self.pingpongs[from_idx:to_idx]:
72
+ results.append(fmt.ui(pingpong))
73
+
74
+ return results
75
+
76
+ async def gen_text(
77
+ prompt,
78
+ mode="chat", #chat or text
79
+ parameters=None,
80
+ use_filter=True
81
+ ):
82
+ if parameters is None:
83
+ temperature = 1.0
84
+ top_k = 40
85
+ top_p = 0.95
86
+ max_output_tokens = 1024
87
+
88
+ # default safety settings
89
+ safety_settings = [{"category":"HARM_CATEGORY_DEROGATORY","threshold":1},
90
+ {"category":"HARM_CATEGORY_TOXICITY","threshold":1},
91
+ {"category":"HARM_CATEGORY_VIOLENCE","threshold":2},
92
+ {"category":"HARM_CATEGORY_SEXUAL","threshold":2},
93
+ {"category":"HARM_CATEGORY_MEDICAL","threshold":2},
94
+ {"category":"HARM_CATEGORY_DANGEROUS","threshold":2}]
95
+ if not use_filter:
96
+ for idx, _ in enumerate(safety_settings):
97
+ safety_settings[idx]['threshold'] = 4
98
+
99
+ if mode == "chat":
100
+ parameters = {
101
+ 'model': 'models/chat-bison-001',
102
+ 'candidate_count': 1,
103
+ 'context': "",
104
+ 'temperature': temperature,
105
+ 'top_k': top_k,
106
+ 'top_p': top_p,
107
+ }
108
+ else:
109
+ parameters = {
110
+ 'model': 'models/text-bison-001',
111
+ 'candidate_count': 1,
112
+ 'temperature': temperature,
113
+ 'top_k': top_k,
114
+ 'top_p': top_p,
115
+ 'max_output_tokens': max_output_tokens,
116
+ 'safety_settings': safety_settings,
117
+ }
118
+
119
+ if mode == "chat":
120
+ response = await palm_api.chat_async(**parameters, messages=prompt)
121
+ else:
122
+ response = palm_api.generate_text(**parameters, prompt=prompt)
123
+
124
+ if use_filter and len(response.filters) > 0 and \
125
+ response.filters[0]['reason'] == 2:
126
+ response_txt = "Your request was blocked by the safety filter."
127
+ else:
128
+ if mode == "chat":
129
+ response_txt = response.last
130
+ else:
131
+ response_txt = response.result
132
+
133
+ return response, response_txt
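Note on the message format: PaLMChatPromptFmt.prompt and PaLMChatPPManager.build_prompts flatten a conversation history into a list of author/content dictionaries. The sketch below reproduces that logic without the pingpong dependency. Exchange is a hypothetical stand-in for pingpong's PingPong object, assuming it only needs ping and pong attributes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Exchange:
    """Hypothetical stand-in for pingpong.PingPong (user turn + model reply)."""
    ping: str
    pong: Optional[str] = None


def to_messages(exchange: Exchange, truncate_size: Optional[int] = None) -> list:
    """One exchange becomes one USER message, plus an AI message if answered."""
    messages = [{"author": "USER", "content": exchange.ping[:truncate_size]}]
    if exchange.pong is not None and exchange.pong.strip() != "":
        messages.append({"author": "AI", "content": exchange.pong[:truncate_size]})
    return messages


def build_prompts(exchanges: list, from_idx: int = 0, to_idx: int = -1,
                  truncate_size: Optional[int] = None) -> list:
    """Flatten exchanges[from_idx:to_idx]; -1 or an overshoot means 'to the end'."""
    if to_idx == -1 or to_idx >= len(exchanges):
        to_idx = len(exchanges)
    results = []
    for exchange in exchanges[from_idx:to_idx]:
        results += to_messages(exchange, truncate_size)
    return results


history = [Exchange("Hi", "Hello!"), Exchange("Tell me a story")]
messages = build_prompts(history)
print(messages)
```

Slicing with truncate_size=None leaves the text untouched, which matches the default in the diff.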
modules/utils.py ADDED
@@ -0,0 +1,109 @@
1
+ import os
2
+ import numpy as np
3
+ import random
4
+ import uuid
5
+
6
+ from pathlib import Path
7
+ from tempfile import NamedTemporaryFile
8
+
9
+ from PIL import Image
10
+ from PIL import ImageDraw
11
+ from PIL import ImageFont
12
+
13
+ import torch
14
+
15
+ import google.generativeai as palm_api
16
+
17
+ def set_all_seeds(random_seed: int) -> None:
18
+ """Seed the PyTorch, NumPy, and Python RNGs and make cuDNN deterministic."""
19
+ torch.manual_seed(random_seed)
20
+ torch.cuda.manual_seed(random_seed)
21
+ torch.cuda.manual_seed_all(random_seed)
22
+ torch.backends.cudnn.deterministic = True
23
+ torch.backends.cudnn.benchmark = False
24
+ np.random.seed(random_seed)
25
+ random.seed(random_seed)
26
+ print(f"Using seed {random_seed}")
27
+
28
+
29
+ def get_palm_api_key() -> str:
30
+ palm_api_key = os.getenv("PALM_API_KEY")
31
+
32
+ if palm_api_key is None:
33
+ with open('.palm_api_key.txt', 'r') as file:
34
+ palm_api_key = file.read().strip()
35
+
36
+ if not palm_api_key:
37
+ raise ValueError("PaLM API Key is missing.")
38
+ return palm_api_key
39
+
40
+
41
+ def set_palm_api_key(palm_api_key:str = None) -> None:
42
+ palm_api.configure(api_key=(palm_api_key or get_palm_api_key()))
43
+
44
+
45
+ def merge_video(image_path: str, audio_path: str, story_title:str = None) -> str:
46
+ output_filename = Path('.') / 'outputs' / str(uuid.uuid4())
47
+ output_filename = str(output_filename.with_suffix('.mp4'))
48
+
49
+ try:
50
+ temp_image_path = image_path
51
+ if story_title:
52
+ img = Image.open(image_path)
53
+ img_drawable = ImageDraw.Draw(img)
54
+ title_font_path = str(Path('.') / 'assets' / 'Lugrasimo-Regular.ttf')
55
+ title_font = ImageFont.truetype(title_font_path, 24)
56
+ img_drawable.text((65, 468), story_title, font=title_font, fill=(16, 16, 16))
57
+ img_drawable.text((63, 466), story_title, font=title_font, fill=(255, 255, 255))
58
+
59
+ with NamedTemporaryFile("wb", delete=True) as temp_file:
60
+ temp_image_path = f'{temp_file.name}.png'
61
+ img.save(temp_image_path)
62
+
63
+ cmd = [
64
+ 'ffmpeg', '-loop', '1', '-i', temp_image_path, '-i', audio_path,
65
+ '-filter_complex',
66
+ '"[1:a]asplit=29[ASPLIT01][ASPLIT02][ASPLIT03][ASPLIT04][ASPLIT05][ASPLIT06][ASPLIT07][ASPLIT08][ASPLIT09][ASPLIT10][ASPLIT11][ASPLIT12][ASPLIT13][ASPLIT14][ASPLIT15][ASPLIT16][ASPLIT17][ASPLIT18][ASPLIT19][ASPLIT20][ASPLIT21][ASPLIT22][ASPLIT23][ASPLIT24][ASPLIT25][ASPLIT26][ASPLIT27][ASPLIT28][ASPLIT29];\
67
+ [ASPLIT01]bandpass=frequency=20:width=4:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ01];\
68
+ [ASPLIT02]bandpass=frequency=25:width=4:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ02];\
69
+ [ASPLIT03]bandpass=frequency=31.5:width=8:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ03];\
70
+ [ASPLIT04]bandpass=frequency=40:width=8:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ04];\
71
+ [ASPLIT05]bandpass=frequency=50:width=8:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ05];\
72
+ [ASPLIT06]bandpass=frequency=63:width=8:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ06];\
73
+ [ASPLIT07]bandpass=frequency=80:width=16:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ07];\
74
+ [ASPLIT08]bandpass=frequency=100:width=16:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ08];\
75
+ [ASPLIT09]bandpass=frequency=125:width=32:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ09];\
76
+ [ASPLIT10]bandpass=frequency=160:width=32:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ10];\
77
+ [ASPLIT11]bandpass=frequency=200:width=64:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ11];\
78
+ [ASPLIT12]bandpass=frequency=250:width=64:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ12];\
79
+ [ASPLIT13]bandpass=frequency=315:width=64:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ13];\
80
+ [ASPLIT14]bandpass=frequency=400:width=64:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ14];\
81
+ [ASPLIT15]bandpass=frequency=500:width=128:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ15];\
82
+ [ASPLIT16]bandpass=frequency=630:width=128:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ16];\
83
+ [ASPLIT17]bandpass=frequency=800:width=128:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ17];\
84
+ [ASPLIT18]bandpass=frequency=1000:width=128:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ18];\
85
+ [ASPLIT19]bandpass=frequency=1250:width=256:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ19];\
86
+ [ASPLIT20]bandpass=frequency=1500:width=256:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ20];\
87
+ [ASPLIT21]bandpass=frequency=2000:width=512:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ21];\
88
+ [ASPLIT22]bandpass=frequency=2500:width=1024:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ22];\
89
+ [ASPLIT23]bandpass=frequency=3150:width=1024:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ23];\
90
+ [ASPLIT24]bandpass=frequency=4000:width=1024:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ24];\
91
+ [ASPLIT25]bandpass=frequency=5000:width=1024:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ25];\
92
+ [ASPLIT26]bandpass=frequency=6300:width=1024:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ26];\
93
+ [ASPLIT27]bandpass=frequency=8000:width=1024:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ27];\
94
+ [ASPLIT28]bandpass=frequency=12000:width=1024:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ28];\
95
+ [ASPLIT29]bandpass=frequency=16000:width=2048:width_type=h,showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF[EQ29];\
96
+ [EQ01][EQ02][EQ03][EQ04][EQ05][EQ06][EQ07][EQ08][EQ09][EQ10][EQ11][EQ12][EQ13][EQ14][EQ15][EQ16][EQ17][EQ18][EQ19][EQ20][EQ21][EQ22][EQ23][EQ24][EQ25][EQ26][EQ27][EQ28][EQ29]hstack=inputs=29[BARS];[0][BARS]overlay=(W-w)/2:H-h-50:shortest=1,format=yuv420p[out]"',
97
+ '-map', '"[out]"', '-map', '1:a', '-movflags', '+faststart',
98
+ output_filename
99
+ ]
100
+
101
+ result = os.system(' '.join([c.strip() for c in cmd]))
102
+
103
+ if result == 0:
104
+ return output_filename
105
+ else:
106
+ return None
107
+ except Exception as e:
108
+ print(e)
109
+ return None
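Note on the ffmpeg filter graph: merge_video spells out 29 near-identical bandpass/showvolume chains by hand and joins the command into a shell string. As an alternative sketch (not what this commit does), the same graph can be generated from a table of (frequency, width) pairs and passed as an argument list, which avoids the embedded shell quoting. build_filter_complex and build_ffmpeg_args are hypothetical names. The band values and showvolume options are copied from the diff.

```python
# (center frequency in Hz, bandwidth in Hz) for each equalizer bar, from the diff.
BANDS = [
    (20, 4), (25, 4), (31.5, 8), (40, 8), (50, 8), (63, 8), (80, 16), (100, 16),
    (125, 32), (160, 32), (200, 64), (250, 64), (315, 64), (400, 64),
    (500, 128), (630, 128), (800, 128), (1000, 128), (1250, 256), (1500, 256),
    (2000, 512), (2500, 1024), (3150, 1024), (4000, 1024), (5000, 1024),
    (6300, 1024), (8000, 1024), (12000, 1024), (16000, 2048),
]

SHOWVOLUME = ("showvolume=rate=30.000:c=0xAFFFFFFF:b=5:w=176:h=11:o=v:t=0:v=0:"
              "m=p:s=0:ds=lin:dm=1:dmc=0xFFFFFFFF")


def build_filter_complex(bands=BANDS) -> str:
    """Build the -filter_complex value: split audio, one bar per band, stack, overlay."""
    n = len(bands)
    split_labels = ''.join(f'[ASPLIT{i:02d}]' for i in range(1, n + 1))
    parts = [f'[1:a]asplit={n}{split_labels}']
    for i, (freq, width) in enumerate(bands, start=1):
        parts.append(f'[ASPLIT{i:02d}]bandpass=frequency={freq}:width={width}'
                     f':width_type=h,{SHOWVOLUME}[EQ{i:02d}]')
    eq_labels = ''.join(f'[EQ{i:02d}]' for i in range(1, n + 1))
    parts.append(f'{eq_labels}hstack=inputs={n}[BARS]')
    parts.append('[0][BARS]overlay=(W-w)/2:H-h-50:shortest=1,format=yuv420p[out]')
    return ';'.join(parts)


def build_ffmpeg_args(image_path: str, audio_path: str, output_path: str) -> list:
    """Argument list suitable for subprocess.run(args, check=True); no shell quoting."""
    return ['ffmpeg', '-loop', '1', '-i', image_path, '-i', audio_path,
            '-filter_complex', build_filter_complex(),
            '-map', '[out]', '-map', '1:a', '-movflags', '+faststart',
            output_path]


filter_graph = build_filter_complex()
args = build_ffmpeg_args('cover.png', 'music.mp3', 'out.mp4')
print(len(filter_graph), args[-1])
```

Because the arguments are passed as a list, paths containing spaces or quotes need no extra escaping. The sketch only builds the command and does not invoke ffmpeg.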
pyproject.toml ADDED
@@ -0,0 +1,36 @@
1
+ [tool.poetry]
2
+ name = "zero2story"
3
+ version = "0.1.0"
4
+ description = ""
5
+ authors = ["Sangjoon Han <[email protected]>"]
6
+ readme = "README.md"
7
+
8
+ [tool.poetry.dependencies]
9
+ python = ">=3.10,<3.13"
10
+ gradio = "^3.42.0"
11
+ torch = {version = "^2.0.1+cu118", source = "pytorch"}
12
+ torchvision = {version = "^0.15.2+cu118", source = "pytorch"}
13
+ torchaudio = {version = "^2.0.2+cu118", source = "pytorch"}
14
+ transformers = "^4.33.1"
15
+ scipy = "^1.11.2"
16
+ diffusers = "^0.20.2"
17
+ numpy = ">=1.21,<1.25"
18
+ numba = "^0.57.1"
19
+ audiocraft = "^0.0.2"
20
+ accelerate = "^0.22.0"
21
+ google-generativeai = "^0.1.0"
22
+ bingbong = "^0.4.2"
23
+ asyncio = "^3.4.3"
24
+ toml = "^0.10.2"
25
+ compel = "^2.0.2"
26
+
27
+ [[tool.poetry.source]]
28
+ name = "pytorch"
29
+ url = "https://download.pytorch.org/whl/cu118"
30
+ priority = "explicit"
31
+
32
+ [tool.poetry.group.dev.dependencies]
33
+
34
+ [build-system]
35
+ requires = ["poetry-core"]
36
+ build-backend = "poetry.core.masonry.api"
run.sh ADDED
@@ -0,0 +1,19 @@
1
+ #!/bin/bash
2
+
3
+ PID=./gradio.pid
4
+ if [[ -f "$PID" ]]; then
5
+ kill -15 `cat $PID` || kill -9 `cat $PID`
6
+ fi
7
+
8
+ mkdir -p ./logs
9
+ rm -rf ./logs/app.log
10
+
11
+ CONFIDENTIAL=./.palm_api_key.txt
12
+ if [[ ! -f "$CONFIDENTIAL" ]]; then
13
+ echo "Error: PaLM API key file not found. To continue, please create a .palm_api_key.txt file in the current directory."
14
+ exit 1
15
+ fi
16
+
17
+ export PALM_API_KEY=`cat .palm_api_key.txt`
18
+ nohup python -u app.py > ./logs/app.log 2>&1 &
19
+ echo $! > $PID