Carlexx committed on
Commit 46a5dbb · verified · 1 Parent(s): 53145fb

Upload 11 files

NOTICE.md ADDED
@@ -0,0 +1,76 @@
+ # NOTICE
+
+ Copyright (C) 2025 Carlos Rodrigues dos Santos. All rights reserved.
+
+ ---
+
+ ## Aviso de Propriedade Intelectual e Licenciamento
+
+ ### **Processo de Patenteamento em Andamento (EM PORTUGUÊS):**
+
+ O método e o sistema de orquestração de prompts denominados **ADUC (Automated Discovery and Orchestration of Complex tasks)**, conforme descritos neste documento e implementados neste software, estão atualmente em processo de patenteamento.
+
+ O titular dos direitos, Carlos Rodrigues dos Santos, está buscando proteção legal para as inovações chave da arquitetura ADUC, incluindo, mas não se limitando a:
+
+ * Fragmentação e escalonamento de solicitações que excedem limites de contexto de modelos de IA.
+ * Distribuição inteligente de sub-tarefas para especialistas heterogêneos.
+ * Gerenciamento de estado persistido com avaliação iterativa e realimentação para o planejamento de próximas etapas.
+ * Planejamento e roteamento sensível a custo, latência e requisitos de qualidade.
+ * O uso de "tokens universais" para comunicação agnóstica a modelos.
+
+ ### **Reconhecimento e Implicações (EM PORTUGUÊS):**
+
+ Ao acessar ou utilizar este software e a arquitetura ADUC aqui implementada, você reconhece:
+
+ 1. A natureza inovadora e a importância da arquitetura ADUC no campo da orquestração de prompts para IA.
+ 2. Que a essência desta arquitetura, ou suas implementações derivadas, podem estar sujeitas a direitos de propriedade intelectual, incluindo patentes.
+ 3. Que o uso comercial, a reprodução da lógica central da ADUC em sistemas independentes, ou a exploração direta da invenção sem o devido licenciamento podem infringir os direitos de patente pendente.
+
+ ---
+
+ ### **Patent Pending (IN ENGLISH):**
+
+ The method and system for prompt orchestration named **ADUC (Automated Discovery and Orchestration of Complex tasks)**, as described herein and implemented in this software, are currently in the process of being patented.
+
+ The rights holder, Carlos Rodrigues dos Santos, is seeking legal protection for the key innovations of the ADUC architecture, including, but not limited to:
+
+ * Fragmentation and scaling of requests exceeding AI model context limits.
+ * Intelligent distribution of sub-tasks to heterogeneous specialists.
+ * Persistent state management with iterative evaluation and feedback for planning subsequent steps.
+ * Cost, latency, and quality-aware planning and routing.
+ * The use of "universal tokens" for model-agnostic communication.
+
+ ### **Acknowledgement and Implications (IN ENGLISH):**
+
+ By accessing or using this software and the ADUC architecture implemented herein, you acknowledge:
+
+ 1. The innovative nature and significance of the ADUC architecture in the field of AI prompt orchestration.
+ 2. That the essence of this architecture, or its derivative implementations, may be subject to intellectual property rights, including patents.
+ 3. That commercial use, reproduction of ADUC's core logic in independent systems, or direct exploitation of the invention without proper licensing may infringe upon pending patent rights.
+
+ ---
+
+ ## Licença AGPLv3
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU Affero General Public License as published by
+ the Free Software Foundation, either version 3 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU Affero General Public License for more details.
+
+ You should have received a copy of the GNU Affero General Public License
+ along with this program. If not, see <https://www.gnu.org/licenses/>.
+
+ ---
+
+ **Contato para Consultas:**
+
+ Para mais informações sobre a arquitetura ADUC, o status do patenteamento, ou para discutir licenciamento para usos comerciais ou não conformes com a AGPLv3, por favor, entre em contato:
+
+ Carlos Rodrigues dos Santos
+ Rua Eduardo Carlos Pereira, 4125, B1 Ap32, Curitiba, PR, Brazil, CEP 8102025
README.md CHANGED
@@ -1,12 +1,14 @@
 ---
 title: Euia-AducSdr
- emoji: 🎬
 colorFrom: indigo
 colorTo: purple
 sdk: gradio
 sdk_version: 5.42.0
 app_file: app.py
- pinned: false
 ---
 
 ### 🇧🇷 Português
@@ -31,14 +33,169 @@ An open and functional implementation of the ADUC-SDR (Architecture for Composit
 
 ---
 
- ### 🇪🇸 Español
 
- Una implementación abierta y funcional de la arquitectura ADUC-SDR (Arquitectura de Unificación Compositiva - Escala Dinámica y Resiliente), diseñada para la generación de video coherente de larga duración. Este proyecto materializa los principios de fragmentación, navegación geométrica y un mecanismo de "eco causal 4bits memoria" para garantizar la continuidad física y narrativa en secuencias de video generadas por múltiples modelos de IA.
 
- **Licencia:** Este proyecto está licenciado bajo los términos de la **Licencia Pública General Affero de GNU v3.0**. Esto significa que si usted utiliza este software (o cualquier obra derivada) para proporcionar un servicio a través de una red, está **obligado a ofrecer el código fuente completo** de su versión a los usuarios de dicho servicio.
 
- - **Copyright (C) 4 de Agosto de 2025, Carlos Rodrigues dos Santos**
- - Puede encontrar una copia completa de la licencia en el archivo [LICENSE](LICENSE).
 
 ---
 
@@ -49,4 +206,6 @@ Una implementación abierta y funcional de la arquitectura ADUC-SDR (Arquitectur
 - **GitHub:** [https://github.com/carlex22/Aduc-sdr](https://github.com/carlex22/Aduc-sdr)
 - **Hugging Face Spaces:**
 - [Ltx-SuperTime-60Secondos](https://huggingface.co/spaces/Carlexx/Ltx-SuperTime-60Secondos/)
- - [Novinho](https://huggingface.co/spaces/Carlexxx/Novinho/)
 ---
 title: Euia-AducSdr
+ emoji: 🎥
 colorFrom: indigo
 colorTo: purple
 sdk: gradio
 sdk_version: 5.42.0
 app_file: app.py
+ pinned: true
+ license: agpl-3.0
+ short_description: Uma implementação aberta e funcional da arquitetura ADUC-SDR
 ---
 
 ### 🇧🇷 Português
 
 ---
 
+ ## **Aviso de Propriedade Intelectual e Patenteamento**
+
+ ### **Processo de Patenteamento em Andamento (EM PORTUGUÊS):**
+
+ A arquitetura e o método **ADUC (Automated Discovery and Orchestration of Complex tasks)**, conforme descritos neste projeto e nas reivindicações associadas, estão **atualmente em processo de patenteamento**.
+
+ O titular dos direitos, Carlos Rodrigues dos Santos, está buscando proteção legal para as inovações chave da arquitetura ADUC, que incluem, mas não se limitam a:
+
+ * Fragmentação e escalonamento de solicitações que excedem limites de contexto de modelos de IA.
+ * Distribuição inteligente de sub-tarefas para especialistas heterogêneos.
+ * Gerenciamento de estado persistido com avaliação iterativa e realimentação para o planejamento de próximas etapas.
+ * Planejamento e roteamento sensível a custo, latência e requisitos de qualidade.
+ * O uso de "tokens universais" para comunicação agnóstica a modelos.
+
+ Ao utilizar este software e a arquitetura ADUC aqui implementada, você reconhece a natureza inovadora desta arquitetura e que a **reprodução ou exploração da lógica central da ADUC em sistemas independentes pode infringir direitos de patente pendente.**
+
+ ---
+
+ ### **Patent Pending (IN ENGLISH):**
+
+ The **ADUC (Automated Discovery and Orchestration of Complex tasks)** architecture and method, as described in this project and its associated claims, are **currently in the process of being patented.**
+
+ The rights holder, Carlos Rodrigues dos Santos, is seeking legal protection for the key innovations of the ADUC architecture, including, but not limited to:
+
+ * Fragmentation and scaling of requests exceeding AI model context limits.
+ * Intelligent distribution of sub-tasks to heterogeneous specialists.
+ * Persistent state management with iterative evaluation and feedback for planning subsequent steps.
+ * Cost, latency, and quality-aware planning and routing.
+ * The use of "universal tokens" for model-agnostic communication.
+
+ By using this software and the ADUC architecture implemented herein, you acknowledge the innovative nature of this architecture and that **the reproduction or exploitation of ADUC's core logic in independent systems may infringe upon pending patent rights.**
+
+ ---
+
+ ### Detalhes Técnicos e Reivindicações da ADUC
+
+ #### 🇧🇷 Definição Curta (para Tese e Patente)
+
+ **ADUC** é um *framework pré-input* e *intermediário* de **gerenciamento de prompts** que:
+
+ 1. **fragmenta** solicitações acima do limite de contexto de qualquer modelo,
+ 2. **escala linearmente** (processo sequencial com memória persistida),
+ 3. **distribui** sub-tarefas a **especialistas** (modelos/ferramentas heterogêneos), e
+ 4. **realimenta** a próxima etapa com avaliação do que foi feito/esperado (LLM diretor).
+
+ Não é um modelo; é uma **camada orquestradora** plugável antes do input de modelos existentes (texto, imagem, áudio, vídeo), usando *tokens universais* e a tecnologia atual.
+
+ #### 🇬🇧 Short Definition (for Thesis and Patent)
+
+ **ADUC** is a *pre-input* and *intermediate* **prompt management framework** that:
+
+ 1. **fragments** requests exceeding any model's context limit,
+ 2. **scales linearly** (sequential process with persisted memory),
+ 3. **distributes** sub-tasks to **specialists** (heterogeneous models/tools), and
+ 4. **feeds back** to the next step with an evaluation of what was done/expected (director LLM).
+
+ It is not a model; it is a pluggable **orchestration layer** before the input of existing models (text, image, audio, video), using *universal tokens* and current technology.
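As an illustration only (not part of the committed README), here is a minimal, hypothetical sketch of such a pre-input orchestration loop in Python. The names (`SubTask`, `State`, `director`, `specialists`) are assumptions made for this example, not the actual classes of this repository:

```python
from dataclasses import dataclass, field

@dataclass
class SubTask:
    goal: str        # what this fragment must achieve
    prompt: str      # model-agnostic ("universal token") instruction
    specialist: str  # which specialist should execute it

@dataclass
class State:
    artifacts: list = field(default_factory=list)  # persisted outputs (text, frames, latents, ...)
    history: str = ""                               # running evaluation log

def orchestrate(request: str, director, specialists: dict, token_limit: int):
    """Hypothetical ADUC-style loop: fragment, route, execute, persist, evaluate, iterate."""
    state = State()
    plan = director.fragment(request, token_limit)            # (1) fragment the oversized request
    while plan:                                               # (2) sequential, linearly scaling loop
        task = plan.pop(0)
        worker = specialists[task.specialist]                 # (3) route to a heterogeneous specialist
        output = worker.run(task.prompt, context=state.artifacts)
        state.artifacts.append(output)                        # persist the output as shared memory
        review = director.evaluate(task.goal, output, state)  # (4) compare done vs. expected vs. missing
        state.history += review.summary
        plan = director.replan(plan, review, token_limit)     # regenerate objectives for the next fragment
    return state.artifacts
```

In this sketch the director LLM both plans and reviews, while each specialist only sees its own prompt plus the persisted state; that separation is what keeps the loop independent of any single model's context window.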
+
+ ---
+
+ #### 🇧🇷 Elementos Essenciais (Telegráfico)
+
+ * **Agnóstico a modelos:** opera com qualquer LLM/difusor/API.
+ * **Pré-input manager:** recebe pedido do usuário, **divide** em blocos ≤ limite de tokens, **prioriza**, **agenda** e **roteia**.
+ * **Memória persistida:** resultados/latentes/“eco” viram **estado compartilhado** para o próximo bloco (nada é ignorado).
+ * **Especialistas:** *routers* decidem quem faz o quê (ex.: “descrição → LLM-A”, “keyframe → Img-B”, “vídeo → Vid-C”).
+ * **Controle de qualidade:** LLM diretor compara *o que fez* × *o que deveria* × *o que falta* e **regenera objetivos** do próximo fragmento.
+ * **Custo/latência-aware:** planeja pela **VRAM/tempo/custo**, não tenta “abraçar tudo de uma vez”.
+
+ #### 🇬🇧 Essential Elements (Telegraphic)
+
+ * **Model-agnostic:** operates with any LLM/diffuser/API.
+ * **Pre-input manager:** receives user request, **divides** into blocks ≤ token limit, **prioritizes**, **schedules**, and **routes**.
+ * **Persisted memory:** results/latents/“echo” become **shared state** for the next block (nothing is ignored).
+ * **Specialists:** *routers* decide who does what (e.g., “description → LLM-A”, “keyframe → Img-B”, “video → Vid-C”; a routing sketch follows this list).
+ * **Quality control:** director LLM compares *what was done* × *what should be done* × *what is missing* and **regenerates objectives** for the next fragment.
+ * **Cost/latency-aware:** plans by **VRAM/time/cost**, does not try to “embrace everything at once”.
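To make the routing and cost-awareness bullets concrete, below is a small, hypothetical capability-based router. The `Specialist` protocol, the `cost_per_call` field, and the specialist names are assumptions for illustration, not the repository's actual API:

```python
from __future__ import annotations
from typing import Protocol

class Specialist(Protocol):
    capabilities: set[str]   # e.g. {"description"}, {"keyframe"}, {"video"}
    cost_per_call: float     # rough cost/latency proxy used for planning

    def run(self, prompt: str, context: list) -> object: ...

def route(task_kind: str, registry: dict[str, Specialist]) -> str:
    """Pick the cheapest registered specialist that declares the required capability."""
    candidates = [name for name, s in registry.items() if task_kind in s.capabilities]
    if not candidates:
        raise ValueError(f"no specialist declares capability '{task_kind}'")
    # cost/latency-aware choice: prefer the cheapest adequate specialist
    return min(candidates, key=lambda name: registry[name].cost_per_call)

# e.g. route("keyframe", registry) might return "Img-B" in the scheme sketched above.
```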
+
+ ---
+
+ #### 🇧🇷 Reivindicações Independentes (Método e Sistema)
+
+ **Reivindicação Independente (Método) — Versão Enxuta:**
+
+ 1. **Método** de **orquestração de prompts** para execução de tarefas acima do limite de contexto de modelos de IA, compreendendo:
+ (a) **receber** uma solicitação que excede um limite de tokens;
+ (b) **analisar** a solicitação por um **LLM diretor** e **fragmentá-la** em sub-tarefas ≤ limite;
+ (c) **selecionar** especialistas de execução para cada sub-tarefa com base em capacidades declaradas;
+ (d) **gerar** prompts específicos por sub-tarefa em **tokens universais**, incluindo referências ao **estado persistido** de execuções anteriores;
+ (e) **executar sequencialmente** as sub-tarefas e **persistir** suas saídas como memória (incluindo latentes/eco/artefatos);
+ (f) **avaliar** automaticamente a saída versus metas declaradas e **regenerar objetivos** do próximo fragmento;
+ (g) **iterar** (b)–(f) até que os critérios de completude sejam atendidos, produzindo o resultado agregado;
+ em que o framework **escala linearmente** no tempo e armazenamento físico, **independente** da janela de contexto dos modelos subjacentes.
+
+ **Reivindicação Independente (Sistema):**
+
+ 2. **Sistema** de orquestração de prompts, compreendendo: um **planejador LLM diretor**; um **roteador de especialistas**; um **banco de estado persistido** (incl. memória cinética para vídeo); um **gerador de prompts universais**; e um **módulo de avaliação/realimentação**, acoplados por uma **API pré-input** a modelos heterogêneos.
+
+ #### 🇬🇧 Independent Claims (Method and System)
+
+ **Independent Claim (Method) — Concise Version:**
+
+ 1. A **method** for **prompt orchestration** for executing tasks exceeding AI model context limits, comprising:
+ (a) **receiving** a request that exceeds a token limit;
+ (b) **analyzing** the request by a **director LLM** and **fragmenting it** into sub-tasks ≤ the limit;
+ (c) **selecting** execution specialists for each sub-task based on declared capabilities;
+ (d) **generating** specific prompts per sub-task in **universal tokens**, including references to the **persisted state** of previous executions;
+ (e) **sequentially executing** the sub-tasks and **persisting** their outputs as memory (including latents/echo/artifacts);
+ (f) **automatically evaluating** the output against declared goals and **regenerating objectives** for the next fragment;
+ (g) **iterating** (b)–(f) until completion criteria are met, producing the aggregated result;
+ wherein the framework **scales linearly** in time and physical storage, **independent** of the context window of the underlying models.
+
+ **Independent Claim (System):**
+
+ 2. A prompt orchestration **system**, comprising: a **director LLM planner**; a **specialist router**; a **persisted state bank** (incl. kinetic memory for video); a **universal prompt generator**; and an **evaluation/feedback module**, coupled via a **pre-input API** to heterogeneous models.
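For readers who prefer code to claim language, one plausible decomposition of those five system components is sketched below; the names are illustrative assumptions, not the modules of this repository:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class StateBank:
    """Persisted state bank, including kinetic memory (e.g. the latest echo clip) for video."""
    records: dict = field(default_factory=dict)

    def put(self, key: str, value: object) -> None:
        self.records[key] = value

    def get(self, key: str, default=None):
        return self.records.get(key, default)

@dataclass
class AducSystem:
    planner: Callable     # director LLM planner: request -> ordered sub-tasks
    router: Callable      # specialist router: sub-task -> specialist id
    prompt_gen: Callable  # universal prompt generator: (sub-task, state) -> prompt
    evaluator: Callable   # evaluation/feedback module: (goal, output, state) -> review
    state: StateBank = field(default_factory=StateBank)  # shared pre-input memory
```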
+
+ ---
+
+ #### 🇧🇷 Dependentes Úteis
+
+ * (3) Onde o roteamento considera **custo/latência/VRAM** e metas de qualidade.
+ * (4) Onde o banco de estado inclui **eco cinético** para vídeo (últimos *n* frames/latentes/fluxo).
+ * (5) Onde a avaliação usa métricas específicas por domínio (Lflow, consistência semântica, etc.).
+ * (6) Onde *tokens universais* padronizam instruções entre especialistas.
+ * (7) Onde a orquestração decide **cut vs continuous** e **corte regenerativo** (Déjà-Vu) ao editar vídeo.
+ * (8) Onde o sistema **nunca descarta** conteúdo excedente: **reagenda** em novos fragmentos.
+
+ #### 🇬🇧 Useful Dependents
+
+ * (3) Wherein routing considers **cost/latency/VRAM** and quality goals.
+ * (4) Wherein the state bank includes **kinetic echo** for video (last *n* frames/latents/flow).
+ * (5) Wherein evaluation uses domain-specific metrics (Lflow, semantic consistency, etc.).
+ * (6) Wherein *universal tokens* standardize instructions between specialists.
+ * (7) Wherein orchestration decides **cut vs continuous** and **regenerative cut** (Déjà-Vu) when editing video.
+ * (8) Wherein the system **never discards** excess content: it **reschedules** it in new fragments.
+
+ ---
+
+ #### 🇧🇷 Como isso conversa com SDR (Vídeo)
+
+ * **Eco Cinético**: é um **tipo de estado persistido** consumido pelo próximo passo.
+ * **Déjà-Vu (Corte Regenerativo)**: é **uma política de orquestração** aplicada quando há edição; ADUC decide, monta os prompts certos e chama o especialista de vídeo.
+ * **Cut vs Continuous**: decisão do **diretor** com base em estado + metas; ADUC roteia e garante a sobreposição/remoção final.
+
+ #### 🇬🇧 How this Converses with SDR (Video)
+
+ * **Kinetic Echo**: is a **type of persisted state** consumed by the next step (a conditioning sketch follows this list).
+ * **Déjà-Vu (Regenerative Cut)**: is an **orchestration policy** applied during editing; ADUC decides, crafts the right prompts, and calls the video specialist.
+ * **Cut vs Continuous**: decision made by the **director** based on state + goals; ADUC routes and ensures the final overlap/removal.
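Concretely, the `run_video_production` logic visible further down in this diff conditions each new fragment on the kinetic echo clip at frame 0, a "path" keyframe at a computed middle frame, and the destination keyframe at the final frame. A simplified sketch of that wiring (the function name and signature here are illustrative):

```python
def build_conditioning(memory_path, path_path, destination_path,
                       video_total_frames, fragment_duration_frames,
                       eco_video_frames, mid_cond_strength):
    """Return (media, frame_index, strength) triples for the next video fragment.

    Mirrors the conditioning scheme in this diff: the kinetic echo anchors frame 0,
    the intermediate "path" keyframe is pinned where the previous fragment's overlap
    ends, and the destination keyframe anchors the final frame.
    """
    mid_cond_frame = int(video_total_frames - fragment_duration_frames + eco_video_frames)
    return [
        (memory_path, 0, 1.0),                             # kinetic echo (video) at the start
        (path_path, mid_cond_frame, mid_cond_strength),    # intermediate "path" keyframe
        (destination_path, int(video_total_frames), 1.0),  # destination keyframe at the end
    ]
```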
+
+ ---
+
+ #### 🇧🇷 Mensagem Clara ao Usuário (Experiência)
+
+ > “Seu pedido excede o limite X do modelo Y. Em vez de truncar silenciosamente, o **ADUC** dividirá e **entregará 100%** do conteúdo por etapas coordenadas.”
+
+ Isso é diferencial prático e jurídico: **não-obviedade** por transformar limite de contexto em **pipeline controlado**, com **persistência de estado** e **avaliação iterativa**.
+
+ #### 🇬🇧 Clear User Message (Experience)
+
+ > "Your request exceeds model Y's limit X. Instead of silently truncating, **ADUC** will divide and **deliver 100%** of the content through coordinated steps."
+
+ This is a practical and legal differentiator: **non-obviousness** by transforming context limits into a **controlled pipeline**, with **state persistence** and **iterative evaluation**.
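As a toy illustration of "divide instead of truncate", the helper below splits an oversized request into word-bounded chunks that fit a given budget. Real token accounting would use the target model's tokenizer; the whitespace split here is only a stand-in:

```python
def fragment_request(request: str, token_limit: int):
    """Split a request into chunks of at most `token_limit` whitespace-separated tokens.

    Nothing is discarded: every word of the original request lands in exactly one
    chunk, which the orchestrator can then schedule as its own sub-task.
    """
    words = request.split()
    return [
        " ".join(words[start:start + token_limit])
        for start in range(0, len(words), token_limit)
    ]

# Example: a 25-word request with a 10-token budget yields 3 coordinated sub-tasks.
# len(fragment_request(" ".join(["word"] * 25), 10)) == 3
```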
 
 ---
 
 - **GitHub:** [https://github.com/carlex22/Aduc-sdr](https://github.com/carlex22/Aduc-sdr)
 - **Hugging Face Spaces:**
 - [Ltx-SuperTime-60Secondos](https://huggingface.co/spaces/Carlexx/Ltx-SuperTime-60Secondos/)
+ - [Novinho](https://huggingface.co/spaces/Carlexxx/Novinho/)
+
+ ---
app.py CHANGED
@@ -4,12 +4,11 @@
4
  # Contato:
5
  # Carlos Rodrigues dos Santos
6
7
- # Rua Eduardo Carlos Pereira, 4125, B1 Ap32, Curitiba, PR, Brazil, CEP 8102025
8
  #
9
  # RepositΓ³rios e Projetos Relacionados:
10
  # GitHub: https://github.com/carlex22/Aduc-sdr
11
- # Hugging Face: https://huggingface.co/spaces/Carlexx/Ltx-SuperTime-60Secondos/
12
- # Hugging Face: https://huggingface.co/spaces/Carlexxx/Novinho/
13
  #
14
  # Este programa Γ© software livre: vocΓͺ pode redistribuΓ­-lo e/ou modificΓ‘-lo
15
  # sob os termos da LicenΓ§a PΓΊblica Geral Affero da GNU como publicada pela
@@ -24,82 +23,238 @@
24
  # VocΓͺ deve ter recebido uma cΓ³pia da LicenΓ§a PΓΊblica Geral Affero da GNU
25
  # junto com este programa. Se nΓ£o, veja <https://www.gnu.org/licenses/>.
26
 
27
- # --- app_demo.py (NOVINHO-6.2: Demo Version) ---
28
 
29
- # --- Ato 1: A ConvocaΓ§Γ£o da Orquestra (ImportaΓ§Γ΅es) ---
30
  import gradio as gr
31
  import torch
32
  import os
 
33
  import yaml
34
  from PIL import Image, ImageOps, ExifTags
35
  import shutil
36
- import gc
37
  import subprocess
38
  import google.generativeai as genai
39
  import numpy as np
40
  import imageio
41
  from pathlib import Path
42
- import huggingface_hub
43
  import json
44
  import time
45
- import spaces
46
-
47
- # --- VariΓ‘vel de Controle do Modo Demo ---
48
- # Para habilitar a funcionalidade completa, mude esta variΓ‘vel para True.
49
- # Isso requer que o Space esteja rodando em um hardware de GPU.
50
- ENABLE_MODELS = False
51
-
52
- # ImportaΓ§Γ΅es condicionais que dependem dos modelos
53
- if ENABLE_MODELS:
54
- from inference import create_ltx_video_pipeline, load_image_to_tensor_with_resize_and_crop, ConditioningItem, calculate_padding
55
- from dreamo_helpers import dreamo_generator_singleton
56
- else:
57
- # Definimos placeholders para que o resto do cΓ³digo nΓ£o falhe na importaΓ§Γ£o
58
- ConditioningItem = dict
59
-
60
- # --- Ato 2: A PreparaΓ§Γ£o do Palco (ConfiguraΓ§Γ΅es Condicionais) ---
61
- if ENABLE_MODELS:
62
- config_file_path = "configs/ltxv-13b-0.9.8-distilled.yaml"
63
- with open(config_file_path, "r") as file: PIPELINE_CONFIG_YAML = yaml.safe_load(file)
64
-
65
- LTX_REPO = "Lightricks/LTX-Video"
66
- models_dir = "downloaded_models_gradio"
67
- Path(models_dir).mkdir(parents=True, exist_ok=True)
68
-
69
- print("MODO COMPLETO ATIVADO: Carregando pipelines LTX na CPU (estado de repouso)...")
70
- distilled_model_actual_path = huggingface_hub.hf_hub_download(repo_id=LTX_REPO, filename=PIPELINE_CONFIG_YAML["checkpoint_path"], local_dir=models_dir, local_dir_use_symlinks=False)
71
- pipeline_instance = create_ltx_video_pipeline(
72
- ckpt_path=distilled_model_actual_path,
73
- precision=PIPELINE_CONFIG_YAML["precision"],
74
- text_encoder_model_name_or_path=PIPELINE_CONFIG_YAML["text_encoder_model_name_or_path"],
75
- sampler=PIPELINE_CONFIG_YAML["sampler"],
76
- device='cpu'
77
- )
78
- print("Modelos LTX prontos (na CPU).")
79
- else:
80
- # Em modo demo, definimos as variΓ‘veis dos modelos como None para evitar erros.
81
- pipeline_instance = None
82
- dreamo_generator_singleton = None
83
- PIPELINE_CONFIG_YAML = {}
84
- print("MODO DEMO ATIVADO: Carregamento de modelos pesados ignorado.")
85
 
86
  WORKSPACE_DIR = "aduc_workspace"
87
  GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY")
88
- VIDEO_FPS = 24
89
- TARGET_RESOLUTION = 420
90
-
91
 
92
- # --- Ato 3: As Partituras dos MΓΊsicos (FunΓ§Γ΅es de GeraΓ§Γ£o e AnΓ‘lise) ---
 
 
93
 
94
  def robust_json_parser(raw_text: str) -> dict:
95
  try:
96
- start_index = raw_text.find('{'); end_index = raw_text.rfind('}')
97
  if start_index != -1 and end_index != -1 and end_index > start_index:
98
- json_str = raw_text[start_index : end_index + 1]; return json.loads(json_str)
 
99
  else: raise ValueError("Nenhum objeto JSON vΓ‘lido encontrado na resposta da IA.")
100
  except json.JSONDecodeError as e: raise ValueError(f"Falha ao decodificar JSON: {e}")
101
 
102
  def extract_image_exif(image_path: str) -> str:
103
  try:
104
  img = Image.open(image_path); exif_data = img._getexif()
105
  if not exif_data: return "No EXIF metadata found."
@@ -109,17 +264,39 @@ def extract_image_exif(image_path: str) -> str:
109
  return metadata_str if metadata_str else "No relevant EXIF metadata found."
110
  except Exception: return "Could not read EXIF data."
111
 
112
- def run_storyboard_generation(num_fragments: int, prompt: str, initial_image_path: str):
113
- if not initial_image_path: raise gr.Error("Por favor, forneΓ§a uma imagem de referΓͺncia inicial.")
114
- if not GEMINI_API_KEY: raise gr.Error("Chave da API Gemini nΓ£o configurada! Esta funΓ§Γ£o requer uma chave, mesmo em modo demo.")
115
- exif_metadata = extract_image_exif(initial_image_path)
116
  prompt_file = "prompts/unified_storyboard_prompt.txt"
117
  with open(os.path.join(os.path.dirname(__file__), prompt_file), "r", encoding="utf-8") as f: template = f.read()
118
  director_prompt = template.format(user_prompt=prompt, num_fragments=int(num_fragments), image_metadata=exif_metadata)
119
  genai.configure(api_key=GEMINI_API_KEY)
120
- model = genai.GenerativeModel('gemini-1.5-flash'); img = Image.open(initial_image_path)
121
- print("Gerando roteiro com anΓ‘lise de visΓ£o integrada...")
122
- response = model.generate_content([director_prompt, img])
 
 
 
 
123
  try:
124
  storyboard_data = robust_json_parser(response.text)
125
  storyboard = storyboard_data.get("scene_storyboard", [])
@@ -127,62 +304,134 @@ def run_storyboard_generation(num_fragments: int, prompt: str, initial_image_pat
127
  return storyboard
128
  except Exception as e: raise gr.Error(f"O Roteirista (Gemini) falhou ao criar o roteiro: {e}. Resposta recebida: {response.text}")
129
 
130
- def get_dreamo_prompt_for_transition(previous_image_path: str, target_scene_description: str) -> str:
131
- if not GEMINI_API_KEY: raise gr.Error("Chave da API Gemini nΓ£o configurada!")
132
- genai.configure(api_key=GEMINI_API_KEY)
133
- prompt_file = "prompts/img2img_evolution_prompt.txt"
134
- with open(os.path.join(os.path.dirname(__file__), prompt_file), "r", encoding="utf-8") as f: template = f.read()
135
- director_prompt = template.format(target_scene_description=target_scene_description)
136
- model = genai.GenerativeModel('gemini-1.5-flash'); img = Image.open(previous_image_path)
137
- response = model.generate_content([director_prompt, "Previous Image:", img])
138
- return response.text.strip().replace("\"", "")
139
-
140
- @spaces.GPU(duration=180)
141
- def run_keyframe_generation(storyboard, ref_images_tasks, progress=gr.Progress()):
142
- if not ENABLE_MODELS or dreamo_generator_singleton is None:
143
- raise gr.Error("Modo Demo Ativado! Para gerar imagens, clone este Space, mude a variΓ‘vel 'ENABLE_MODELS' para True no arquivo app.py e use hardware de GPU.")
144
-
 
 
 
145
  if not storyboard: raise gr.Error("Nenhum roteiro para gerar keyframes.")
146
- initial_ref_image_path = ref_images_tasks[0]['image']
147
- if not initial_ref_image_path or not os.path.exists(initial_ref_image_path): raise gr.Error("A imagem de referΓͺncia principal (Γ  esquerda) Γ© obrigatΓ³ria.")
148
- log_history = ""; generated_images_for_gallery = []
  try:
150
- dreamo_generator_singleton.to_gpu()
151
- with Image.open(initial_ref_image_path) as img: width, height = (img.width // 32) * 32, (img.height // 32) * 32
152
- keyframe_paths, current_ref_image_path = [initial_ref_image_path], initial_ref_image_path
153
  for i, scene_description in enumerate(storyboard):
154
- progress(i / len(storyboard), desc=f"Pintando Keyframe {i+1}/{len(storyboard)}")
155
- log_history += f"\n--- PINTANDO KEYFRAME {i+1}/{len(storyboard)} ---\n"
156
- dreamo_prompt = get_dreamo_prompt_for_transition(current_ref_image_path, scene_description)
157
- reference_items = []
158
- fixed_references_basenames = [os.path.basename(item['image']) for item in ref_images_tasks if item['image']]
159
- for item in ref_images_tasks:
160
- if item['image']:
161
- reference_items.append({'image_np': np.array(Image.open(item['image']).convert("RGB")), 'task': item['task']})
162
- dynamic_references_paths = keyframe_paths[-3:]
163
- for ref_path in dynamic_references_paths:
164
- if os.path.basename(ref_path) not in fixed_references_basenames:
165
- reference_items.append({'image_np': np.array(Image.open(ref_path).convert("RGB")), 'task': 'ip'})
166
- log_history += f" - Roteiro: '{scene_description}'\n - Usando {len(reference_items)} referΓͺncias visuais.\n - Prompt do D.A.: \"{dreamo_prompt}\"\n"
167
- yield {keyframe_log_output: gr.update(value=log_history), keyframe_gallery_output: gr.update(value=generated_images_for_gallery)}
  output_path = os.path.join(WORKSPACE_DIR, f"keyframe_{i+1}.png")
169
- image = dreamo_generator_singleton.generate_image_with_gpu_management(reference_items=reference_items, prompt=dreamo_prompt, width=width, height=height)
 
 
 
 
 
 
170
  image.save(output_path)
171
- keyframe_paths.append(output_path); generated_images_for_gallery.append(output_path); current_ref_image_path = output_path
172
- yield {keyframe_log_output: gr.update(value=log_history), keyframe_gallery_output: gr.update(value=generated_images_for_gallery)}
173
- except Exception as e: raise gr.Error(f"O Pintor (DreamO) ou Diretor de Arte (Gemini) falhou: {e}")
174
- finally:
175
- if ENABLE_MODELS:
176
- dreamo_generator_singleton.to_cpu()
177
- gc.collect()
178
- torch.cuda.empty_cache()
179
- log_history += "\nPintura de todos os keyframes concluΓ­da.\n"
180
- yield {keyframe_log_output: gr.update(value=log_history), keyframe_gallery_output: gr.update(value=generated_images_for_gallery), keyframe_images_state: keyframe_paths}
181
 
182
  def get_initial_motion_prompt(user_prompt: str, start_image_path: str, destination_image_path: str, dest_scene_desc: str):
183
  if not GEMINI_API_KEY: raise gr.Error("Chave da API Gemini nΓ£o configurada!")
184
  try:
185
- genai.configure(api_key=GEMINI_API_KEY); model = genai.GenerativeModel('gemini-1.5-flash'); prompt_file = "prompts/initial_motion_prompt.txt"
186
  with open(os.path.join(os.path.dirname(__file__), prompt_file), "r", encoding="utf-8") as f: template = f.read()
187
  cinematographer_prompt = template.format(user_prompt=user_prompt, destination_scene_description=dest_scene_desc)
188
  start_img, dest_img = Image.open(start_image_path), Image.open(destination_image_path)
@@ -191,363 +440,300 @@ def get_initial_motion_prompt(user_prompt: str, start_image_path: str, destinati
191
  return response.text.strip()
192
  except Exception as e: raise gr.Error(f"O Cineasta de IA (Inicial) falhou: {e}. Resposta: {getattr(e, 'text', 'No text available.')}")
193
 
194
- def get_dynamic_motion_prompt(user_prompt, story_history, memory_media_path, path_image_path, destination_image_path, path_scene_desc, dest_scene_desc):
195
  if not GEMINI_API_KEY: raise gr.Error("Chave da API Gemini nΓ£o configurada!")
196
  try:
197
- genai.configure(api_key=GEMINI_API_KEY); model = genai.GenerativeModel('gemini-1.5-flash'); prompt_file = "prompts/dynamic_motion_prompt.txt"
198
  with open(os.path.join(os.path.dirname(__file__), prompt_file), "r", encoding="utf-8") as f: template = f.read()
199
- cinematographer_prompt = template.format(user_prompt=user_prompt, story_history=story_history, midpoint_scene_description=path_scene_desc, destination_scene_description=dest_scene_desc)
200
- with imageio.get_reader(memory_media_path) as reader:
201
- mem_img = Image.fromarray(reader.get_data(0))
202
  path_img, dest_img = Image.open(path_image_path), Image.open(destination_image_path)
203
- model_contents = ["START Image (from Kinetic Echo):", mem_img, "MIDPOINT Image (Path):", path_img, "DESTINATION Image (Destination):", dest_img, cinematographer_prompt]
204
  response = model.generate_content(model_contents)
205
- return response.text.strip()
206
- except Exception as e: raise gr.Error(f"O Cineasta de IA (DinΓ’mico) falhou: {e}. Resposta: {getattr(e, 'text', 'No text available.')}")
 
 
207
 
208
- @spaces.GPU(duration=360)
209
  def run_video_production(
 
210
  video_duration_seconds, video_fps, eco_video_frames, use_attention_slicing,
211
- fragment_duration_frames, mid_cond_strength, num_inference_steps,
212
- prompt_geral, keyframe_images_state, scene_storyboard, cfg,
 
213
  progress=gr.Progress()
214
  ):
215
- if not ENABLE_MODELS or pipeline_instance is None:
216
- raise gr.Error("Modo Demo Ativado! Para gerar vΓ­deos, clone este Space, mude a variΓ‘vel 'ENABLE_MODELS' para True no arquivo app.py e use hardware de GPU.")
217
-
218
- video_total_frames = int(video_duration_seconds * video_fps)
219
- if not keyframe_images_state or len(keyframe_images_state) < 3: raise gr.Error("Pinte pelo menos 2 keyframes para produzir uma transiΓ§Γ£o.")
220
- if int(fragment_duration_frames) > video_total_frames:
221
- raise gr.Error(f"A 'DuraΓ§Γ£o de Cada Fragmento' ({fragment_duration_frames} frames) nΓ£o pode ser maior que a 'DuraΓ§Γ£o da GeraΓ§Γ£o Bruta' ({video_total_frames} frames).")
222
-
223
- log_history = "\n--- FASE 3/4: Iniciando ProduΓ§Γ£o (Eco + DΓ©jΓ  Vu)...\n"
224
- yield {
225
- production_log_output: log_history, video_gallery_glitch: [],
226
- prod_media_start_output: gr.update(value=None),
227
- prod_media_mid_output: gr.update(value=None, visible=False),
228
- prod_media_end_output: gr.update(value=None),
229
- }
230
-
231
- seed = int(time.time())
232
- target_device = 'cuda' if torch.cuda.is_available() else 'cpu'
233
  try:
234
- pipeline_instance.to(target_device)
235
- video_fragments, story_history = [], ""; kinetic_memory_path = None
236
- with Image.open(keyframe_images_state[1]) as img: width, height = img.size
237
 
238
- num_transitions = len(keyframe_images_state) - 2
239
  for i in range(num_transitions):
240
  fragment_num = i + 1
241
- progress(i / num_transitions, desc=f"Preparando Fragmento {fragment_num}...")
242
  log_history += f"\n--- FRAGMENTO {fragment_num}/{num_transitions} ---\n"
 
243
 
244
- if i == 0:
245
- start_path, destination_path = keyframe_images_state[1], keyframe_images_state[2]
246
- dest_scene_desc = scene_storyboard[1]
247
- log_history += f" - InΓ­cio (Big Bang): {os.path.basename(start_path)}\n - Destino: {os.path.basename(destination_path)}\n"
248
  current_motion_prompt = get_initial_motion_prompt(prompt_geral, start_path, destination_path, dest_scene_desc)
249
- conditioning_items_data = [(start_path, 0, 1.0), (destination_path, int(video_total_frames), 1.0)]
250
- yield {
251
- production_log_output: gr.update(value=log_history),
252
- prod_media_start_output: gr.update(value=start_path),
253
- prod_media_mid_output: gr.update(value=None, visible=False),
254
- prod_media_end_output: gr.update(value=destination_path),
255
- }
256
  else:
257
- memory_path, path_path, destination_path = kinetic_memory_path, keyframe_images_state[i+1], keyframe_images_state[i+2]
258
- path_scene_desc, dest_scene_desc = scene_storyboard[i], scene_storyboard[i+1]
259
- log_history += f" - MemΓ³ria CinΓ©tica (VΓ­deo): {os.path.basename(memory_path)}\n - Caminho: {os.path.basename(path_path)}\n - Destino: {os.path.basename(destination_path)}\n"
260
- mid_cond_frame_calculated = int(video_total_frames - fragment_duration_frames + eco_video_frames)
261
- log_history += f" - Frame de Condicionamento do 'Caminho' calculado: {mid_cond_frame_calculated}\n"
262
- current_motion_prompt = get_dynamic_motion_prompt(prompt_geral, story_history, memory_path, path_path, destination_path, path_scene_desc, dest_scene_desc)
263
- conditioning_items_data = [(memory_path, 0, 1.0), (path_path, mid_cond_frame_calculated, mid_cond_strength), (destination_path, int(video_total_frames), 1.0)]
264
- yield {
265
- production_log_output: gr.update(value=log_history),
266
- prod_media_start_output: gr.update(value=memory_path),
267
- prod_media_mid_output: gr.update(value=path_path, visible=True),
268
- prod_media_end_output: gr.update(value=destination_path),
269
- }
270
 
271
  story_history += f"\n- Ato {fragment_num + 1}: {current_motion_prompt}"
272
  log_history += f" - InstruΓ§Γ£o do Cineasta: '{current_motion_prompt}'\n"; yield {production_log_output: log_history}
273
 
274
- progress(i / num_transitions, desc=f"Filmando Fragmento {fragment_num}...")
275
- full_fragment_path, actual_frames_generated = run_ltx_animation(
276
- current_fragment_index=fragment_num, motion_prompt=current_motion_prompt,
277
- conditioning_items_data=conditioning_items_data, width=width, height=height,
278
- seed=seed, cfg=cfg, progress=progress,
279
- video_total_frames=video_total_frames, video_fps=video_fps,
280
- use_attention_slicing=use_attention_slicing, num_inference_steps=num_inference_steps
 
281
  )
282
- log_history += f" - LOG: Gerei o fragmento_{fragment_num} bruto com {actual_frames_generated} frames.\n"
283
- yield {production_log_output: log_history}
284
- trimmed_fragment_path = os.path.join(WORKSPACE_DIR, f"fragment_{fragment_num}_trimmed.mp4")
285
- trim_video_to_frames(full_fragment_path, trimmed_fragment_path, int(fragment_duration_frames))
286
- log_history += f" - LOG: Reduzi o fragmento_{fragment_num} para {int(fragment_duration_frames)} frames.\n"
287
- yield {production_log_output: log_history}
288
  is_last_fragment = (i == num_transitions - 1)
289
- if not is_last_fragment:
290
  eco_output_path = os.path.join(WORKSPACE_DIR, f"eco_from_frag_{fragment_num}.mp4")
291
  kinetic_memory_path = extract_last_n_frames_as_video(trimmed_fragment_path, eco_output_path, int(eco_video_frames))
292
- log_history += f" - LOG: Gerei o eco com {int(eco_video_frames)} frames a partir do final do fragmento reduzido.\n"
293
- log_history += f" - Novo Eco CinΓ©tico (VΓ­deo) criado: {os.path.basename(kinetic_memory_path)}\n"
294
- else:
295
- log_history += f" - Este Γ© o ΓΊltimo fragmento, nΓ£o Γ© necessΓ‘rio gerar um eco.\n"
296
- video_fragments.append(trimmed_fragment_path)
297
- yield {production_log_output: log_history, video_gallery_glitch: video_fragments}
298
- progress(1.0, desc="ProduΓ§Γ£o ConcluΓ­da.")
299
- log_history += "\nProduΓ§Γ£o de todos os fragmentos concluΓ­da.\n"
300
- yield {production_log_output: log_history, video_gallery_glitch: video_fragments, fragment_list_state: video_fragments}
301
- finally:
302
- if ENABLE_MODELS:
303
- pipeline_instance.to('cpu')
304
- gc.collect()
305
- torch.cuda.empty_cache()
306
-
307
- def process_image_to_square(image_path: str, size: int = TARGET_RESOLUTION) -> str:
308
- if not image_path: return None
309
- try:
310
- img = Image.open(image_path).convert("RGB"); img_square = ImageOps.fit(img, (size, size), Image.Resampling.LANCZOS)
311
- output_path = os.path.join(WORKSPACE_DIR, f"initial_ref_{size}x{size}.png"); img_square.save(output_path)
312
- return output_path
313
- except Exception as e: raise gr.Error(f"Falha ao processar a imagem de referΓͺncia: {e}")
314
 
315
- def load_conditioning_tensor(media_path: str, height: int, width: int) -> torch.Tensor:
316
- if not ENABLE_MODELS: return None
317
- if media_path.lower().endswith(('.mp4', '.mov', '.avi')):
318
- with imageio.get_reader(media_path) as reader:
319
- first_frame_np = reader.get_data(0)
320
- temp_img_path = os.path.join(WORKSPACE_DIR, f"temp_frame_from_{os.path.basename(media_path)}.png")
321
- Image.fromarray(first_frame_np).save(temp_img_path)
322
- return load_image_to_tensor_with_resize_and_crop(temp_img_path, height, width)
323
- else:
324
- return load_image_to_tensor_with_resize_and_crop(media_path, height, width)
325
-
326
- def run_ltx_animation(
327
- current_fragment_index, motion_prompt, conditioning_items_data,
328
- width, height, seed, cfg, progress,
329
- video_total_frames, video_fps, use_attention_slicing, num_inference_steps
330
- ):
331
- if not ENABLE_MODELS: return None, 0
332
- progress(0, desc=f"[CΓ’mera LTX] Filmando Cena {current_fragment_index}...");
333
- output_path = os.path.join(WORKSPACE_DIR, f"fragment_{current_fragment_index}_full.mp4")
334
- target_device = pipeline_instance.device
335
- try:
336
- if use_attention_slicing: pipeline_instance.enable_attention_slicing()
337
- conditioning_items = [ConditioningItem(load_conditioning_tensor(p, height, width).to(target_device), s, t) for p, s, t in conditioning_items_data]
338
- actual_num_frames = int(round((float(video_total_frames) - 1.0) / 8.0) * 8 + 1)
339
- padded_h, padded_w = ((height - 1) // 32 + 1) * 32, ((width - 1) // 32 + 1) * 32
340
- padding_vals = calculate_padding(height, width, padded_h, padded_w)
341
- for item in conditioning_items: item.media_item = torch.nn.functional.pad(item.media_item, padding_vals)
342
- first_pass_config = PIPELINE_CONFIG_YAML.get("first_pass", {}).copy()
343
- first_pass_config['num_inference_steps'] = int(num_inference_steps)
344
- kwargs = {"prompt": motion_prompt, "negative_prompt": "blurry, distorted, bad quality, artifacts", "height": padded_h, "width": padded_w, "num_frames": actual_num_frames, "frame_rate": video_fps, "generator": torch.Generator(device=target_device).manual_seed(int(seed) + current_fragment_index), "output_type": "pt", "guidance_scale": float(cfg), "timesteps": first_pass_config.get("timesteps"), "conditioning_items": conditioning_items, "decode_timestep": PIPELINE_CONFIG_YAML.get("decode_timestep"), "decode_noise_scale": PIPELINE_CONFIG_YAML.get("decode_noise_scale"), "stochastic_sampling": PIPELINE_CONFIG_YAML.get("stochastic_sampling"), "image_cond_noise_scale": 0.15, "is_video": True, "vae_per_channel_normalize": True, "mixed_precision": (PIPELINE_CONFIG_YAML.get("precision") == "mixed_precision"), "enhance_prompt": False, "decode_every": 4, "num_inference_steps": int(num_inference_steps)}
345
- result_tensor = pipeline_instance(**kwargs).images
346
- pad_l, pad_r, pad_t, pad_b = map(int, padding_vals); slice_h = -pad_b if pad_b > 0 else None; slice_w = -pad_r if pad_r > 0 else None
347
- cropped_tensor = result_tensor[:, :, :actual_num_frames, pad_t:slice_h, pad_l:slice_w]
348
- video_np = (cropped_tensor[0].permute(1, 2, 3, 0).cpu().float().numpy() * 255).astype(np.uint8)
349
- with imageio.get_writer(output_path, fps=video_fps, codec='libx264', quality=8) as writer:
350
- for i, frame in enumerate(video_np): writer.append_data(frame)
351
- return output_path, actual_num_frames
352
- finally:
353
- if ENABLE_MODELS and use_attention_slicing:
354
- pipeline_instance.disable_attention_slicing()
355
 
356
- def trim_video_to_frames(input_path: str, output_path: str, frames_to_keep: int) -> str:
357
- try:
358
- subprocess.run(f"ffmpeg -y -v error -i \"{input_path}\" -vf \"select='lt(n,{frames_to_keep})'\" -an \"{output_path}\"", shell=True, check=True, text=True)
359
- return output_path
360
- except subprocess.CalledProcessError as e: raise gr.Error(f"FFmpeg falhou ao cortar vΓ­deo: {e.stderr}")
 
 
 
361
 
362
- def extract_last_n_frames_as_video(input_path: str, output_path: str, n_frames: int) -> str:
363
- try:
364
- cmd_probe = f"ffprobe -v error -select_streams v:0 -count_frames -show_entries stream=nb_read_frames -of default=nokey=1:noprint_wrappers=1 \"{input_path}\""
365
- result = subprocess.run(cmd_probe, shell=True, check=True, text=True, capture_output=True)
366
- total_frames = int(result.stdout.strip())
367
- if n_frames >= total_frames:
368
- shutil.copyfile(input_path, output_path)
369
- return output_path
370
- start_frame = total_frames - n_frames
371
- cmd_ffmpeg = f"ffmpeg -y -v error -i \"{input_path}\" -vf \"select='gte(n,{start_frame})'\" -vframes {n_frames} -an \"{output_path}\""
372
- subprocess.run(cmd_ffmpeg, shell=True, check=True, text=True)
373
- return output_path
374
- except (subprocess.CalledProcessError, ValueError) as e:
375
- raise gr.Error(f"FFmpeg falhou ao extrair os ΓΊltimos {n_frames} frames: {getattr(e, 'stderr', str(e))}")
376
 
377
- def concatenate_and_trim_masterpiece(fragment_paths: list, fragment_duration_frames: int, eco_video_frames: int, progress=gr.Progress()):
378
- if not fragment_paths: raise gr.Error("Nenhum fragmento de vΓ­deo para concatenar.")
379
- progress(0.1, desc="Preparando fragmentos para montagem final...");
380
- try:
381
- list_file_path = os.path.join(WORKSPACE_DIR, "concat_list.txt")
382
- final_output_path = os.path.join(WORKSPACE_DIR, "masterpiece_final.mp4")
383
- temp_files_for_concat = []
384
- final_clip_len = int(fragment_duration_frames - eco_video_frames)
385
- for i, p in enumerate(fragment_paths):
386
- if i == len(fragment_paths) - 1:
387
- temp_files_for_concat.append(os.path.abspath(p))
388
- progress(0.1 + (i / len(fragment_paths)) * 0.8, desc=f"Mantendo ΓΊltimo fragmento: {os.path.basename(p)}")
389
- else:
390
- temp_path = os.path.join(WORKSPACE_DIR, f"temp_concat_{i}.mp4")
391
- progress(0.1 + (i / len(fragment_paths)) * 0.8, desc=f"Cortando {os.path.basename(p)} para {final_clip_len} frames")
392
- trim_video_to_frames(p, temp_path, final_clip_len)
393
- temp_files_for_concat.append(temp_path)
394
- progress(0.9, desc="Concatenando clipes...")
395
- with open(list_file_path, "w") as f:
396
- for p_temp in temp_files_for_concat: f.write(f"file '{p_temp}'\n")
397
- subprocess.run(f"ffmpeg -y -v error -f concat -safe 0 -i \"{list_file_path}\" -c copy \"{final_output_path}\"", shell=True, check=True, text=True)
398
- progress(1.0, desc="Montagem concluΓ­da!")
399
- return final_output_path
400
- except subprocess.CalledProcessError as e:
401
- raise gr.Error(f"FFmpeg falhou na concatenaΓ§Γ£o final: {e.stderr}")
402
-
403
- # --- Ato 5: A Interface com o Mundo (UI) ---
404
  with gr.Blocks(theme=gr.themes.Soft()) as demo:
405
- gr.Markdown("# NOVIM-6.2 (Painel de Controle do Diretor)\n*By Carlex & Gemini & DreamO - VersΓ£o de DemonstraΓ§Γ£o*")
406
-
407
- if not ENABLE_MODELS:
408
- gr.Warning(
409
- """
410
- **MODO DE DEMONSTRAÇÃO ATIVADO**
411
- VocΓͺ pode explorar a interface e usar a "Etapa 1: Gerar Roteiro" se tiver uma chave da API Gemini configurada.
412
- Para habilitar a geraΓ§Γ£o de imagens e vΓ­deos (Etapas 2 e 3), vocΓͺ precisa:
413
- 1. **Fork este Space:** Clique no menu de trΓͺs pontos ao lado do tΓ­tulo e selecione "Duplicate this Space".
414
- 2. **Escolha um Hardware de GPU:** Na tela de duplicaΓ§Γ£o, selecione um hardware de GPU (ex: T4 Small).
415
- 3. **Edite o `app.py`:** Na aba "Files" do seu novo Space, edite o arquivo `app.py`.
416
- 4. **Ative os Modelos:** Mude a linha `ENABLE_MODELS = False` para `ENABLE_MODELS = True`.
417
- 5. Salve o arquivo. O Space serΓ‘ reiniciado com a funcionalidade completa.
418
- """
419
- )
420
 
421
  if os.path.exists(WORKSPACE_DIR): shutil.rmtree(WORKSPACE_DIR)
422
  os.makedirs(WORKSPACE_DIR); Path("prompts").mkdir(exist_ok=True)
423
-
424
- scene_storyboard_state, keyframe_images_state, fragment_list_state = gr.State([]), gr.State([]), gr.State([])
425
- prompt_geral_state, processed_ref_path_state = gr.State(""), gr.State("")
426
 
427
  gr.Markdown("--- \n ## ETAPA 1: O ROTEIRO (IA Roteirista)")
428
  with gr.Row():
429
  with gr.Column(scale=1):
430
  prompt_input = gr.Textbox(label="Ideia Geral (Prompt)")
431
- num_fragments_input = gr.Slider(2, 5, 4, step=1, label="NΓΊmero de Atos (Keyframes)")
432
- image_input = gr.Image(type="filepath", label=f"Imagem de ReferΓͺncia Principal (serΓ‘ {TARGET_RESOLUTION}x{TARGET_RESOLUTION})")
 
 
 
 
433
  director_button = gr.Button("▢️ 1. Gerar Roteiro", variant="primary")
434
  with gr.Column(scale=2): storyboard_to_show = gr.JSON(label="Roteiro de Cenas Gerado (em InglΓͺs)")
435
 
436
- gr.Markdown("--- \n ## ETAPA 2: OS KEYFRAMES (IA Pintor & Diretor de Arte)")
437
  with gr.Row():
438
  with gr.Column(scale=2):
439
- gr.Markdown("ForneΓ§a referΓͺncias para guiar a IA. A Principal Γ© obrigatΓ³ria. A SecundΓ‘ria Γ© opcional (ex: para estilo ou uma segunda pessoa).")
440
- with gr.Row():
441
- with gr.Column():
442
- ref1_image = gr.Image(label="ReferΓͺncia Principal (ConteΓΊdo/ID)", type="filepath")
443
- ref1_task = gr.Dropdown(choices=["ip", "id", "style"], value="ip", label="Tarefa da Ref. Principal")
444
- with gr.Column():
445
- ref2_image = gr.Image(label="ReferΓͺncia SecundΓ‘ria (Opcional)", type="filepath")
446
- ref2_task = gr.Dropdown(choices=["ip", "id", "style"], value="style", label="Tarefa da Ref. SecundΓ‘ria")
447
- photographer_button = gr.Button("▢️ 2. Pintar Imagens-Chave em Cadeia", variant="primary")
448
- with gr.Column(scale=1):
449
- keyframe_log_output = gr.Textbox(label="DiΓ‘rio de Bordo do Pintor", lines=15, interactive=False)
450
- keyframe_gallery_output = gr.Gallery(label="Imagens-Chave Pintadas", object_fit="contain", height="auto", type="filepath")
451
 
452
  gr.Markdown("--- \n ## ETAPA 3: A PRODUÇÃO (IA Cineasta & CΓ’mera)")
453
  with gr.Row():
454
  with gr.Column(scale=1):
455
- cfg_slider = gr.Slider(1.0, 10.0, 2.5, step=0.1, label="CFG")
456
  with gr.Accordion("Controles AvanΓ§ados de Timing e Performance", open=False):
457
- video_duration_slider = gr.Slider(label="DuraΓ§Γ£o da GeraΓ§Γ£o Bruta (segundos)", minimum=2.0, maximum=10.0, value=6.0, step=0.5)
458
- video_fps_slider = gr.Slider(label="FPS do VΓ­deo", minimum=12, maximum=30, value=30, step=1)
459
- num_inference_steps_slider = gr.Slider(label="Etapas de InferΓͺncia", minimum=10, maximum=50, value=30, step=1)
460
  slicing_checkbox = gr.Checkbox(label="Usar Attention Slicing (Economiza VRAM)", value=True)
461
  gr.Markdown("---"); gr.Markdown("#### Controles de DuraΓ§Γ£o (Arquitetura Eco + DΓ©jΓ  Vu)")
462
- fragment_duration_slider = gr.Slider(label="DuraΓ§Γ£o de Cada Fragmento (Frames)", minimum=30, maximum=300, value=90, step=1)
463
  eco_frames_slider = gr.Slider(label="Tamanho do Eco CinΓ©tico (Frames)", minimum=4, maximum=48, value=8, step=1)
464
  mid_cond_strength_slider = gr.Slider(label="ForΓ§a do 'Caminho'", minimum=0.1, maximum=1.0, value=0.5, step=0.05)
465
- gr.Markdown(
466
- """
467
- **InstruΓ§Γ΅es (Nova Arquitetura):**
468
- - **DuraΓ§Γ£o da GeraΓ§Γ£o Bruta:** Tempo total que a IA tem para criar a transiΓ§Γ£o. Deve ser MAIOR que a DuraΓ§Γ£o do Fragmento.
469
- - **DuraΓ§Γ£o de Cada Fragmento:** O comprimento final de cada clipe de vΓ­deo que serΓ‘ gerado.
470
- - **Tamanho do Eco CinΓ©tico:** Quantos frames do *final* de um fragmento serΓ£o passados para o prΓ³ximo para garantir continuidade.
471
- - **ForΓ§a do Caminho:** Define o quΓ£o forte a imagem-chave intermediΓ‘ria ('Caminho') influencia a transiΓ§Γ£o.
472
- """
473
- )
474
- animator_button = gr.Button("▢️ 3. Produzir Cenas (Handoff CinΓ©tico)", variant="primary")
475
  with gr.Accordion("VisualizaΓ§Γ£o das MΓ­dias de Condicionamento (Ao Vivo)", open=True):
476
  with gr.Row():
477
  prod_media_start_output = gr.Video(label="MΓ­dia Inicial (Eco/K1)", interactive=False)
478
  prod_media_mid_output = gr.Image(label="MΓ­dia do Caminho (K_i-1)", interactive=False, visible=False)
479
  prod_media_end_output = gr.Image(label="MΓ­dia de Destino (K_i)", interactive=False)
480
  production_log_output = gr.Textbox(label="DiΓ‘rio de Bordo da ProduΓ§Γ£o", lines=10, interactive=False)
481
- with gr.Column(scale=1): video_gallery_glitch = gr.Gallery(label="Fragmentos Gerados (VersΓ΅es Cortadas)", object_fit="contain", height="auto", type="video")
482
-
483
- fragment_duration_state = gr.State()
484
- eco_frames_state = gr.State()
485
-
486
- gr.Markdown(f"--- \n ## ETAPA 4: PΓ“S-PRODUÇÃO (Editor)")
487
- editor_button = gr.Button("▢️ 4. Montar VΓ­deo Final", variant="primary")
488
- final_video_output = gr.Video(label="A Obra-Prima Final", width=TARGET_RESOLUTION)
489
 
490
  gr.Markdown(
491
  """
492
  ---
493
- ### A Arquitetura: Eco + DΓ©jΓ  Vu
494
- A geraΓ§Γ£o comeΓ§a com um "Big Bang" entre os dois primeiros keyframes. A partir daΓ­, a mΓ‘gica acontece.
495
- * **O Eco (A MemΓ³ria FΓ­sica):** No final de cada cena, os ΓΊltimos frames sΓ£o capturados e salvos como um pequeno vΓ­deo, o `Eco`. Ele carrega a "energia cinΓ©tica" do movimento, iluminaΓ§Γ£o e atmosfera da cena que acabou.
496
- * **O DΓ©jΓ  Vu (A MemΓ³ria Conceitual):** Para criar a prΓ³xima cena, o Cineasta de IA (Gemini) assiste ao `Eco`, olha para o keyframe do "caminho" e o keyframe do "destino". Com essa visΓ£o tripla, ele tem um "dΓ©jΓ  vu", uma memΓ³ria do que acabou de acontecer que o inspira a escrever uma instruΓ§Γ£o de cΓ’mera precisa para conectar o passado ao futuro de forma fluida e coerente.
 
 
497
  """
498
  )
499
-
500
- # --- Ato 6: A RegΓͺncia (LΓ³gica de ConexΓ£o dos BotΓ΅es) ---
501
- def process_and_update_storyboard(num_fragments, prompt, image_path):
502
- processed_path = process_image_to_square(image_path)
503
- if not processed_path: raise gr.Error("A imagem de referΓͺncia Γ© invΓ‘lida ou nΓ£o foi fornecida.")
504
- storyboard = run_storyboard_generation(num_fragments, prompt, processed_path)
505
- return storyboard, prompt, processed_path
 
 
 
 
 
 
 
506
 
507
  director_button.click(
508
- fn=process_and_update_storyboard,
509
- inputs=[num_fragments_input, prompt_input, image_input],
510
- outputs=[scene_storyboard_state, prompt_geral_state, processed_ref_path_state]
511
- ).success(
512
- fn=lambda s, p: (s, p),
513
- inputs=[scene_storyboard_state, processed_ref_path_state],
514
- outputs=[storyboard_to_show, ref1_image]
515
- )
516
-
517
- @photographer_button.click(
518
- inputs=[scene_storyboard_state, ref1_image, ref1_task, ref2_image, ref2_task],
519
  outputs=[keyframe_log_output, keyframe_gallery_output, keyframe_images_state]
520
  )
521
- def run_keyframe_generation_wrapper(storyboard, ref1_img, ref1_tsk, ref2_img, ref2_tsk, progress=gr.Progress()):
522
- ref_data = [
523
- {'image': ref1_img, 'task': ref1_tsk},
524
- {'image': ref2_img, 'task': ref2_tsk}
525
- ]
526
- yield from run_keyframe_generation(storyboard, ref_data, progress)
527
 
528
  animator_button.click(
529
- fn=lambda frag_dur, eco_dur: (frag_dur, eco_dur),
530
- inputs=[fragment_duration_slider, eco_frames_slider],
531
- outputs=[fragment_duration_state, eco_frames_state]
532
- ).then(
533
- fn=run_video_production,
534
  inputs=[
535
- video_duration_slider, video_fps_slider, eco_frames_slider, slicing_checkbox,
536
- fragment_duration_slider, mid_cond_strength_slider,
537
- num_inference_steps_slider,
 
538
  prompt_geral_state, keyframe_images_state, scene_storyboard_state, cfg_slider
539
  ],
540
  outputs=[
541
- production_log_output, video_gallery_glitch, fragment_list_state,
542
- prod_media_start_output, prod_media_mid_output, prod_media_end_output
 
543
  ]
544
  )
545
 
546
  editor_button.click(
547
- fn=concatenate_and_trim_masterpiece,
548
- inputs=[fragment_list_state, fragment_duration_state, eco_frames_state],
549
  outputs=[final_video_output]
550
  )
551
 
552
  if __name__ == "__main__":
 
 
 
553
  demo.queue().launch(server_name="0.0.0.0", share=True)
 
4
  # Contato:
5
  # Carlos Rodrigues dos Santos
6
 
7
  #
8
  # RepositΓ³rios e Projetos Relacionados:
9
  # GitHub: https://github.com/carlex22/Aduc-sdr
10
+ # YouTube (Resultados): https://m.youtube.com/channel/UC3EgoJi_Fv7yuDpvfYNtoIQ
11
+ # Hugging Face: https://huggingface.co/spaces/Carlexx/ADUC-Sdr_Gemini_Drem0_Ltx_Video60seconds/
12
  #
13
  # Este programa Γ© software livre: vocΓͺ pode redistribuΓ­-lo e/ou modificΓ‘-lo
14
  # sob os termos da LicenΓ§a PΓΊblica Geral Affero da GNU como publicada pela
 
23
  # VocΓͺ deve ter recebido uma cΓ³pia da LicenΓ§a PΓΊblica Geral Affero da GNU
24
  # junto com este programa. Se nΓ£o, veja <https://www.gnu.org/licenses/>.
25
 
26
+ # --- app.py (ADUC-SDR-2.9: Diretor de Cena com Prompt Único e Extração) ---
27
 
 
28
  import gradio as gr
29
  import torch
30
  import os
31
+ import re
32
  import yaml
33
  from PIL import Image, ImageOps, ExifTags
34
  import shutil
 
35
  import subprocess
36
  import google.generativeai as genai
37
  import numpy as np
38
  import imageio
39
  from pathlib import Path
 
40
  import json
41
  import time
42
+ import math
43
+
44
+ os.environ["TOKENIZERS_PARALLELISM"] = "false"
45
+
46
+ from flux_kontext_helpers import flux_kontext_singleton
47
+ from ltx_manager_helpers import ltx_manager_singleton
48
 
49
  WORKSPACE_DIR = "aduc_workspace"
50
  GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY")
 
 
 
51
 
52
+ # ======================================================================================
53
+ # SEÇÃO 1: FUNÇÕES UTILITÁRIAS E DE PROCESSAMENTO DE MÍDIA
54
+ # ======================================================================================
55
 
56
  def robust_json_parser(raw_text: str) -> dict:
57
+ """
58
+ Analisa uma string de texto bruto para encontrar e decodificar o primeiro objeto JSON vΓ‘lido.
59
+ Γ‰ essencial para extrair respostas estruturadas de modelos de linguagem.
60
+
61
+ Args:
62
+ raw_text (str): A string completa retornada pela IA.
63
+
64
+ Returns:
65
+ dict: Um dicionΓ‘rio Python representando o objeto JSON.
66
+
67
+ Raises:
68
+ ValueError: Se nenhum objeto JSON vΓ‘lido for encontrado ou a decodificaΓ§Γ£o falhar.
69
+ """
70
+ clean_text = raw_text.strip()
71
  try:
72
+ start_index = clean_text.find('{'); end_index = clean_text.rfind('}')
73
  if start_index != -1 and end_index != -1 and end_index > start_index:
74
+ json_str = clean_text[start_index : end_index + 1]
75
+ return json.loads(json_str)
76
  else: raise ValueError("Nenhum objeto JSON vΓ‘lido encontrado na resposta da IA.")
77
  except json.JSONDecodeError as e: raise ValueError(f"Falha ao decodificar JSON: {e}")
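+ # Usage sketch (illustrative input): the parser tolerates prose around the JSON object.
+ # robust_json_parser('Sure! Here is the plan: {"scene_storyboard": ["act 1"]}')
+ # -> {"scene_storyboard": ["act 1"]}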
78
 
79
+ def process_image_to_square(image_path: str, size: int, output_filename: str = None) -> str:
80
+ """
81
+ Processa uma imagem para um formato quadrado, redimensionando e cortando centralmente.
82
+
83
+ Args:
84
+ image_path (str): Caminho para a imagem de entrada.
85
+ size (int): A dimensΓ£o (altura e largura) da imagem de saΓ­da.
86
+ output_filename (str, optional): Nome do arquivo de saΓ­da.
87
+
88
+ Returns:
89
+ str: O caminho para a imagem processada.
90
+ """
91
+ if not image_path: return None
92
+ try:
93
+ img = Image.open(image_path).convert("RGB")
94
+ img_square = ImageOps.fit(img, (size, size), Image.Resampling.LANCZOS)
95
+ if output_filename: output_path = os.path.join(WORKSPACE_DIR, output_filename)
96
+ else: output_path = os.path.join(WORKSPACE_DIR, f"edited_ref_{time.time()}.png")
97
+ img_square.save(output_path)
98
+ return output_path
99
+ except Exception as e: raise gr.Error(f"Falha ao processar a imagem de referΓͺncia: {e}")
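+ # e.g. process_image_to_square("refs/hero.jpg", 512) returns the path of a 512x512,
+ # centre-cropped copy saved under WORKSPACE_DIR (the input file name is illustrative).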
100
+
101
+ def trim_video_to_frames(input_path: str, output_path: str, frames_to_keep: int) -> str:
102
+ """
103
+ Usa o FFmpeg para cortar um vΓ­deo, mantendo um nΓΊmero especΓ­fico de frames do inΓ­cio.
104
+
105
+ Args:
106
+ input_path (str): Caminho para o vΓ­deo de entrada.
107
+ output_path (str): Caminho para salvar o vΓ­deo cortado.
108
+ frames_to_keep (int): NΓΊmero de frames a serem mantidos.
109
+
110
+ Returns:
111
+ str: O caminho para o vΓ­deo cortado.
112
+ """
113
+ try:
114
+ subprocess.run(f"ffmpeg -y -v error -i \"{input_path}\" -vf \"select='lt(n,{frames_to_keep})'\" -an \"{output_path}\"", shell=True, check=True, text=True)
115
+ return output_path
116
+ except subprocess.CalledProcessError as e: raise gr.Error(f"FFmpeg falhou ao cortar vΓ­deo: {e.stderr}")
117
+
118
+ def extract_last_n_frames_as_video(input_path: str, output_path: str, n_frames: int) -> str:
119
+ """
120
+ Usa o FFmpeg para extrair os ΓΊltimos N frames de um vΓ­deo para criar o "Eco CinΓ©tico".
121
+
122
+ Args:
123
+ input_path (str): Caminho para o vΓ­deo de entrada.
124
+ output_path (str): Caminho para salvar o vΓ­deo de saΓ­da (o eco).
125
+ n_frames (int): NΓΊmero de frames a serem extraΓ­dos do final.
126
+
127
+ Returns:
128
+ str: O caminho para o vΓ­deo de eco gerado.
129
+ """
130
+ try:
131
+ cmd_probe = f"ffprobe -v error -select_streams v:0 -count_frames -show_entries stream=nb_read_frames -of default=nokey=1:noprint_wrappers=1 \"{input_path}\""
132
+ result = subprocess.run(cmd_probe, shell=True, check=True, text=True, capture_output=True)
133
+ total_frames = int(result.stdout.strip())
134
+ if n_frames >= total_frames: shutil.copyfile(input_path, output_path); return output_path
135
+ start_frame = total_frames - n_frames
136
+ cmd_ffmpeg = f"ffmpeg -y -v error -i \"{input_path}\" -vf \"select='gte(n,{start_frame})'\" -vframes {n_frames} -an \"{output_path}\""
137
+ subprocess.run(cmd_ffmpeg, shell=True, check=True, text=True)
138
+ return output_path
139
+ except (subprocess.CalledProcessError, ValueError) as e: raise gr.Error(f"FFmpeg falhou ao extrair os ΓΊltimos {n_frames} frames: {getattr(e, 'stderr', str(e))}")
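+ # Worked example: for a 108-frame clip and n_frames=8, start_frame = 100, so frames
+ # 100..107 are copied into the "eco" clip that seeds the next fragment.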
140
+
141
+ def concatenate_final_video(fragment_paths: list, fragment_duration_frames: int, eco_video_frames: int, progress=gr.Progress()):
142
+ """
143
+ Concatena os fragmentos de vΓ­deo gerados em uma ΓΊnica "Obra-Prima" final.
144
+ Fragmentos marcados como 'cut' (identificados pelo nome do arquivo)
145
+ nΓ£o terΓ£o sua duraΓ§Γ£o cortada para preservar a intenΓ§Γ£o do corte.
146
+
147
+ Args:
148
+ fragment_paths (list): Lista de caminhos para os fragmentos de vΓ­deo.
149
+ Cada caminho pode conter '_cut.mp4' no nome se for um corte.
150
+ fragment_duration_frames (int): A duraΓ§Γ£o esperada de cada clipe (usado apenas para
151
+ fragmentos que NÃO são cortes).
152
+ eco_video_frames (int): O tamanho da sobreposiΓ§Γ£o que deve ser cortada para fragmentos
153
+ que NÃO são cortes (usado para o 'eco').
154
+ progress (gr.Progress): Objeto do Gradio para atualizar a barra de progresso.
155
+
156
+ Returns:
157
+ str: O caminho para o vΓ­deo final montado.
158
+ """
159
+ if not fragment_paths:
160
+ raise gr.Error("Nenhum fragmento de vΓ­deo para concatenar.")
161
+
162
+ progress(0.1, desc="Preparando fragmentos para a montagem final...");
163
+
164
+ try:
165
+ list_file_path = os.path.abspath(os.path.join(WORKSPACE_DIR, f"concat_list_final_{time.time()}.txt"))
166
+ final_output_path = os.path.abspath(os.path.join(WORKSPACE_DIR, "masterpiece_final.mp4"))
167
+ temp_files_for_concat = []
168
+
169
+ # Calculamos a duração a ser mantida APENAS para fragmentos que NÃO são cortes
170
+ # Se for um corte, consideramos a duraΓ§Γ£o total do fragmento original
171
+ duration_for_non_cut_fragments = int(fragment_duration_frames - eco_video_frames)
172
+ duration_for_non_cut_fragments = max(1, duration_for_non_cut_fragments) # Garantir que seja pelo menos 1 frame
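+ # Worked example (assuming the UI defaults defined later in this file): with
+ # fragment_duration_frames=108 and eco_video_frames=8, every non-cut fragment is
+ # trimmed to 100 frames, so the 8-frame eco overlap is not duplicated in the final cut.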
173
+
174
+ for i, p in enumerate(fragment_paths):
175
+ is_last_fragment = (i == len(fragment_paths) - 1)
176
+
177
+ # Verificamos se o nome do arquivo contΓ©m "_cut.mp4" para identificar um corte
178
+ if "_cut.mp4" in os.path.basename(p) or is_last_fragment:
179
+ # Se for um corte ou o ΓΊltimo fragmento, usamos o arquivo original sem cortar o fim
180
+ temp_files_for_concat.append(os.path.abspath(p))
181
184
+ else:
185
+ # Para fragmentos que nΓ£o sΓ£o cortes e nΓ£o sΓ£o o ΓΊltimo, cortamos o fim
186
+ temp_path = os.path.join(WORKSPACE_DIR, f"final_temp_concat_{i}.mp4")
187
+ # Aqui usamos a duraΓ§Γ£o calculada para nΓ£o-cortes (fragment_duration - eco)
188
+ trim_video_to_frames(p, temp_path, duration_for_non_cut_fragments)
189
+ temp_files_for_concat.append(os.path.abspath(temp_path))
190
+
191
+ progress(0.8, desc="Concatenando clipe final...");
192
+
193
+ with open(list_file_path, "w") as f:
194
+ for p_temp in temp_files_for_concat:
195
+ f.write(f"file '{p_temp}'\n")
196
+
197
+ ffmpeg_command = f"ffmpeg -y -v error -f concat -safe 0 -i \"{list_file_path}\" -c copy \"{final_output_path}\""
198
+ subprocess.run(ffmpeg_command, shell=True, check=True, text=True)
199
+
200
+ progress(1.0, desc="Montagem final concluΓ­da!");
201
+ return final_output_path
202
+ except subprocess.CalledProcessError as e:
203
+ error_output = e.stderr if e.stderr else "Nenhuma saΓ­da de erro do FFmpeg."
204
+ raise gr.Error(f"FFmpeg falhou na concatenaΓ§Γ£o final: {error_output}")
205
+ except Exception as e:
206
+ raise gr.Error(f"Um erro ocorreu durante a concatenaΓ§Γ£o final: {e}")
207
+
208
+ def concatenate_final_video1(fragment_paths: list, fragment_duration_frames: int, eco_video_frames: int, progress=gr.Progress()):
209
+ """
210
+ Concatena os fragmentos de vΓ­deo gerados em uma ΓΊnica "Obra-Prima" final.
211
+
212
+ Args:
213
+ fragment_paths (list): Lista de caminhos para os fragmentos de vΓ­deo.
214
+ fragment_duration_frames (int): A duraΓ§Γ£o de cada clipe na montagem final.
215
+ eco_video_frames (int): O tamanho da sobreposiΓ§Γ£o que deve ser cortada.
216
+ progress (gr.Progress): Objeto do Gradio para atualizar a barra de progresso.
217
+
218
+ Returns:
219
+ str: O caminho para o vΓ­deo final montado.
220
+ """
221
+ if not fragment_paths: raise gr.Error("Nenhum fragmento de vΓ­deo para concatenar.")
222
+ progress(0.1, desc="Preparando e cortando fragmentos para a montagem final...");
223
+ try:
224
+ list_file_path = os.path.abspath(os.path.join(WORKSPACE_DIR, f"concat_list_final_{time.time()}.txt"))
225
+ final_output_path = os.path.abspath(os.path.join(WORKSPACE_DIR, "masterpiece_final.mp4"))
226
+ temp_files_for_concat = []
227
+ final_clip_len = int(fragment_duration_frames - eco_video_frames)
228
+ for i, p in enumerate(fragment_paths):
229
+ is_last_fragment = (i == len(fragment_paths) - 1)
230
+ if is_last_fragment or "_cut.mp4" in os.path.basename(p):
231
+ temp_files_for_concat.append(os.path.abspath(p))
232
+ else:
233
+ temp_path = os.path.join(WORKSPACE_DIR, f"final_temp_concat_{i}.mp4")
234
+ trim_video_to_frames(p, temp_path, final_clip_len)
235
+ temp_files_for_concat.append(os.path.abspath(temp_path))
236
+ progress(0.8, desc="Concatenando clipe final...")
237
+ with open(list_file_path, "w") as f:
238
+ for p_temp in temp_files_for_concat:
239
+ f.write(f"file '{p_temp}'\n")
240
+ ffmpeg_command = f"ffmpeg -y -v error -f concat -safe 0 -i \"{list_file_path}\" -c copy \"{final_output_path}\""
241
+ subprocess.run(ffmpeg_command, shell=True, check=True, text=True)
242
+ progress(1.0, desc="Montagem final concluΓ­da!")
243
+ return final_output_path
244
+ except subprocess.CalledProcessError as e:
245
+ error_output = e.stderr if e.stderr else "Nenhuma saΓ­da de erro do FFmpeg."
246
+ raise gr.Error(f"FFmpeg falhou na concatenaΓ§Γ£o final: {error_output}")
247
+
248
  def extract_image_exif(image_path: str) -> str:
249
+ """
250
+ Extrai metadados EXIF relevantes de uma imagem.
251
+
252
+ Args:
253
+ image_path (str): O caminho para o arquivo de imagem.
254
+
255
+ Returns:
256
+ str: Uma string formatada contendo os metadados EXIF.
257
+ """
258
  try:
259
  img = Image.open(image_path); exif_data = img._getexif()
260
  if not exif_data: return "No EXIF metadata found."
 
264
  return metadata_str if metadata_str else "No relevant EXIF metadata found."
265
  except Exception: return "Could not read EXIF data."
266
 
267
+ # ======================================================================================
268
+ # SEÇÃO 2: ORQUESTRADORES DE IA (As "Etapas" da GeraΓ§Γ£o)
269
+ # ======================================================================================
270
+
271
+ def run_storyboard_generation(num_fragments: int, prompt: str, reference_paths: list):
272
+ """
273
+ Orquestra a Etapa 1: O Roteiro.
274
+ Chama a IA (Gemini) para atuar como "Roteirista", analisando o prompt do usuΓ‘rio e
275
+ todas as imagens de referΓͺncia para criar uma narrativa coesa dividida em atos.
276
+
277
+ Args:
278
+ num_fragments (int): O nΓΊmero de keyframes (atos) a serem gerados no roteiro.
279
+ prompt (str): A ideia geral do usuΓ‘rio.
280
+ reference_paths (list): Lista de caminhos para todas as imagens de referΓͺncia fornecidas.
281
+
282
+ Returns:
283
+ list: Uma lista de strings, onde cada string Γ© a descriΓ§Γ£o de uma cena.
284
+ """
285
+ if not reference_paths: raise gr.Error("Por favor, forneΓ§a pelo menos uma imagem de referΓͺncia.")
286
+ if not GEMINI_API_KEY: raise gr.Error("Chave da API Gemini nΓ£o configurada!")
287
+ main_ref_path = reference_paths[0]
288
+ exif_metadata = extract_image_exif(main_ref_path)
289
  prompt_file = "prompts/unified_storyboard_prompt.txt"
290
  with open(os.path.join(os.path.dirname(__file__), prompt_file), "r", encoding="utf-8") as f: template = f.read()
291
  director_prompt = template.format(user_prompt=prompt, num_fragments=int(num_fragments), image_metadata=exif_metadata)
292
  genai.configure(api_key=GEMINI_API_KEY)
293
+ model = genai.GenerativeModel('gemini-2.5-flash')
294
+ model_contents = [director_prompt]
295
+ for i, img_path in enumerate(reference_paths):
296
+ model_contents.append(f"Reference Image {i+1}:")
297
+ model_contents.append(Image.open(img_path))
298
+ print(f"Gerando roteiro com {len(reference_paths)} imagens de referΓͺncia...")
299
+ response = model.generate_content(model_contents)
300
  try:
301
  storyboard_data = robust_json_parser(response.text)
302
  storyboard = storyboard_data.get("scene_storyboard", [])
 
304
  return storyboard
305
  except Exception as e: raise gr.Error(f"O Roteirista (Gemini) falhou ao criar o roteiro: {e}. Resposta recebida: {response.text}")
306
 
307
+ def run_keyframe_generation(storyboard, fixed_reference_paths, keyframe_resolution, global_prompt, progress=gr.Progress()):
308
+ """
309
+ Orquestra a Etapa 2: Os Keyframes.
310
+ A cada iteraΓ§Γ£o, chama a IA (Gemini) para atuar como "Diretor de Cena". A IA analisa
311
+ o roteiro, as referΓͺncias fixas e as ΓΊltimas 3 imagens geradas para criar um prompt
312
+ de composiΓ§Γ£o. O prompt usa tags [IMG-X] para referenciar as fontes, que sΓ£o entΓ£o
313
+ mapeadas para os arquivos reais e enviadas ao `FluxKontext` para a geraΓ§Γ£o da imagem.
314
+
315
+ Args:
316
+ storyboard (list): A lista de atos do roteiro.
317
+ fixed_reference_paths (list): Lista de caminhos para as imagens de referΓͺncia fixas.
318
+ keyframe_resolution (int): A resoluΓ§Γ£o para os keyframes a serem gerados.
319
+ global_prompt (str): A ideia geral do usuΓ‘rio para dar contexto Γ  IA.
320
+ progress (gr.Progress): Objeto do Gradio para a barra de progresso.
321
+
322
+ Yields:
323
+ dict: AtualizaΓ§Γ΅es para os componentes da UI do Gradio durante a geraΓ§Γ£o.
324
+ """
325
  if not storyboard: raise gr.Error("Nenhum roteiro para gerar keyframes.")
326
+ if not fixed_reference_paths: raise gr.Error("A imagem de referΓͺncia inicial Γ© obrigatΓ³ria.")
327
+
328
+ initial_ref_image_path = fixed_reference_paths[0]
329
+ log_history = ""; generated_images_for_gallery = []
330
+ width, height = keyframe_resolution, keyframe_resolution
331
+
332
+ keyframe_paths_for_video = []
333
+ scene_history = "N/A"
334
+
335
+ wrapper_prompt_path = os.path.join(os.path.dirname(__file__), "prompts/flux_composition_wrapper_prompt.txt")
336
+ with open(wrapper_prompt_path, "r", encoding="utf-8") as f:
337
+ kontext_template = f.read()
338
+
339
+ director_prompt_path = os.path.join(os.path.dirname(__file__), "prompts/director_composition_prompt.txt")
340
+ with open(director_prompt_path, "r", encoding="utf-8") as f:
341
+ director_template = f.read()
342
+
343
  try:
344
+ genai.configure(api_key=GEMINI_API_KEY)
345
+ model = genai.GenerativeModel('gemini-2.5-flash')
346
+
347
  for i, scene_description in enumerate(storyboard):
348
+ progress(i / len(storyboard), desc=f"Compondo Keyframe {i+1}/{len(storyboard)} ({width}x{height})")
349
+ log_history += f"\n--- COMPONDO KEYFRAME {i+1}/{len(storyboard)} ---\n"
350
+
351
+ last_three_paths = ([initial_ref_image_path] + keyframe_paths_for_video)[-3:]
352
+
353
+ log_history += f" - Diretor de Cena estΓ‘ analisando o contexto...\n"
354
+ yield {keyframe_log_output: gr.update(value=log_history), keyframe_gallery_output: gr.update(value=generated_images_for_gallery), keyframe_images_state: gr.update(value=generated_images_for_gallery)}
355
+
356
+ director_prompt = director_template.format(
357
+ global_prompt=global_prompt,
358
+ scene_history=scene_history,
359
+ current_scene_desc=scene_description,
360
+ )
361
+
362
+ model_contents = []
363
+ image_map = {}
364
+ current_image_index = 1
365
+
366
+ for path in last_three_paths:
367
+ if path not in image_map.values():
368
+ image_map[current_image_index] = path
369
+ model_contents.extend([f"IMG-{current_image_index}:", Image.open(path)])
370
+ current_image_index += 1
371
+
372
+ for path in fixed_reference_paths:
373
+ if path not in image_map.values():
374
+ image_map[current_image_index] = path
375
+ model_contents.extend([f"IMG-{current_image_index}:", Image.open(path)])
376
+ current_image_index += 1
377
+
378
+ model_contents.append(director_prompt)
379
+
380
+ response_text = model.generate_content(model_contents).text
381
+ composition_prompt_with_tags = response_text.strip()
382
+
383
+ referenced_indices = [int(idx) for idx in re.findall(r'\[IMG-(\d+)\]', composition_prompt_with_tags)]
384
+
385
+ current_reference_paths = [image_map[idx] for idx in sorted(list(set(referenced_indices))) if idx in image_map]
386
+ if not current_reference_paths:
387
+ current_reference_paths = [last_three_paths[-1]]
388
+
389
+ reference_images_pil = [Image.open(p) for p in current_reference_paths]
390
+ final_kontext_prompt = re.sub(r'\[IMG-\d+\]', '', composition_prompt_with_tags).strip()
391
+
392
+ log_history += f" - Diretor de Cena decidiu usar as imagens: {[os.path.basename(p) for p in current_reference_paths]}\n"
393
+ log_history += f" - Prompt Final do Diretor: \"{final_kontext_prompt}\"\n"
394
+ scene_history += f"Scene {i+1}: {final_kontext_prompt}\n"
395
+
396
+ yield {keyframe_log_output: gr.update(value=log_history), keyframe_gallery_output: gr.update(value=generated_images_for_gallery), keyframe_images_state: gr.update(value=generated_images_for_gallery)}
397
+
398
+ final_kontext_prompt_wrapped = kontext_template.format(target_prompt=final_kontext_prompt)
399
  output_path = os.path.join(WORKSPACE_DIR, f"keyframe_{i+1}.png")
400
+
401
+ image = flux_kontext_singleton.generate_image(
402
+ reference_images=reference_images_pil,
403
+ prompt=final_kontext_prompt_wrapped,
404
+ width=width, height=height, seed=int(time.time())
405
+ )
406
+
407
  image.save(output_path)
408
+ keyframe_paths_for_video.append(output_path)
409
+ generated_images_for_gallery.append(output_path)
410
+
411
+ except Exception as e:
412
+ raise gr.Error(f"O Compositor (FluxKontext) ou o Diretor de Cena (Gemini) falhou: {e}")
413
+
414
+ log_history += "\nComposiΓ§Γ£o de todos os keyframes concluΓ­da.\n"
415
+ final_keyframes = keyframe_paths_for_video
416
+ yield {keyframe_log_output: gr.update(value=log_history), keyframe_gallery_output: final_keyframes, keyframe_images_state: final_keyframes}
 
417
 
418
  def get_initial_motion_prompt(user_prompt: str, start_image_path: str, destination_image_path: str, dest_scene_desc: str):
419
+ """
420
+ Chama a IA (Gemini) para atuar como "Cineasta Inicial".
421
+ Gera o prompt de movimento para o primeiro fragmento de vΓ­deo, que nΓ£o possui um eco anterior.
422
+
423
+ Args:
424
+ user_prompt (str): A ideia geral da histΓ³ria.
425
+ start_image_path (str): Caminho para o primeiro keyframe.
426
+ destination_image_path (str): Caminho para o segundo keyframe.
427
+ dest_scene_desc (str): A descriΓ§Γ£o do roteiro para a cena de destino.
428
+
429
+ Returns:
430
+ str: O prompt de movimento gerado.
431
+ """
432
  if not GEMINI_API_KEY: raise gr.Error("Chave da API Gemini nΓ£o configurada!")
433
  try:
434
+ genai.configure(api_key=GEMINI_API_KEY); model = genai.GenerativeModel('gemini-2.5-flash'); prompt_file = "prompts/initial_motion_prompt.txt"
435
  with open(os.path.join(os.path.dirname(__file__), prompt_file), "r", encoding="utf-8") as f: template = f.read()
436
  cinematographer_prompt = template.format(user_prompt=user_prompt, destination_scene_description=dest_scene_desc)
437
  start_img, dest_img = Image.open(start_image_path), Image.open(destination_image_path)
 
440
  return response.text.strip()
441
  except Exception as e: raise gr.Error(f"O Cineasta de IA (Inicial) falhou: {e}. Resposta: {getattr(e, 'text', 'No text available.')}")
442
 
443
+ def get_transition_decision(user_prompt, story_history, memory_media_path, path_image_path, destination_image_path, midpoint_scene_description, dest_scene_desc):
444
+ """
445
+ Chama a IA (Gemini) para atuar como "Diretor de Continuidade".
446
+ Analisa o eco, o keyframe atual e o prΓ³ximo para decidir entre uma transiΓ§Γ£o contΓ­nua
447
+ ou um corte de cena, e gera o prompt de movimento apropriado.
448
+
449
+ Args:
450
+ (VΓ‘rios argumentos de contexto sobre a histΓ³ria e as imagens)
451
+
452
+ Returns:
453
+ dict: Um dicionΓ‘rio contendo 'transition_type' e 'motion_prompt'.
454
+ """
455
  if not GEMINI_API_KEY: raise gr.Error("Chave da API Gemini nΓ£o configurada!")
456
  try:
457
+ genai.configure(api_key=GEMINI_API_KEY); model = genai.GenerativeModel('gemini-2.5-flash'); prompt_file = "prompts/transition_decision_prompt.txt"
458
  with open(os.path.join(os.path.dirname(__file__), prompt_file), "r", encoding="utf-8") as f: template = f.read()
459
+ continuity_prompt = template.format(user_prompt=user_prompt, story_history=story_history, midpoint_scene_description=midpoint_scene_description, destination_scene_description=dest_scene_desc)
460
+ with imageio.get_reader(memory_media_path) as reader: mem_img = Image.fromarray(reader.get_data(0))
 
461
  path_img, dest_img = Image.open(path_image_path), Image.open(destination_image_path)
462
+ model_contents = ["START Image (from Kinetic Echo):", mem_img, "MIDPOINT Image (Path):", path_img, "DESTINATION Image (Destination):", dest_img, continuity_prompt]
463
  response = model.generate_content(model_contents)
464
+ decision_data = robust_json_parser(response.text)
465
+ if "transition_type" not in decision_data or "motion_prompt" not in decision_data: raise ValueError("A resposta da IA nΓ£o contΓ©m as chaves 'transition_type' ou 'motion_prompt'.")
466
+ return decision_data
467
+ except Exception as e: raise gr.Error(f"O Diretor de Continuidade (IA) falhou: {e}. Resposta: {getattr(e, 'text', str(e))}")
468
 
 
469
  def run_video_production(
470
+ video_resolution,
471
  video_duration_seconds, video_fps, eco_video_frames, use_attention_slicing,
472
+ fragment_duration_frames, mid_cond_strength, dest_cond_strength, num_inference_steps,
473
+ decode_timestep, image_cond_noise_scale,
474
+ prompt_geral, keyframe_images_state, scene_storyboard, cfg,
475
  progress=gr.Progress()
476
  ):
477
+ """
478
+ Orquestra a Etapa 3: A ProduΓ§Γ£o.
479
+ Itera sobre os keyframes e chama os cineastas de IA para gerar os fragmentos de vΓ­deo.
480
+
481
+ Args:
482
+ (VΓ‘rios parΓ’metros da UI para controlar a geraΓ§Γ£o de vΓ­deo)
483
+
484
+ Yields:
485
+ dict: AtualizaΓ§Γ΅es para os componentes da UI do Gradio.
486
+ """
 
 
 
 
 
 
 
 
487
  try:
488
+ valid_keyframes = [p for p in keyframe_images_state if p is not None and os.path.exists(p)]
489
+ width, height = video_resolution, video_resolution
490
+ video_total_frames_user = int(video_duration_seconds * video_fps)
491
+ video_total_frames_ltx = int(round((float(video_total_frames_user) - 1.0) / 8.0) * 8 + 1)
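+ # e.g. 6.0 s at 24 FPS -> 144 requested frames -> round(143/8)*8 + 1 = 145 frames,
+ # i.e. the 8*k + 1 frame count this formula enforces for the LTX pipeline.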
492
+ if not valid_keyframes or len(valid_keyframes) < 2: raise gr.Error("SΓ£o necessΓ‘rios pelo menos 2 keyframes vΓ‘lidos para produzir uma transiΓ§Γ£o.")
493
+ if int(fragment_duration_frames) > video_total_frames_user: raise gr.Error(f"DuraΓ§Γ£o do fragmento ({fragment_duration_frames}) nΓ£o pode ser maior que a DuraΓ§Γ£o Bruta ({video_total_frames_user}).")
494
+ log_history = f"\n--- FASE 3/4: Iniciando ProduΓ§Γ£o ({width}x{height})...\n"
495
+ yield {
496
+ production_log_output: log_history, video_gallery_output: [],
497
+ prod_media_start_output: None, prod_media_mid_output: gr.update(visible=False), prod_media_end_output: None
498
+ }
499
+ seed = int(time.time()); video_fragments, story_history = [], ""; kinetic_memory_path = None
500
+ num_transitions = len(valid_keyframes) - 1
501
 
 
502
  for i in range(num_transitions):
503
  fragment_num = i + 1
504
+ progress(i / num_transitions, desc=f"Gerando Fragmento {fragment_num}...")
505
  log_history += f"\n--- FRAGMENTO {fragment_num}/{num_transitions} ---\n"
506
+ destination_frame = int(video_total_frames_ltx - 1)
507
 
508
+ if i == 0 or kinetic_memory_path is None:
509
+ start_path, destination_path = valid_keyframes[i], valid_keyframes[i+1]
510
+ dest_scene_desc = scene_storyboard[i]
511
+ log_history += f" - InΓ­cio (Cena Nova): {os.path.basename(start_path)}\n - Destino: {os.path.basename(destination_path)}\n"
512
  current_motion_prompt = get_initial_motion_prompt(prompt_geral, start_path, destination_path, dest_scene_desc)
513
+ conditioning_items_data = [(start_path, 0, 1.0), (destination_path, destination_frame, dest_cond_strength)]
514
+ transition_type = "continuous"
515
+ yield { production_log_output: log_history, prod_media_start_output: start_path, prod_media_mid_output: gr.update(visible=False), prod_media_end_output: destination_path }
 
 
 
 
516
  else:
517
+ memory_path, path_path, destination_path = kinetic_memory_path, valid_keyframes[i], valid_keyframes[i+1]
518
+ path_scene_desc, dest_scene_desc = scene_storyboard[i-1], scene_storyboard[i]
519
+ log_history += f" - Diretor de Continuidade analisando...\n - MemΓ³ria: {os.path.basename(memory_path)}\n - Caminho: {os.path.basename(path_path)}\n - Destino: {os.path.basename(destination_path)}\n"
520
+ yield { production_log_output: log_history, prod_media_start_output: gr.update(value=memory_path, visible=True), prod_media_mid_output: gr.update(value=path_path, visible=True), prod_media_end_output: destination_path }
521
+ decision_data = get_transition_decision(prompt_geral, story_history, memory_path, path_path, destination_path, midpoint_scene_description=path_scene_desc, dest_scene_desc=dest_scene_desc)
522
+ transition_type = decision_data["transition_type"]
523
+ current_motion_prompt = decision_data["motion_prompt"]
524
+ log_history += f" - DecisΓ£o: {transition_type.upper()}\n"
525
+ mid_cond_frame_calculated = int(video_total_frames_ltx - fragment_duration_frames + eco_video_frames)
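+ # e.g. with 145 LTX frames, 108-frame fragments and an 8-frame eco (the defaults above),
+ # the "path" keyframe is anchored at frame 145 - 108 + 8 = 45 of the new fragment.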
526
+ conditioning_items_data = [(memory_path, 0, 1.0), (path_path, mid_cond_frame_calculated, mid_cond_strength), (destination_path, destination_frame, dest_cond_strength)]
 
 
 
527
 
528
  story_history += f"\n- Ato {fragment_num + 1}: {current_motion_prompt}"
529
  log_history += f" - InstruΓ§Γ£o do Cineasta: '{current_motion_prompt}'\n"; yield {production_log_output: log_history}
530
 
531
+ output_filename = f"fragment_{fragment_num}_{transition_type}.mp4"
532
+ full_fragment_path, _ = ltx_manager_singleton.generate_video_fragment(
533
+ motion_prompt=current_motion_prompt, conditioning_items_data=conditioning_items_data,
534
+ width=width, height=height, seed=seed, cfg=cfg, progress=progress,
535
+ video_total_frames=video_total_frames_ltx, video_fps=video_fps,
536
+ use_attention_slicing=use_attention_slicing, num_inference_steps=num_inference_steps,
537
+ decode_timestep=decode_timestep, image_cond_noise_scale=image_cond_noise_scale,
538
+ current_fragment_index=fragment_num, output_path=os.path.join(WORKSPACE_DIR, output_filename)
539
  )
540
+ log_history += f" - LOG: Gerei {output_filename}.\n"
541
+
 
 
 
 
542
  is_last_fragment = (i == num_transitions - 1)
543
+
544
+ if is_last_fragment:
545
+ log_history += " - Último fragmento. Mantendo duração total.\n"
546
+ video_fragments.append(full_fragment_path)
547
+ kinetic_memory_path = None
548
+ elif transition_type == "cut":
549
+ log_history += " - CORTE DE CENA: Fragmento mantido, memΓ³ria reiniciada.\n"
550
+ video_fragments.append(full_fragment_path)
551
+ kinetic_memory_path = None
552
+ else:
553
+ trimmed_fragment_path = os.path.join(WORKSPACE_DIR, f"fragment_{fragment_num}_trimmed.mp4")
554
+ trim_video_to_frames(full_fragment_path, trimmed_fragment_path, int(fragment_duration_frames))
555
  eco_output_path = os.path.join(WORKSPACE_DIR, f"eco_from_frag_{fragment_num}.mp4")
556
  kinetic_memory_path = extract_last_n_frames_as_video(trimmed_fragment_path, eco_output_path, int(eco_video_frames))
557
+ video_fragments.append(full_fragment_path)
558
+ log_history += f" - CONTINUIDADE: Eco criado: {os.path.basename(kinetic_memory_path)}\n"
559
 
560
+ yield {production_log_output: log_history, video_gallery_output: video_fragments}
561
 
562
+ progress(1.0, desc="ProduΓ§Γ£o dos fragmentos concluΓ­da.")
563
+ log_history += "\nProduΓ§Γ£o de todos os fragmentos concluΓ­da. Pronto para montar o vΓ­deo final.\n"
564
+ yield {
565
+ production_log_output: log_history,
566
+ video_gallery_output: video_fragments,
567
+ fragment_list_state: video_fragments
568
+ }
569
+ except Exception as e: raise gr.Error(f"A ProduΓ§Γ£o de VΓ­deo (LTX) falhou: {e}")
570
 
571
+ # ======================================================================================
572
+ # SEÇÃO 3: DEFINIÇÃO DA INTERFACE GRÁFICA (UI com Gradio)
573
+ # ======================================================================================
574
575
  with gr.Blocks(theme=gr.themes.Soft()) as demo:
576
+ gr.Markdown(f"# NOVIM-13.1 (Painel de Controle do Diretor)\n*Arquitetura ADUC-SDR com DocumentaΓ§Γ£o Completa*")
577
 
578
  if os.path.exists(WORKSPACE_DIR): shutil.rmtree(WORKSPACE_DIR)
579
  os.makedirs(WORKSPACE_DIR); Path("prompts").mkdir(exist_ok=True)
580
+
581
+ # --- DefiniΓ§Γ£o dos Estados da UI ---
582
+ scene_storyboard_state = gr.State([])
583
+ keyframe_images_state = gr.State([])
584
+ fragment_list_state = gr.State([])
585
+ prompt_geral_state = gr.State("")
586
+ processed_ref_paths_state = gr.State([])
587
+ fragment_duration_state = gr.State()
588
+ eco_frames_state = gr.State()
589
+
590
+ # --- Layout da UI ---
591
+ gr.Markdown("## CONFIGURAÇÕES GLOBAIS DE RESOLUÇÃO")
592
+ with gr.Row():
593
+ video_resolution_selector = gr.Radio([512, 720, 1024], value=512, label="ResoluΓ§Γ£o de GeraΓ§Γ£o do VΓ­deo (px)")
594
+ keyframe_resolution_selector = gr.Radio([512, 720, 1024], value=512, label="ResoluΓ§Γ£o dos Keyframes (px)")
595
 
596
  gr.Markdown("--- \n ## ETAPA 1: O ROTEIRO (IA Roteirista)")
597
  with gr.Row():
598
  with gr.Column(scale=1):
599
  prompt_input = gr.Textbox(label="Ideia Geral (Prompt)")
600
+ num_fragments_input = gr.Slider(2, 50, 4, step=1, label="NΒΊ de Keyframes a Gerar")
601
+ reference_gallery = gr.Gallery(
602
+ label="Imagens de ReferΓͺncia (A primeira Γ© a principal)",
603
+ type="filepath",
604
+ columns=4, rows=1, object_fit="contain", height="auto"
605
+ )
606
  director_button = gr.Button("▢️ 1. Gerar Roteiro", variant="primary")
607
  with gr.Column(scale=2): storyboard_to_show = gr.JSON(label="Roteiro de Cenas Gerado (em InglΓͺs)")
608
 
609
+ gr.Markdown("--- \n ## ETAPA 2: OS KEYFRAMES (IA Compositor & Diretor de Cena)")
610
  with gr.Row():
611
  with gr.Column(scale=2):
612
+ gr.Markdown("O Diretor de Cena IA irΓ‘ analisar as referΓͺncias e o roteiro para compor cada keyframe de forma autΓ΄noma.")
613
+ photographer_button = gr.Button("▢️ 2. Compor Imagens-Chave em Cadeia", variant="primary")
614
+ keyframe_gallery_output = gr.Gallery(label="Galeria de Keyframes Gerados", object_fit="contain", height="auto", type="filepath", interactive=False)
615
+ with gr.Column(scale=1):
616
+ keyframe_log_output = gr.Textbox(label="DiΓ‘rio de Bordo do Compositor", lines=25, interactive=False)
 
 
 
 
 
 
 
617
 
618
  gr.Markdown("--- \n ## ETAPA 3: A PRODUÇÃO (IA Cineasta & CΓ’mera)")
619
  with gr.Row():
620
  with gr.Column(scale=1):
621
+ cfg_slider = gr.Slider(0.5, 10.0, 1.0, step=0.1, label="CFG (Guidance Scale)")
622
  with gr.Accordion("Controles AvanΓ§ados de Timing e Performance", open=False):
623
+ video_duration_slider = gr.Slider(label="DuraΓ§Γ£o da GeraΓ§Γ£o Bruta (s)", minimum=2.0, maximum=10.0, value=6.0, step=0.5)
624
+ video_fps_radio = gr.Radio(choices=[8, 16, 24, 32], value=24, label="FPS do VΓ­deo")
625
+ num_inference_steps_slider = gr.Slider(label="Etapas de InferΓͺncia", minimum=4, maximum=20, value=10, step=1)
626
  slicing_checkbox = gr.Checkbox(label="Usar Attention Slicing (Economiza VRAM)", value=True)
627
  gr.Markdown("---"); gr.Markdown("#### Controles de DuraΓ§Γ£o (Arquitetura Eco + DΓ©jΓ  Vu)")
628
+ fragment_duration_slider = gr.Slider(label="DuraΓ§Γ£o de Cada Fragmento (% da GeraΓ§Γ£o Bruta)", minimum=1, maximum=100, value=75, step=1)
629
  eco_frames_slider = gr.Slider(label="Tamanho do Eco CinΓ©tico (Frames)", minimum=4, maximum=48, value=8, step=1)
630
  mid_cond_strength_slider = gr.Slider(label="ForΓ§a do 'Caminho'", minimum=0.1, maximum=1.0, value=0.5, step=0.05)
631
+ dest_cond_strength_slider = gr.Slider(label="ForΓ§a do 'Destino'", minimum=0.1, maximum=1.0, value=1.0, step=0.05)
632
+ gr.Markdown("---"); gr.Markdown("#### Controles do VAE (AvanΓ§ado)")
633
+ decode_timestep_slider = gr.Slider(label="VAE Decode Timestep", minimum=0.0, maximum=0.2, value=0.05, step=0.005)
634
+ image_cond_noise_scale_slider = gr.Slider(label="VAE Image Cond Noise Scale", minimum=0.0, maximum=0.1, value=0.025, step=0.005)
635
+
636
+ animator_button = gr.Button("▢️ 3. Produzir Cenas", variant="primary")
 
 
 
 
637
  with gr.Accordion("VisualizaΓ§Γ£o das MΓ­dias de Condicionamento (Ao Vivo)", open=True):
638
  with gr.Row():
639
  prod_media_start_output = gr.Video(label="MΓ­dia Inicial (Eco/K1)", interactive=False)
640
  prod_media_mid_output = gr.Image(label="MΓ­dia do Caminho (K_i-1)", interactive=False, visible=False)
641
  prod_media_end_output = gr.Image(label="MΓ­dia de Destino (K_i)", interactive=False)
642
  production_log_output = gr.Textbox(label="DiΓ‘rio de Bordo da ProduΓ§Γ£o", lines=10, interactive=False)
643
+ with gr.Column(scale=1): video_gallery_output = gr.Gallery(label="Fragmentos Gerados", object_fit="contain", height="auto", type="video")
644
+
645
+ gr.Markdown(f"--- \n ## ETAPA 4: PΓ“S-PRODUÇÃO (Montagem Final)")
646
+ with gr.Row():
647
+ with gr.Column():
648
+ editor_button = gr.Button("▢️ 4. Montar VΓ­deo Final", variant="primary")
649
+ final_video_output = gr.Video(label="A Obra-Prima Final")
 
650
 
651
  gr.Markdown(
652
  """
653
  ---
654
+ ### A Arquitetura: ADUC-SDR
655
+ **ADUC (Arquitetura de UnificaΓ§Γ£o Compositiva):** O sistema nΓ£o usa um ΓΊnico modelo, mas uma equipe de IAs especializadas. Um **Roteirista** cria a histΓ³ria. Um **Diretor de Cena** decide a composiΓ§Γ£o de cada keyframe, selecionando elementos de um "Γ‘lbum" de referΓͺncias visuais. Um **Compositor** (`FluxKontext`) cria as imagens.
656
+
657
+ **SDR (Escala DinΓ’mica e Resiliente):** A geraΓ§Γ£o de vΓ­deo Γ© dividida em fragmentos, permitindo criar vΓ­deos de longa duraΓ§Γ£o. A continuidade Γ© garantida pela arquitetura **Eco + DΓ©jΓ  Vu**:
658
+ - **O Eco:** Os ΓΊltimos frames de um clipe sΓ£o passados para o prΓ³ximo, transferindo o *momentum* fΓ­sico e a iluminaΓ§Γ£o.
659
+ - **O DΓ©jΓ  Vu:** Uma IA **Cineasta** analisa o Eco e os keyframes futuros para criar uma instruΓ§Γ£o de movimento que seja ao mesmo tempo contΓ­nua e narrativamente coerente, sabendo atΓ© quando realizar um corte de cena.
660
  """
661
  )
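+ # A minimal, self-contained sketch (a hypothetical helper, not wired into the UI) of the
+ # Eco + Deja Vu frame arithmetic described above; the default values mirror the sliders below.
+ def _aduc_frame_plan_sketch(duration_s=6.0, fps=24, fragment_pct=75, eco_frames=8):
+     raw = int(duration_s * fps)                                    # frames requested by the user (144)
+     ltx = int(round((raw - 1) / 8.0) * 8 + 1)                      # LTX-friendly 8*k + 1 count (145)
+     frag = max(1, int(math.floor((fragment_pct / 100.0) * raw)))   # frames kept per fragment (108)
+     mid_anchor = ltx - frag + eco_frames                           # where the "path" keyframe lands (45)
+     kept_per_clip = frag - eco_frames                              # frames kept for non-cut clips (100)
+     return {"raw": raw, "ltx": ltx, "fragment": frag,
+             "mid_anchor": mid_anchor, "kept_per_clip": kept_per_clip}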
662
+ # --- LΓ³gica de ConexΓ£o dos Componentes ---
663
+ def process_and_run_storyboard(num_fragments, prompt, gallery_files, keyframe_resolution):
664
+ if not gallery_files:
665
+ raise gr.Error("Por favor, suba pelo menos uma imagem de referΓͺncia na galeria.")
666
+
667
+ raw_paths = [item[0] for item in gallery_files]
668
+ processed_paths = []
669
+ for i, path in enumerate(raw_paths):
670
+ filename = f"processed_ref_{i}_{keyframe_resolution}x{keyframe_resolution}.png"
671
+ processed_path = process_image_to_square(path, keyframe_resolution, filename)
672
+ processed_paths.append(processed_path)
673
+
674
+ storyboard = run_storyboard_generation(num_fragments, prompt, processed_paths)
675
+ return storyboard, prompt, processed_paths
676
 
677
  director_button.click(
678
+ fn=process_and_run_storyboard,
679
+ inputs=[num_fragments_input, prompt_input, reference_gallery, keyframe_resolution_selector],
680
+ outputs=[scene_storyboard_state, prompt_geral_state, processed_ref_paths_state]
681
+ ).success(fn=lambda s: s, inputs=[scene_storyboard_state], outputs=[storyboard_to_show])
682
+
683
+ photographer_button.click(
684
+ fn=run_keyframe_generation,
685
+ inputs=[scene_storyboard_state, processed_ref_paths_state, keyframe_resolution_selector, prompt_geral_state],
 
 
 
686
  outputs=[keyframe_log_output, keyframe_gallery_output, keyframe_images_state]
687
  )
688
+
689
+ def updated_animator_click(
690
+ video_resolution,
691
+ video_duration_seconds, video_fps, eco_video_frames, use_attention_slicing,
692
+ fragment_duration_percentage, mid_cond_strength, dest_cond_strength, num_inference_steps,
693
+ decode_timestep, image_cond_noise_scale,
694
+ prompt_geral, keyframe_images_state, scene_storyboard, cfg, progress=gr.Progress()):
695
+
696
+ total_frames = video_duration_seconds * video_fps
697
+ fragment_duration_in_frames = int(math.floor((fragment_duration_percentage / 100.0) * total_frames))
698
+ fragment_duration_in_frames = max(1, fragment_duration_in_frames)
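+ # e.g. 75% of a 6 s / 24 FPS take (144 frames) -> floor(0.75 * 144) = 108 frames per fragment.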
699
+
700
+ for update in run_video_production(
701
+ video_resolution,
702
+ video_duration_seconds, video_fps, eco_video_frames, use_attention_slicing,
703
+ fragment_duration_in_frames, mid_cond_strength, dest_cond_strength, num_inference_steps,
704
+ decode_timestep, image_cond_noise_scale,
705
+ prompt_geral, keyframe_images_state, scene_storyboard, cfg, progress):
706
+ yield update
707
+
708
+ yield {
709
+ fragment_duration_state: fragment_duration_in_frames,
710
+ eco_frames_state: eco_video_frames
711
+ }
712
 
713
  animator_button.click(
714
+ fn=updated_animator_click,
 
 
 
 
715
  inputs=[
716
+ video_resolution_selector,
717
+ video_duration_slider, video_fps_radio, eco_frames_slider, slicing_checkbox,
718
+ fragment_duration_slider, mid_cond_strength_slider, dest_cond_strength_slider, num_inference_steps_slider,
719
+ decode_timestep_slider, image_cond_noise_scale_slider,
720
  prompt_geral_state, keyframe_images_state, scene_storyboard_state, cfg_slider
721
  ],
722
  outputs=[
723
+ production_log_output, video_gallery_output, fragment_list_state,
724
+ prod_media_start_output, prod_media_mid_output, prod_media_end_output,
725
+ fragment_duration_state, eco_frames_state
726
  ]
727
  )
728
 
729
  editor_button.click(
730
+ fn=concatenate_final_video,
731
+ inputs=[fragment_list_state, fragment_duration_state, eco_frames_state],
732
  outputs=[final_video_output]
733
  )
734
 
735
  if __name__ == "__main__":
736
+ if os.path.exists(WORKSPACE_DIR): shutil.rmtree(WORKSPACE_DIR)
737
+ os.makedirs(WORKSPACE_DIR); Path("prompts").mkdir(exist_ok=True)
738
+
739
  demo.queue().launch(server_name="0.0.0.0", share=True)
flux_kontext_helpers.py ADDED
@@ -0,0 +1,98 @@
1
+ # flux_kontext_helpers.py
2
+ # MΓ³dulo de serviΓ§o para o FluxKontext, com gestΓ£o de memΓ³ria atΓ΄mica.
3
+ # Este arquivo Γ© parte do projeto Euia-AducSdr e estΓ‘ sob a licenΓ§a AGPL v3.
4
+ # Copyright (C) 4 de Agosto de 2025 Carlos Rodrigues dos Santos
5
+
6
+ import torch
7
+ from PIL import Image
8
+ import gc
9
+ from diffusers import FluxKontextPipeline
10
+ import huggingface_hub
11
+ import os
12
+
13
+ class Generator:
14
+ def __init__(self, device_id='cuda:0'):
15
+ self.cpu_device = torch.device('cpu')
16
+ self.gpu_device = torch.device(device_id if torch.cuda.is_available() else 'cpu')
17
+ print(f"WORKER COMPOSITOR: Usando dispositivo: {self.gpu_device}")
18
+ self.pipe = None
19
+ self._load_pipe_to_cpu()
20
+
21
+ def _load_pipe_to_cpu(self):
22
+ if self.pipe is None:
23
+ print("WORKER COMPOSITOR: Carregando modelo FluxKontext para a CPU...")
24
+ self.pipe = FluxKontextPipeline.from_pretrained(
25
+ "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
26
+ ).to(self.cpu_device)
27
+ print("WORKER COMPOSITOR: Modelo FluxKontext pronto (na CPU).")
28
+
29
+ def to_gpu(self):
30
+ if self.gpu_device.type == 'cpu': return
31
+ print(f"WORKER COMPOSITOR: Movendo modelo para {self.gpu_device}...")
32
+ self.pipe.to(self.gpu_device)
33
+ print(f"WORKER COMPOSITOR: Modelo na GPU {self.gpu_device}.")
34
+
35
+ def to_cpu(self):
36
+ if self.gpu_device.type == 'cpu': return
37
+ print(f"WORKER COMPOSITOR: Descarregando modelo da GPU {self.gpu_device}...")
38
+ self.pipe.to(self.cpu_device)
39
+ gc.collect()
40
+ if torch.cuda.is_available():
41
+ torch.cuda.empty_cache()
42
+
43
+ def _concatenate_images(self, images, direction="horizontal"):
44
+ if not images: return None
45
+ valid_images = [img.convert("RGB") for img in images if img is not None]
46
+ if not valid_images: return None
47
+ if len(valid_images) == 1: return valid_images[0]
48
+
49
+ if direction == "horizontal":
50
+ total_width = sum(img.width for img in valid_images)
51
+ max_height = max(img.height for img in valid_images)
52
+ concatenated = Image.new('RGB', (total_width, max_height))
53
+ x_offset = 0
54
+ for img in valid_images:
55
+ y_offset = (max_height - img.height) // 2
56
+ concatenated.paste(img, (x_offset, y_offset))
57
+ x_offset += img.width
58
+ else:
59
+ max_width = max(img.width for img in valid_images)
60
+ total_height = sum(img.height for img in valid_images)
61
+ concatenated = Image.new('RGB', (max_width, total_height))
62
+ y_offset = 0
63
+ for img in valid_images:
64
+ x_offset = (max_width - img.width) // 2
65
+ concatenated.paste(img, (x_offset, y_offset))
66
+ y_offset += img.height
67
+ return concatenated
68
+
69
+ @torch.inference_mode()
70
+ def generate_image(self, reference_images, prompt, width, height, seed=42):
71
+ try:
72
+ self.to_gpu()
73
+
74
+ concatenated_image = self._concatenate_images(reference_images, "horizontal")
75
+ if concatenated_image is None:
76
+ raise ValueError("Nenhuma imagem de referΓͺncia vΓ‘lida foi fornecida.")
77
+
78
+ # ### CORREÇÃO ###
79
+ # Ignora o tamanho da imagem concatenada e usa os parΓ’metros `width` e `height` fornecidos.
80
+ image = self.pipe(
81
+ image=concatenated_image,
82
+ prompt=prompt,
83
+ guidance_scale=2.5,
84
+ width=width,
85
+ height=height,
86
+ generator=torch.Generator(device="cpu").manual_seed(seed)
87
+ ).images[0]
88
+
89
+ return image
90
+ finally:
91
+ self.to_cpu()
92
+
93
+ # --- InstΓ’ncia Singleton ---
94
+ print("Inicializando o Compositor de Cenas (FluxKontext)...")
95
+ hf_token = os.getenv('HF_TOKEN')
96
+ if hf_token: huggingface_hub.login(token=hf_token)
97
+ flux_kontext_singleton = Generator(device_id='cuda:0')
98
+ print("Compositor de Cenas pronto.")
ltx_helpers.py ADDED
@@ -0,0 +1,190 @@
1
+ # ltx_manager_helpers.py
2
+ # Gerente de Pool de Workers LTX para revezamento assΓ­ncrono em mΓΊltiplas GPUs.
3
+ # Este arquivo Γ© parte do projeto Euia-AducSdr e estΓ‘ sob a licenΓ§a AGPL v3.
4
+ # Copyright (C) 4 de Agosto de 2025 Carlos Rodrigues dos Santos
5
+
6
+ import torch
7
+ import gc
8
+ import os
9
+ import yaml
10
+ import numpy as np
11
+ import imageio
12
+ from pathlib import Path
13
+ import huggingface_hub
14
+ import threading
15
+ from PIL import Image
16
+
17
+ # Importa as funΓ§Γ΅es e classes necessΓ‘rias do inference.py
18
+ from inference import (
19
+ create_ltx_video_pipeline,
20
+ ConditioningItem,
21
+ calculate_padding,
22
+ prepare_conditioning
23
+ )
24
+
25
+ class LtxWorker:
26
+ """
27
+ Representa uma ΓΊnica instΓ’ncia do pipeline LTX, associada a uma GPU especΓ­fica.
28
+ O pipeline Γ© carregado na CPU por padrΓ£o e movido para a GPU sob demanda.
29
+ """
30
+ def __init__(self, device_id='cuda:0'):
31
+ self.device = torch.device(device_id if torch.cuda.is_available() else 'cpu')
32
+ print(f"LTX Worker: Inicializando para o dispositivo {self.device} (carregando na CPU)...")
33
+
34
+ config_file_path = "configs/ltxv-13b-0.9.8-distilled.yaml"
35
+ with open(config_file_path, "r") as file:
36
+ self.config = yaml.safe_load(file)
37
+
38
+ LTX_REPO = "Lightricks/LTX-Video"
39
+ models_dir = "downloaded_models_gradio"
40
+
41
+ distilled_model_actual_path = huggingface_hub.hf_hub_download(
42
+ repo_id=LTX_REPO,
43
+ filename=self.config["checkpoint_path"],
44
+ local_dir=models_dir,
45
+ local_dir_use_symlinks=False
46
+ )
47
+
48
+ self.pipeline = create_ltx_video_pipeline(
49
+ ckpt_path=distilled_model_actual_path,
50
+ precision=self.config["precision"],
51
+ text_encoder_model_name_or_path=self.config["text_encoder_model_name_or_path"],
52
+ sampler=self.config["sampler"],
53
+ device='cpu'
54
+ )
55
+ print(f"LTX Worker para {self.device} pronto na CPU.")
56
+
57
+ def to_gpu(self):
58
+ """Move o pipeline para a GPU designada."""
59
+ if self.device.type == 'cpu': return
60
+ print(f"LTX Worker: Movendo pipeline para {self.device}...")
61
+ self.pipeline.to(self.device)
62
+ print(f"LTX Worker: Pipeline na GPU {self.device}.")
63
+
64
+ def to_cpu(self):
65
+ """Move o pipeline de volta para a CPU e limpa a memΓ³ria da GPU."""
66
+ if self.device.type == 'cpu': return
67
+ print(f"LTX Worker: Descarregando pipeline da GPU {self.device}...")
68
+ self.pipeline.to('cpu')
69
+ gc.collect()
70
+ if torch.cuda.is_available():
71
+ torch.cuda.empty_cache()
72
+ print(f"LTX Worker: GPU {self.device} limpa.")
73
+
74
+ def generate_video_fragment_internal(self, **kwargs):
75
+ """A lΓ³gica real da geraΓ§Γ£o de vΓ­deo, que espera estar na GPU."""
76
+ return self.pipeline(**kwargs)
77
+
78
+ class LtxPoolManager:
79
+ """
80
+ Gerencia um pool de LtxWorkers, orquestrando um revezamento entre GPUs
81
+ para permitir que a limpeza de uma GPU ocorra em paralelo com a computaΓ§Γ£o em outra.
82
+ """
83
+ def __init__(self, device_ids=['cuda:2', 'cuda:3']):
84
+ print(f"LTX POOL MANAGER: Criando workers para os dispositivos: {device_ids}")
85
+ self.workers = [LtxWorker(device_id) for device_id in device_ids]
86
+ self.current_worker_index = 0
87
+ self.lock = threading.Lock()
88
+ self.last_cleanup_thread = None
89
+
90
+ def _cleanup_worker(self, worker):
91
+ """FunΓ§Γ£o alvo para a thread de limpeza."""
92
+ print(f"CLEANUP THREAD: Iniciando limpeza da GPU {worker.device} em background...")
93
+ worker.to_cpu()
94
+ print(f"CLEANUP THREAD: Limpeza da GPU {worker.device} concluΓ­da.")
95
+
96
+ def generate_video_fragment(
97
+ self,
98
+ motion_prompt: str, conditioning_items_data: list,
99
+ width: int, height: int, seed: int, cfg: float, video_total_frames: int,
100
+ video_fps: int, num_inference_steps: int, use_attention_slicing: bool,
101
+ current_fragment_index: int, output_path: str, progress
102
+ ):
103
+ worker_to_use = None
104
+ try:
105
+ with self.lock:
106
+ # 1. Espera a limpeza da thread anterior, se ainda estiver rodando.
107
+ if self.last_cleanup_thread and self.last_cleanup_thread.is_alive():
108
+ print("LTX POOL MANAGER: Aguardando limpeza da GPU anterior...")
109
+ self.last_cleanup_thread.join()
110
+ print("LTX POOL MANAGER: Limpeza anterior concluΓ­da.")
111
+
112
+ # 2. Seleciona o worker ATUAL para o trabalho
113
+ worker_to_use = self.workers[self.current_worker_index]
114
+
115
+ # 3. Seleciona o worker ANTERIOR para iniciar a limpeza
116
+ previous_worker_index = (self.current_worker_index - 1 + len(self.workers)) % len(self.workers)
117
+ worker_to_cleanup = self.workers[previous_worker_index]
118
+
119
+ # 4. Dispara a limpeza do worker ANTERIOR em uma nova thread
120
+ cleanup_thread = threading.Thread(target=self._cleanup_worker, args=(worker_to_cleanup,))
121
+ cleanup_thread.start()
122
+ self.last_cleanup_thread = cleanup_thread
123
+
124
+ # 5. Prepara o worker ATUAL para a computaΓ§Γ£o
125
+ worker_to_use.to_gpu()
126
+
127
+ # 6. Atualiza o Γ­ndice para a PRΓ“XIMA chamada
128
+ self.current_worker_index = (self.current_worker_index + 1) % len(self.workers)
129
+
130
+ # --- A GERAÇÃO OCORRE FORA DO LOCK ---
131
+ target_device = worker_to_use.device
132
+
133
+ if use_attention_slicing:
134
+ worker_to_use.pipeline.enable_attention_slicing()
135
+
136
+ media_paths = [item[0] for item in conditioning_items_data]
137
+ start_frames = [item[1] for item in conditioning_items_data]
138
+ strengths = [item[2] for item in conditioning_items_data]
139
+
140
+ padded_h, padded_w = ((height - 1) // 32 + 1) * 32, ((width - 1) // 32 + 1) * 32
141
+ padding_vals = calculate_padding(height, width, padded_h, padded_w)
142
+
143
+ conditioning_items = prepare_conditioning(
144
+ conditioning_media_paths=media_paths, conditioning_strengths=strengths,
145
+ conditioning_start_frames=start_frames, height=height, width=width,
146
+ num_frames=video_total_frames, padding=padding_vals, pipeline=worker_to_use.pipeline,
147
+ )
148
+
149
+ for item in conditioning_items:
150
+ item.media_item = item.media_item.to(target_device)
151
+
152
+ first_pass_config = worker_to_use.config.get("first_pass", {}).copy()
153
+ first_pass_config['num_inference_steps'] = int(num_inference_steps)
154
+
155
+ kwargs = {
156
+ "prompt": motion_prompt, "negative_prompt": "blurry, distorted, bad quality, artifacts",
157
+ "height": padded_h, "width": padded_w, "num_frames": video_total_frames,
158
+ "frame_rate": video_fps,
159
+ "generator": torch.Generator(device=target_device).manual_seed(int(seed) + current_fragment_index),
160
+ "output_type": "pt", "guidance_scale": float(cfg),
161
+ "timesteps": first_pass_config.get("timesteps"),
162
+ "conditioning_items": conditioning_items,
163
+ "decode_timestep": worker_to_use.config.get("decode_timestep"),
164
+ "decode_noise_scale": worker_to_use.config.get("decode_noise_scale"),
165
+ "stochastic_sampling": worker_to_use.config.get("stochastic_sampling"),
166
+ "image_cond_noise_scale": 0.15, "is_video": True, "vae_per_channel_normalize": True,
167
+ "mixed_precision": (worker_to_use.config.get("precision") == "mixed_precision"),
168
+ "enhance_prompt": False, "decode_every": 4, "num_inference_steps": int(num_inference_steps)
169
+ }
170
+
171
+ progress(0.1, desc=f"[CΓ’mera LTX em {worker_to_use.device}] Filmando Cena {current_fragment_index}...")
172
+ result_tensor = worker_to_use.generate_video_fragment_internal(**kwargs).images
173
+
174
+ pad_l, pad_r, pad_t, pad_b = map(int, padding_vals); slice_h = -pad_b if pad_b > 0 else None; slice_w = -pad_r if pad_r > 0 else None
175
+ cropped_tensor = result_tensor[:, :, :video_total_frames, pad_t:slice_h, pad_l:slice_w]
176
+ video_np = (cropped_tensor[0].permute(1, 2, 3, 0).cpu().float().numpy() * 255).astype(np.uint8)
177
+
178
+ with imageio.get_writer(output_path, fps=video_fps, codec='libx264', quality=8) as writer:
179
+ for frame in video_np: writer.append_data(frame)
180
+
181
+ return output_path, video_total_frames
182
+
183
+ finally:
184
+ if use_attention_slicing and worker_to_use and worker_to_use.pipeline:
185
+ worker_to_use.pipeline.disable_attention_slicing()
186
+ # A limpeza do worker_to_use serΓ‘ feita na PRΓ“XIMA chamada a esta funΓ§Γ£o.
187
+
188
+ # Singleton do Gerenciador de Pool
189
+ # Por padrΓ£o, usa cuda:2 e cuda:3. Altere aqui se necessΓ‘rio.
190
+ ltx_manager_singleton = LtxPoolManager(device_ids=['cuda:2', 'cuda:3'])
ltx_upscaler_manager_helpers.py ADDED
@@ -0,0 +1,62 @@
1
+ # ltx_upscaler_manager_helpers.py
2
+ # Gerente de Pool para o revezamento de workers de Upscaling.
3
+ # Este arquivo Γ© parte do projeto Euia-AducSdr e estΓ‘ sob a licenΓ§a AGPL v3.
4
+ # Copyright (C) 4 de Agosto de 2025 Carlos Rodrigues dos Santos
5
+
6
+ import torch
7
+ import gc
8
+ import os
9
+ import threading
10
+ from ltx_worker_upscaler import LtxUpscaler
11
+
12
+ class LtxUpscalerPoolManager:
13
+ """
14
+ Gerencia um pool de LtxUpscalerWorkers, orquestrando um revezamento entre GPUs
15
+ para a tarefa de upscaling.
16
+ """
17
+ def __init__(self, device_ids=['cuda:2', 'cuda:3']):
18
+ print(f"LTX UPSCALER POOL MANAGER: Criando workers para os dispositivos: {device_ids}")
19
+ self.workers = [LtxUpscaler(device_id) for device_id in device_ids]
20
+ self.current_worker_index = 0
21
+ self.lock = threading.Lock()
22
+ self.last_cleanup_thread = None
23
+
24
+ def _cleanup_worker(self, worker):
25
+ """FunΓ§Γ£o alvo para a thread de limpeza em background."""
26
+ print(f"UPSCALER CLEANUP THREAD: Iniciando limpeza da GPU {worker.device}...")
27
+ worker.to_cpu()
28
+ print(f"UPSCALER CLEANUP THREAD: Limpeza da GPU {worker.device} concluΓ­da.")
29
+
30
+ def upscale_video_fragment(self, video_path_low_res: str, output_path: str, video_fps: int):
31
+ """
32
+ Seleciona um worker livre, faz o upscale de um fragmento e limpa o worker anterior.
33
+ """
34
+ worker_to_use = None
35
+ try:
36
+ with self.lock:
37
+ if self.last_cleanup_thread and self.last_cleanup_thread.is_alive():
38
+ print("UPSCALER POOL MANAGER: Aguardando limpeza da GPU anterior...")
39
+ self.last_cleanup_thread.join()
40
+
41
+ worker_to_use = self.workers[self.current_worker_index]
42
+ previous_worker_index = (self.current_worker_index - 1 + len(self.workers)) % len(self.workers)
43
+ worker_to_cleanup = self.workers[previous_worker_index]
44
+
45
+ cleanup_thread = threading.Thread(target=self._cleanup_worker, args=(worker_to_cleanup,))
46
+ cleanup_thread.start()
47
+ self.last_cleanup_thread = cleanup_thread
48
+
49
+ worker_to_use.to_gpu()
50
+
51
+ self.current_worker_index = (self.current_worker_index + 1) % len(self.workers)
52
+
53
+ print(f"UPSCALER POOL MANAGER: Worker em {worker_to_use.device} iniciando upscale de {os.path.basename(video_path_low_res)}...")
54
+ worker_to_use.upscale_video_fragment(video_path_low_res, output_path, video_fps)
55
+ print(f"UPSCALER POOL MANAGER: Upscale de {os.path.basename(video_path_low_res)} concluΓ­do.")
56
+
57
+ finally:
58
+ # A limpeza do worker_to_use serΓ‘ feita na prΓ³xima chamada
59
+ pass
60
+
61
+ # --- InstΓ’ncia Singleton do Gerenciador de Upscaling ---
62
+ ltx_upscaler_manager_singleton = LtxUpscalerPoolManager(device_ids=['cuda:2', 'cuda:3'])
ltx_worker_base.py ADDED
@@ -0,0 +1,133 @@
1
+ # ltx_worker_base.py (GPU-C: cuda:2)
2
+ # Worker para gerar os fragmentos de vΓ­deo em resoluΓ§Γ£o base.
3
+ # Este arquivo Γ© parte do projeto Euia-AducSdr e estΓ‘ sob a licenΓ§a AGPL v3.
4
+ # Copyright (C) 4 de Agosto de 2025 Carlos Rodrigues dos Santos
5
+
6
+ import torch
7
+ import gc
8
+ import os
9
+ import yaml
10
+ import numpy as np
11
+ import imageio
12
+ from pathlib import Path
13
+ import huggingface_hub
14
+
15
+ from inference import (
16
+ create_ltx_video_pipeline,
17
+ ConditioningItem,
18
+ calculate_padding,
19
+ prepare_conditioning
20
+ )
21
+
22
+ class LtxGenerator:
23
+ def __init__(self, device_id='cuda:2'):
24
+ print(f"WORKER CΓ‚MERA-BASE: Inicializando...")
25
+ self.device = torch.device(device_id if torch.cuda.is_available() else 'cpu')
26
+ print(f"WORKER CΓ‚MERA-BASE: Usando dispositivo: {self.device}")
27
+
28
+ config_file_path = "configs/ltxv-13b-0.9.8-distilled.yaml"
29
+ with open(config_file_path, "r") as file:
30
+ self.config = yaml.safe_load(file)
31
+
32
+ LTX_REPO = "Lightricks/LTX-Video"
33
+ models_dir = "downloaded_models_gradio"
34
+ Path(models_dir).mkdir(parents=True, exist_ok=True)
35
+
36
+ print("WORKER CΓ‚MERA-BASE: Carregando pipeline LTX na CPU (estado de repouso)...")
37
+ distilled_model_actual_path = huggingface_hub.hf_hub_download(
38
+ repo_id=LTX_REPO,
39
+ filename=self.config["checkpoint_path"],
40
+ local_dir=models_dir,
41
+ local_dir_use_symlinks=False
42
+ )
43
+
44
+ self.pipeline = create_ltx_video_pipeline(
45
+ ckpt_path=distilled_model_actual_path,
46
+ precision=self.config["precision"],
47
+ text_encoder_model_name_or_path=self.config["text_encoder_model_name_or_path"],
48
+ sampler=self.config["sampler"],
49
+ device='cpu'
50
+ )
51
+ print("WORKER CΓ‚MERA-BASE: Pronto (na CPU).")
52
+
53
+ def to_gpu(self):
54
+ if self.pipeline and torch.cuda.is_available():
55
+ print(f"WORKER CΓ‚MERA-BASE: Movendo LTX para {self.device}...")
56
+ self.pipeline.to(self.device)
57
+
58
+ def to_cpu(self):
59
+ if self.pipeline:
60
+ print(f"WORKER CΓ‚MERA-BASE: Descarregando LTX da GPU {self.device}...")
61
+ self.pipeline.to('cpu')
62
+ gc.collect()
63
+ if torch.cuda.is_available():
64
+ torch.cuda.empty_cache()
65
+
66
+ def generate_video_fragment(
67
+ self, motion_prompt: str, conditioning_items_data: list,
68
+ width: int, height: int, seed: int, cfg: float, video_total_frames: int,
69
+ video_fps: int, num_inference_steps: int, use_attention_slicing: bool,
70
+ current_fragment_index: int, output_path: str, progress
71
+ ):
72
+ progress(0.1, desc=f"[CΓ’mera LTX Base] Filmando Cena {current_fragment_index}...")
73
+
74
+ target_device = self.pipeline.device
75
+
76
+ if use_attention_slicing:
77
+ self.pipeline.enable_attention_slicing()
78
+
79
+ media_paths = [item[0] for item in conditioning_items_data]
80
+ start_frames = [item[1] for item in conditioning_items_data]
81
+ strengths = [item[2] for item in conditioning_items_data]
82
+
83
+ padded_h, padded_w = ((height - 1) // 32 + 1) * 32, ((width - 1) // 32 + 1) * 32
84
+ padding_vals = calculate_padding(height, width, padded_h, padded_w)
85
+
86
+ conditioning_items = prepare_conditioning(
87
+ conditioning_media_paths=media_paths, conditioning_strengths=strengths,
88
+ conditioning_start_frames=start_frames, height=height, width=width,
89
+ num_frames=video_total_frames, padding=padding_vals, pipeline=self.pipeline,
90
+ )
91
+
92
+ for item in conditioning_items:
93
+ item.media_item = item.media_item.to(target_device)
94
+
95
+ actual_num_frames = int(round((float(video_total_frames) - 1.0) / 8.0) * 8 + 1)
96
+ first_pass_config = self.config.get("first_pass", {}).copy()
97
+ first_pass_config['num_inference_steps'] = int(num_inference_steps)
98
+
99
+ kwargs = {
100
+ "prompt": motion_prompt, "negative_prompt": "blurry, distorted, bad quality, artifacts",
101
+ "height": padded_h, "width": padded_w, "num_frames": actual_num_frames,
102
+ "frame_rate": video_fps,
103
+ "generator": torch.Generator(device=target_device).manual_seed(int(seed) + current_fragment_index),
104
+ "output_type": "pt", "guidance_scale": float(cfg),
105
+ "timesteps": first_pass_config.get("timesteps"),
106
+ "conditioning_items": conditioning_items,
107
+ "decode_timestep": self.config.get("decode_timestep"),
108
+ "decode_noise_scale": self.config.get("decode_noise_scale"),
109
+ "stochastic_sampling": self.config.get("stochastic_sampling"),
110
+ "image_cond_noise_scale": 0.15, "is_video": True, "vae_per_channel_normalize": True,
111
+ "mixed_precision": (self.config.get("precision") == "mixed_precision"),
112
+ "enhance_prompt": False, "decode_every": 4, "num_inference_steps": int(num_inference_steps)
113
+ }
114
+
115
+ result_tensor = self.pipeline(**kwargs).images
116
+
117
+ pad_l, pad_r, pad_t, pad_b = map(int, padding_vals)
118
+ slice_h = -pad_b if pad_b > 0 else None; slice_w = -pad_r if pad_r > 0 else None
119
+
120
+ cropped_tensor = result_tensor[:, :, :actual_num_frames, pad_t:slice_h, pad_l:slice_w]
121
+ video_np = (cropped_tensor[0].permute(1, 2, 3, 0).cpu().float().numpy() * 255).astype(np.uint8)
122
+
123
+ with imageio.get_writer(output_path, fps=video_fps, codec='libx264', quality=8) as writer:
124
+ for frame in video_np:
125
+ writer.append_data(frame)
126
+
127
+ if use_attention_slicing and self.pipeline:
128
+ self.pipeline.disable_attention_slicing()
129
+
130
+ return output_path, actual_num_frames
131
+
132
+ # --- InstΓ’ncia Singleton para o Worker Base ---
133
+ ltx_base_singleton = LtxGenerator(device_id='cuda:2')
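For orientation, a minimal usage sketch of the worker above. The prompt, keyframe path, fragment parameters, and the no-op progress callback are illustrative assumptions (the constructor also assumes the distilled config and checkpoint are downloadable as in __init__):

# Illustrative call into the base worker (paths, prompt and sizes are assumptions, not part of this commit).
from ltx_worker_base import ltx_base_singleton

ltx_base_singleton.to_gpu()
fragment_path, n_frames = ltx_base_singleton.generate_video_fragment(
    motion_prompt="a slow dolly-in over a foggy forest",
    conditioning_items_data=[("keyframe_000.png", 0, 1.0)],   # (media path, start frame, strength)
    width=704, height=480, seed=42, cfg=1.0,
    video_total_frames=49, video_fps=24,                       # 49 = 8k + 1, so no frame rounding occurs
    num_inference_steps=8, use_attention_slicing=True,
    current_fragment_index=0, output_path="fragment_000_lowres.mp4",
    progress=lambda value, desc="": None,                      # stand-in for a Gradio progress callback
)
ltx_base_singleton.to_cpu()                                    # release VRAM once the fragment is written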
ltx_worker_upscaler.py ADDED
@@ -0,0 +1,99 @@
+ # ltx_worker_upscaler.py (fixed with dtype=bfloat16)
+ # Worker that upscales the video fragments to high resolution.
+ # This file is part of the Euia-AducSdr project and is licensed under AGPL v3.
+ # Copyright (C) August 4, 2025 Carlos Rodrigues dos Santos
+
+ import torch
+ import gc
+ import os
+ import yaml
+ import numpy as np
+ import imageio
+ from pathlib import Path
+ import huggingface_hub
+ from einops import rearrange
+
+ from inference import create_ltx_video_pipeline
+ from ltx_video.models.autoencoders.latent_upsampler import LatentUpsampler
+ from ltx_video.models.autoencoders.vae_encode import vae_encode, vae_decode
+
+ class LtxUpscaler:
+     def __init__(self, device_id='cuda:2'):
+         print(f"WORKER CÂMERA-UPSCALER: Inicializando para {device_id}...")
+         self.device = torch.device(device_id if torch.cuda.is_available() else 'cpu')
+         self.model_dtype = torch.bfloat16  # <<<--- set the model dtype
+
+         config_file_path = "configs/ltxv-13b-0.9.8-distilled.yaml"
+         with open(config_file_path, "r") as file:
+             self.config = yaml.safe_load(file)
+
+         LTX_REPO = "Lightricks/LTX-Video"
+         models_dir = "downloaded_models_gradio"
+         Path(models_dir).mkdir(parents=True, exist_ok=True)
+
+         print(f"WORKER CÂMERA-UPSCALER ({self.device}): Carregando VAE na CPU...")
+         distilled_model_actual_path = huggingface_hub.hf_hub_download(
+             repo_id=LTX_REPO, filename=self.config["checkpoint_path"],
+             local_dir=models_dir, local_dir_use_symlinks=False
+         )
+         temp_pipeline = create_ltx_video_pipeline(
+             ckpt_path=distilled_model_actual_path, precision=self.config["precision"],
+             text_encoder_model_name_or_path=self.config["text_encoder_model_name_or_path"],
+             sampler=self.config["sampler"], device='cpu'
+         )
+         self.vae = temp_pipeline.vae.to(self.model_dtype)  # <<<--- load in the correct dtype
+         del temp_pipeline
+         gc.collect()
+
+         print(f"WORKER CÂMERA-UPSCALER ({self.device}): Carregando Latent Upsampler na CPU...")
+         upscaler_path = huggingface_hub.hf_hub_download(
+             repo_id=LTX_REPO, filename=self.config["spatial_upscaler_model_path"],
+             local_dir=models_dir, local_dir_use_symlinks=False
+         )
+         self.latent_upsampler = LatentUpsampler.from_pretrained(upscaler_path).to(self.model_dtype)  # <<<--- load in the correct dtype
+         self.latent_upsampler.to('cpu')
+
+         print(f"WORKER CÂMERA-UPSCALER ({self.device}): Pronto (na CPU).")
+
+     def to_gpu(self):
+         if self.latent_upsampler and self.vae and torch.cuda.is_available():
+             print(f"WORKER CÂMERA-UPSCALER: Movendo modelos para {self.device}...")
+             self.latent_upsampler.to(self.device)
+             self.vae.to(self.device)
+
+     def to_cpu(self):
+         if self.latent_upsampler and self.vae:
+             print(f"WORKER CÂMERA-UPSCALER: Descarregando modelos da GPU {self.device}...")
+             self.latent_upsampler.to('cpu')
+             self.vae.to('cpu')
+             gc.collect()
+             if torch.cuda.is_available():
+                 torch.cuda.empty_cache()
+
+     @torch.no_grad()
+     def upscale_video_fragment(self, video_path_low_res: str, output_path: str, video_fps: int):
+         print(f"UPSCALER ({self.device}): Processando {os.path.basename(video_path_low_res)}")
+
+         with imageio.get_reader(video_path_low_res) as reader:
+             video_frames = [frame for frame in reader]
+             video_np = np.stack(video_frames)
+
+         # <<<--- critical fix: send the frames to the GPU already in the model dtype --->>>
+         video_tensor = torch.from_numpy(video_np).permute(0, 3, 1, 2).float() / 255.0
+         video_tensor = (video_tensor * 2.0) - 1.0
+         video_tensor = video_tensor.unsqueeze(0).permute(0, 2, 1, 3, 4)  # (1, C, F, H, W)
+         video_tensor = video_tensor.to(self.device, dtype=self.model_dtype)
+
+         latents = vae_encode(video_tensor, self.vae)
+         upsampled_latents = self.latent_upsampler(latents)
+         upsampled_video_tensor = vae_decode(upsampled_latents, self.vae, is_video=True)
+
+         upsampled_video_tensor = (upsampled_video_tensor.clamp(-1, 1) + 1) / 2.0
+         video_np_high_res = (upsampled_video_tensor[0].permute(1, 2, 3, 0).cpu().float().numpy() * 255).astype(np.uint8)  # back to float/uint8 for saving
+
+         with imageio.get_writer(output_path, fps=video_fps, codec='libx264', quality=8) as writer:
+             for frame in video_np_high_res:
+                 writer.append_data(frame)
+
+         print(f"UPSCALER ({self.device}): Arquivo salvo em {os.path.basename(output_path)}")
+         return output_path
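A minimal sketch of driving this worker directly, outside the pool manager; the file names and fps are assumptions, everything else uses the class API defined above:

# Illustrative direct use of the upscaler worker (file names and fps are assumptions).
from ltx_worker_upscaler import LtxUpscaler

upscaler = LtxUpscaler(device_id='cuda:3')   # loads the VAE and latent upsampler on CPU in bfloat16
upscaler.to_gpu()                            # move both models to cuda:3
upscaler.upscale_video_fragment(
    video_path_low_res="fragment_000_lowres.mp4",
    output_path="fragment_000_hires.mp4",
    video_fps=24,
)
upscaler.to_cpu()                            # free VRAM for the next fragment or worker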
requirements.txt CHANGED
@@ -15,11 +15,13 @@ imageio
  imageio-ffmpeg
  einops
  timm
+ safetensors
+
  av
- #git+https://github.com/huggingface/diffusers.git@main
+ git+https://github.com/huggingface/diffusers.git@main
  torch
  peft
- diffusers==0.31.0
- transformers==4.45.2
- accelerate==0.32.0
+ #diffusers==0.31.0
+ transformers
+ accelerate
  git+https://github.com/ToTheBeginning/facexlib.git