Jack Monas committed
Commit f3b18f0 · 1 Parent(s): 3294042
Files changed (1): app.py (+48, -24)
app.py CHANGED
@@ -223,8 +223,7 @@ def main():
         row_gifs = gif_paths[i:i+4]
 
         # Create columns for this row
-        cols = st.columns(len(row_gifs))
-
+        cols = st.columns(min(len(row_gifs), 4))
         # Display each GIF in its own column
         for col, gif_path in zip(cols, row_gifs):
             col.image(gif_path, use_container_width=True)
@@ -293,28 +292,53 @@ def main():
     st.markdown("---")
 
     st.markdown("## FAQs")
-    display_faq("What preprocessing steps are applied to the raw data, and can we apply our own?",
-                "The raw data in the `world_model_raw_data` dataset includes unprocessed 512x512 MP4 video logs as collected from the EVE Android. No additional preprocessing (e.g., normalization, augmentation) is applied. You are free to apply your own preprocessing techniques—such as frame resizing, color normalization, etc.")
-    display_faq("How is the Cosmos tokenizer used in the Tokenized Data dataset, and can we use a different tokenizer?",
-                "The `world_model_tokenized_data dataset` uses NVIDIA’s Discrete Video 8x8x8 Cosmos Tokenizer to convert raw 256x256 video into compact tokens. For the Compression Challenge, this tokenizer is mandatory for a consistent benchmark. Alternative tokenizers are permitted for the Sampling and Evaluation Challenges")
-    display_faq("What metrics are used to evaluate the Sampling Challenge submissions?",
-                "Submissions are evaluated by comparing the predicted frame (2 seconds ahead) to the ground-truth frame using Peak Signal-to-Noise Ratio (PSNR).")
-    display_faq("Can we use generative models like diffusion models or GANs for the Sampling Challenge?",
-                "Yes, you are welcome to use generative models such as diffusion models, GANs, or autoregressive approaches for the Sampling Challenge, as long as they adhere to the rules (e.g., no use of actual future frames during inference). The challenge evaluates the quality of the predicted frame, not the method used, so feel free to experiment with cutting-edge techniques to achieve plausible and accurate predictions.")
-    display_faq("How are policies provided in the Evaluation Challenge, and what does ‘ranking’ entail?",
-                "In the Evaluation Challenge, policies are provided as pre-trained models for a specific task (More to come soon). Your task is to predict and rank these policies (e.g., Policy A > Policy B > Policy C) based on their expected success rate or efficiency in the real world. Your ranking is scored against the ground-truth ranking derived from physical deployments.")
-    display_faq("Are there constraints on model size or computational resources for submissions?",
-                "There are no strict limits on model size, parameter count, or inference time for any challenge.")
-    display_faq("Can we fine-tune pre-trained models, and how do we disclose this in our submission?",
-                "Yes, fine-tuning pre-trained models is allowed as long as they are publicly available (e.g., from Hugging Face, GitHub, or academic repositories) and not trained on private datasets. ")
-    display_faq("What happens if my Compression Challenge model achieves a loss below 8.0 with MAGVIT after March 1st?",
-                "If you submit a Compression Challenge solution using the MAGVIT tokenizer and achieve a loss below 8.0 on our held-out test set, you remain eligible for the original $10K award until August 31, 2025 (six months from March 1st). However, this submission will not qualify for the CVPR/ICCV leaderboard, which uses the Cosmos tokenizer as the new standard.")
-    display_faq("Do I have to participate in all challenges?",
-                "No, you may choose to participate in one or more challenges. However, participating in multiple challenges may improve your overall ranking.")
-    display_faq("Can I work in a team?",
-                "Yes, team submissions are welcome.")
-    display_faq("What are the submission deadlines?",
-                "Deadlines for challenges soon to be announced.")
+
+    st.markdown("**How should I preprocess the dataset?**")
+    st.write(
+        "The raw data in the `world_model_raw_data` dataset is provided as-is, with no additional preprocessing. "
+        "You are free to apply your own steps—such as resizing, normalization, or augmentation—but please document all preprocessing methods in your technical report."
+    )
+
+    st.markdown("**What output format is expected for predictions?**")
+    st.write(
+        "For the Sampling Challenge, your model should output a predicted video frame in RGB format at a resolution of 512×512 pixels. "
+        "For the Evaluation Challenge, you should submit a ranked list of the provided policies based on their expected real-world performance."
+    )
+
+    st.markdown("**What evaluation metrics will be used?**")
+    st.write(
+        "The Compression Challenge is evaluated by the model's loss (e.g., cross-entropy or mean squared error). "
+        "The Sampling Challenge will use image quality metrics like PSNR and SSIM, while the Evaluation Challenge is assessed by comparing your ranking "
+        "against the ground-truth ranking using correlation metrics such as Spearman's rho."
+    )
+
+    st.markdown("**Are there restrictions on using future frames or actions?**")
+    st.write(
+        "You are allowed to use future actions to condition your frame predictions, but you must not use any actual future frames during inference."
+    )
+
+    st.markdown("**Can I modify the baseline models?**")
+    st.write(
+        "Yes, you are encouraged to enhance the provided baseline models. However, any modifications must be thoroughly documented in your technical report, "
+        "and your final submission must be fully reproducible."
+    )
+
+    st.markdown("**How do I ensure my submission is reproducible?**")
+    st.write(
+        "Include complete code, configuration files, and clear instructions for running your model. Non-reproducible submissions may be disqualified."
+    )
+
+    st.markdown("**Is there a limit on model size or inference time?**")
+    st.write(
+        "There are no strict limits on model size or inference time, but solutions will be evaluated on both performance and efficiency."
+    )
+
+    st.markdown("**What if I submit a Compression Challenge solution using the old tokenizer?**")
+    st.write(
+        "Solutions using the MAGVIT tokenizer that achieve a loss below 8.0 on our held-out test set will continue to be honored for the $10K award "
+        "for six months from March 1, 2025. However, these submissions will not be eligible for the CVPR/ICCV competitions, which will use the Cosmos tokenizer as the standard."
+    )
+
 
     st.markdown("---")
 
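The rewritten FAQ names PSNR for the Sampling Challenge and rank-correlation scoring such as Spearman's rho for the Evaluation Challenge. As a rough, unofficial sketch of what those two numbers measure (plain-Python stand-ins, not the organizers' scoring code), both fit in a few lines:

```python
import math

def psnr(pred, target, max_val=255.0):
    """Peak Signal-to-Noise Ratio between two equal-length pixel sequences."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_val ** 2 / mse)

def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two rankings given as rank lists (no ties)."""
    n = len(rank_a)
    d_sq = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d_sq / (n * (n ** 2 - 1))

# Sampling Challenge sketch: a flattened frame that is off by 1 everywhere
target = [128] * 16
pred = [129] * 16
print(round(psnr(pred, target), 2))  # MSE = 1 -> 10*log10(255^2) ~ 48.13 dB

# Evaluation Challenge sketch: submitted policy ranking vs. ground truth
print(spearman_rho([1, 2, 3, 4], [1, 3, 2, 4]))  # 0.8
```

Real submissions would of course compute PSNR over full 512×512 RGB frames (e.g., with NumPy or scikit-image), but the definitions are the same.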