import streamlit as st
import pandas as pd
import streamlit.components.v1 as components


def scoring_section():
    # Title
    st.markdown("## Scoring")

    # Intro text
    st.write(
        "We weight performance across all three challenges, placing additional emphasis on the Evaluation Challenge. "
        "Each team's final rank is determined by the total points it accumulates from Compression, Sampling, and Evaluation."
    )

    # Points breakdown, one column per challenge
    st.markdown("### Points Breakdown")
    col1, col2, col3 = st.columns(3)

    with col1:
        st.markdown('<h3 style="margin-left:15px;">Compression</h3>', unsafe_allow_html=True)
        st.markdown(
            """
            - **1st Place**: 10 points
            - **2nd Place**: 7 points
            - **3rd Place**: 5 points
            """
        )

    with col2:
        st.markdown('<h3 style="margin-left:15px;">Sampling</h3>', unsafe_allow_html=True)
        st.markdown(
            """
            - **1st Place**: 10 points
            - **2nd Place**: 7 points
            - **3rd Place**: 5 points
            """
        )

    with col3:
        st.markdown('<h3 style="margin-left:15px;">Evaluation</h3>', unsafe_allow_html=True)
        st.markdown(
            """
            - **1st Place**: 20 points
            - **2nd Place**: 14 points
            - **3rd Place**: 10 points
            """
        )

    # Tie-breakers in an expander for a cleaner layout
    with st.expander("Tie-Breakers"):
        st.write(
            "The overall winner will be the team with the highest total points. "
            "In the event of a tie, the following tie-breakers will be applied in order:\n\n"
            "1. Highest Evaluation Challenge score\n"
            "2. Highest Sampling Challenge score\n"
            "3. Highest Compression Challenge score\n\n"
        )

    # Overall leaderboard
    st.write(
        "The overall leaderboard, which shows total points across all challenges, will go live on **March 10th**. "
        "Each challenge (**Compression**, **Sampling**, and **Evaluation**) will also have its own leaderboard on its "
        "respective Hugging Face submission server."
    )


def main():
    st.set_page_config(page_title="1X World Model Challenge")
    st.title("World Model Challenge")

    st.markdown("## Welcome")
    st.write(
        "Welcome to the World Model Challenge. This platform hosts three challenges "
        "designed to advance research in world models for robotics: Compression, Sampling, and Evaluation."
    )
    st.markdown("---")

    st.markdown("## Motivation")
    st.write(
        "Real-world robotics faces a fundamental challenge: environments are dynamic and change over time, "
        "making consistent evaluation of robot performance difficult. World models offer a solution by "
        "learning to simulate complex real-world interactions from raw sensor data. We believe these learned simulators will enable "
        "robust evaluation and iterative improvement of robot policies without the constraints of a physical testbed."
    )
    st.image(
        "assets/model_performance_over_time.webp",
        caption="An example T-shirt folding model we trained that degrades in performance over the course of 50 days.",
        use_container_width=True
    )
    st.markdown("---")

    st.markdown("## The Challenges")

    st.markdown("#### Compression Challenge")
    st.write(
        "In the Compression Challenge, your task is to train a model to compress our robot logs effectively "
        "while preserving the critical details needed to understand and predict future interactions. "
        "Success in this challenge is measured by the loss of your model: the lower the loss, the better "
        "your model captures the complexities of real-world robot behavior."
    )

    st.markdown("#### Sampling Challenge")
    st.write(
        "In the Sampling Challenge, your task is to predict the video frame two seconds into the future, "
        "given a short clip of robot interactions. The goal is to produce a coherent and plausible continuation "
        "of the video that accurately reflects the dynamics of the scene. Your submission will be judged on "
        "how closely it matches the actual frame."
    )

    st.markdown("#### Evaluation Challenge")
    st.write(
        "The Evaluation Challenge tackles the ultimate question: Can you predict a robot's performance in the "
        "real world without physically deploying it? In this challenge, you will be provided with many different "
        "policies for a specific task. Your task is to rank these policies according to their expected real-world "
        "performance. This ranking will be compared with the actual ranking of the policies."
    )
    st.markdown("---")

    st.markdown("## Datasets")
    st.write(
        "We provide two datasets to support the 1X World Model Challenge:\n\n"
        "**Raw Data:** The [world_model_raw_data](https://huggingface.co/datasets/1x-technologies/world_model_raw_data) dataset "
        "provides raw sensor data, video logs, and annotated robot state sequences gathered from diverse real-world scenarios. "
        "This dataset is split into 100 shards, each containing a 512x512 MP4 video, a segment index mapping, and state arrays, "
        "and is licensed under CC-BY-NC-SA 4.0.\n\n"
        "**Tokenized Data:** The [world_model_tokenized_data](https://huggingface.co/datasets/1x-technologies/world_model_tokenized_data) dataset "
        "contains the raw video sequences tokenized with the NVIDIA Cosmos Tokenizer. This compact representation of the raw data "
        "is well suited to the Compression Challenge and is released under the Apache 2.0 license.\n\n"
    )
    st.markdown("---")

    scoring_section()

    # Render one FAQ entry as a styled card (question in bold, answer below)
    def display_faq(question, answer):
        st.markdown(
            f"""
            <div style="
                padding: 12px;
                margin-bottom: 12px;
                background-color: #0d1b2a;
                border-radius: 8px;
                border: 1px solid #0d1b2a;">
                <p style="font-weight: bold; margin: 0 0 4px 0; color: #ffffff;">{question}</p>
                <p style="margin: 0; color: #ffffff;">{answer}</p>
            </div>
            """,
            unsafe_allow_html=True
        )

    st.markdown("## Rules")
    st.markdown(
        """
        - **Datasets & Pretrained Weights:** The use of publicly available datasets and pretrained weights is allowed. The use of private datasets or pretrained weights is prohibited.
        - **Action Conditioning:** Future actions can be used to condition future frame predictions.
        - **Inference Time:** There is no limit on the inference time for any of the challenges.
        - **Leaderboard & Final Ranking:** The leaderboard will display results on a public test set; however, the final winner will be determined based on performance on a private test set.
        - **Reproducibility:** All submissions should be reproducible. Provide code, configuration files, and any necessary instructions to replicate your results.
        """,
        unsafe_allow_html=True
    )

    st.markdown("## FAQs")
    display_faq(
        "Do I have to participate in all challenges?",
        "No, you may choose to participate in one or more challenges. However, participating in multiple challenges may improve your overall ranking."
    )
    display_faq(
        "Can I work in a team?",
        "Yes, team submissions are welcome."
    )
    display_faq(
        "What are the submission deadlines?",
        "Deadlines for each challenge will be announced soon."
    )


if __name__ == '__main__':
    main()