---
library_name: custom
tags:
- robotics
- diffusion
- mixture-of-experts
- multi-modal
license: mit
datasets:
- CALVIN
language:
- en
pipeline_tag: robotics
---
# MoDE (Mixture of Diffusion Experts) Model

This model implements a Mixture of Diffusion Experts architecture for robotic manipulation, combining transformer-based processing with expert routing and diffusion-based action prediction.

## Model Architecture
- Base Architecture: MoDE with custom Mixture of Experts Transformer
- Vision Encoder: {getattr(model_instance, 'resnet_type', 'ResNet')} with FiLM conditioning
- EMA: Enabled
- Action Window Size: {model_instance.act_window_size}
- Sampling Steps: {model_instance.num_sampling_steps}
- Sampler Type: {model_instance.sampler_type}
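To illustrate the expert-routing component, here is a minimal sketch of top-k gating, the standard mechanism a Mixture-of-Experts transformer layer uses to dispatch tokens to experts. All names and sizes below are illustrative and not taken from the MoDE implementation.

```python
import numpy as np

def route_topk(tokens: np.ndarray, gate_w: np.ndarray, k: int = 2):
    """Return per-token expert indices and softmax weights over the top-k experts."""
    logits = tokens @ gate_w                       # (n_tokens, n_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]     # indices of the k highest-scoring experts
    picked = np.take_along_axis(logits, topk, axis=-1)
    picked = picked - picked.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(picked)
    weights /= weights.sum(axis=-1, keepdims=True)
    return topk, weights

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))    # 4 tokens, hidden size 8 (illustrative)
gate_w = rng.normal(size=(8, 5))    # learned gate over 5 experts (illustrative)
idx, w = route_topk(tokens, gate_w, k=2)
```

Each token's output is then a weighted sum of its selected experts' outputs, with the weights summing to one per token.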

## Input/Output Specifications
- RGB Static Camera: (B, T, 3, H, W) tensor
- RGB Gripper Camera: (B, T, 3, H, W) tensor
- Language Instructions: Text strings
- Output: (B, T, 7) tensor representing 7-DoF actions
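A hypothetical batch matching these shapes, shown with NumPy arrays in place of torch tensors. The batch size, window length, and 224x224 resolution are placeholder values; check the repository config for the actual camera resolutions and action window size.

```python
import numpy as np

B, T, H, W = 1, 10, 224, 224  # illustrative sizes, not MoDE's actual config
rgb_static = np.zeros((B, T, 3, H, W), dtype=np.float32)   # static camera frames
rgb_gripper = np.zeros((B, T, 3, H, W), dtype=np.float32)  # gripper camera frames
actions = np.zeros((B, T, 7), dtype=np.float32)            # 7-DoF action placeholder
```

The 7 action dimensions typically cover end-effector position, orientation, and gripper open/close in CALVIN-style setups, but consult the dataset documentation for the exact convention.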

## Usage Example
```python
from huggingface_hub import hf_hub_download
import torch

# `model` is a MoDE instance constructed from this repository's code;
# build it before loading the pretrained weights.
weights_path = hf_hub_download(repo_id="{repo_name}", filename="model_cleaned.safetensors")
model.load_pretrained_parameters(weights_path)

obs = {
    "rgb_obs": {
        "rgb_static": static_image,    # (B, T, 3, H, W) tensor
        "rgb_gripper": gripper_image   # (B, T, 3, H, W) tensor
    }
}
goal = {"lang_text": "pick up the blue cube"}
action = model.step(obs, goal)  # (B, T, 7) actions
```

## Training Configuration
- Optimizer: AdamW
- Learning Rate: {config.optimizer.learning_rate}
- Weight Decay: {config.optimizer.transformer_weight_decay}
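A `transformer_weight_decay` setting suggests a separate decay value for transformer parameters. One common way to wire this up is AdamW parameter groups, sketched below with placeholder parameter names and values rather than MoDE's actual config.

```python
# Illustrative named parameters; in practice these come from model.named_parameters().
named_params = {
    "vision_encoder.conv1.weight": "p0",
    "transformer.block0.attn.weight": "p1",
    "transformer.block0.mlp.weight": "p2",
}
transformer_wd, default_wd = 0.05, 0.0  # placeholder decay values

# Split parameters into per-group dicts with their own weight decay.
groups = [
    {"params": [p for n, p in named_params.items() if n.startswith("transformer.")],
     "weight_decay": transformer_wd},
    {"params": [p for n, p in named_params.items() if not n.startswith("transformer.")],
     "weight_decay": default_wd},
]
# torch.optim.AdamW(groups, lr=...) would then apply each group's decay separately.
```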