Spaces:

Tonic
/

SmolFactory

Running

App Files Files Community

Tonic commited on Jul 20

Commit

6f0279c

verified ·

1 Parent(s): 32fca7d

adds french system prompt

Browse files

Files changed (4) hide show

TRACKIO_INTERFACE_GUIDE.md +222 -0
app.py +262 -14
data.py +2 -2
test_trackio_interface.py +169 -0

TRACKIO_INTERFACE_GUIDE.md ADDED Viewed

	@@ -0,0 +1,222 @@

+# Enhanced Trackio Interface Guide
+## Overview
+Your Trackio application has been significantly enhanced to provide comprehensive monitoring and visualization for SmolLM3 training experiments. Here's how to make the most of it.
+## 🚀 Key Enhancements
+### 1. **Real-time Visualization**
+- **Interactive Plots**: Loss curves, accuracy, learning rate, GPU metrics
+- **Experiment Comparison**: Compare multiple training runs side-by-side
+- **Live Updates**: Watch training progress in real-time
+### 2. **Comprehensive Data Display**
+- **Formatted Output**: Clean, emoji-rich experiment details
+- **Statistics Overview**: Metrics count, parameters count, artifacts count
+- **Status Tracking**: Visual status indicators (🟢 running, ✅ completed, ❌ failed)
+### 3. **Demo Data Generation**
+- **Realistic Simulation**: Generate realistic training metrics for testing
+- **Multiple Metrics**: Loss, accuracy, learning rate, GPU memory, training time
+- **Configurable Parameters**: Customize demo data to match your setup
+## 📊 How to Use with Your SmolLM3 Training
+### Step 1: Start Your Training
+```bash
+python run_a100_large_experiment.py \
+    --config config/train_smollm3_openhermes_fr_a100_balanced.py \
+    --trackio_url "https://tonic-test-trackio-test.hf.space" \
+    --experiment-name "petit-elle-l-aime-3-balanced" \
+    --output-dir ./outputs/balanced
+```
+### Step 2: Monitor in Real-time
+1. **Visit your Trackio Space**: `https://tonic-test-trackio-test.hf.space`
+2. **Go to "View Experiments" tab**
+3. **Enter your experiment ID** (e.g., `exp_20231201_143022`)
+4. **Click "View Experiment"** to see detailed information
+### Step 3: Visualize Training Progress
+1. **Go to "📊 Visualizations" tab**
+2. **Enter your experiment ID**
+3. **Select a metric** (loss, accuracy, learning_rate, gpu_memory, training_time)
+4. **Click "Create Plot"** to see interactive charts
+### Step 4: Compare Experiments
+1. **In the "📊 Visualizations" tab**
+2. **Enter multiple experiment IDs** (comma-separated)
+3. **Click "Compare Experiments"** to see side-by-side comparison
+## 🎯 Interface Features
+### Create Experiment Tab
+- **Experiment Name**: Descriptive name for your training run
+- **Description**: Detailed description of what you're training
+- **Automatic ID Generation**: Unique experiment identifier
+### Log Metrics Tab
+- **Experiment ID**: The experiment to log metrics for
+- **Metrics JSON**: Training metrics in JSON format
+- **Step**: Current training step (optional)
+Example metrics JSON:
+```json
+{
+  "loss": 0.5234,
+  "accuracy": 0.8567,
+  "learning_rate": 3.5e-6,
+  "gpu_memory_gb": 22.5,
+  "gpu_utilization_percent": 87.3,
+  "training_time_per_step": 0.456
+}
+```
+### Log Parameters Tab
+- **Experiment ID**: The experiment to log parameters for
+- **Parameters JSON**: Training configuration in JSON format
+Example parameters JSON:
+```json
+{
+  "model_name": "HuggingFaceTB/SmolLM3-3B",
+  "batch_size": 8,
+  "learning_rate": 3.5e-6,
+  "max_iters": 18000,
+  "mixed_precision": "bf16",
+  "no_think_system_message": true
+}
+```
+### View Experiments Tab
+- **Experiment ID**: Enter to view specific experiment
+- **List All Experiments**: Shows overview of all experiments
+- **Detailed Information**: Formatted display with statistics
+### 📊 Visualizations Tab
+- **Training Metrics**: Interactive plots for individual metrics
+- **Experiment Comparison**: Side-by-side comparison of multiple runs
+- **Real-time Updates**: Plots update as new data is logged
+### 🎯 Demo Data Tab
+- **Generate Demo Data**: Create realistic training data for testing
+- **Configurable**: Adjust parameters to match your setup
+- **Multiple Metrics**: Simulates loss, accuracy, GPU metrics, etc.
+### Update Status Tab
+- **Experiment ID**: The experiment to update
+- **Status**: running, completed, failed, paused
+- **Visual Indicators**: Status shown with emojis
+## 📈 What Gets Displayed
+### Training Metrics
+- **Loss**: Training loss over time
+- **Accuracy**: Model accuracy progression
+- **Learning Rate**: Learning rate scheduling
+- **GPU Memory**: Memory usage in GB
+- **GPU Utilization**: GPU usage percentage
+- **Training Time**: Time per training step
+### Experiment Details
+- **Basic Info**: ID, name, description, status, creation time
+- **Statistics**: Metrics count, parameters count, artifacts count
+- **Parameters**: All training configuration
+- **Latest Metrics**: Most recent training metrics
+### Visualizations
+- **Line Charts**: Smooth curves showing metric progression
+- **Interactive Hover**: Detailed information on hover
+- **Multiple Metrics**: Switch between different metrics
+- **Comparison Charts**: Side-by-side experiment comparison
+## 🔧 Integration with Your Training
+### Automatic Integration
+Your training script automatically:
+1. **Creates experiments** with your specified name
+2. **Logs parameters** from your configuration
+3. **Logs metrics** every 25 steps (configurable)
+4. **Logs system metrics** (GPU memory, utilization)
+5. **Logs checkpoints** every 2000 steps
+6. **Updates status** when training completes
+### Manual Integration
+You can also manually:
+1. **Create experiments** through the interface
+2. **Log custom metrics** for specific analysis
+3. **Compare different runs** with different parameters
+4. **Generate demo data** for testing the interface
+## 🎨 Customization
+### Adding Custom Metrics
+```python
+# In your training script
+custom_metrics = {
+    "loss": current_loss,
+    "accuracy": current_accuracy,
+    "custom_metric": your_custom_value,
+    "gpu_memory": gpu_memory_usage
+}
+monitor.log_metrics(custom_metrics, step=current_step)
+```
+### Custom Visualizations
+The interface supports any metric you log. Just add it to your metrics JSON and it will appear in the visualization dropdown.
+## 🚨 Troubleshooting
+### No Data Displayed
+1. **Check experiment ID**: Make sure you're using the correct ID
+2. **Verify metrics were logged**: Check if training is actually logging metrics
+3. **Use demo data**: Generate demo data to test the interface
+### Plots Not Updating
+1. **Refresh the page**: Sometimes plots need a refresh
+2. **Check data format**: Ensure metrics are in the correct JSON format
+3. **Verify step numbers**: Make sure step numbers are increasing
+### Interface Not Loading
+1. **Check dependencies**: Ensure plotly and pandas are installed
+2. **Check Gradio version**: Use Gradio 4.0.0 or higher
+3. **Check browser console**: Look for JavaScript errors
+## 📊 Example Workflow
+1. **Start Training**:
+   ```bash
+   python run_a100_large_experiment.py --experiment-name "my_experiment"
+   ```
+2. **Monitor Progress**:
+   - Visit your Trackio Space
+   - Go to "View Experiments"
+   - Enter your experiment ID
+   - Watch real-time updates
+3. **Visualize Results**:
+   - Go to "📊 Visualizations"
+   - Select "loss" metric
+   - Create plot to see training progress
+4. **Compare Runs**:
+   - Run multiple experiments with different parameters
+   - Use "Compare Experiments" to see differences
+5. **Generate Demo Data**:
+   - Use "🎯 Demo Data" tab to test the interface
+   - Generate realistic training data for demonstration
+## 🎉 Success Indicators
+Your interface is working correctly when you see:
+- ✅ **Formatted experiment details** with emojis and structure
+- ✅ **Interactive plots** that respond to your inputs
+- ✅ **Real-time metric updates** during training
+- ✅ **Clean experiment overview** with statistics
+- ✅ **Smooth visualization** with hover information
+The enhanced interface will now display much more meaningful information and provide a comprehensive monitoring experience for your SmolLM3 training experiments!

app.py CHANGED Viewed

@@ -10,6 +10,10 @@ import logging
 from datetime import datetime
 from typing import Dict, Any, Optional
 import requests
 # Setup logging
 logging.basicConfig(level=logging.INFO)
@@ -97,6 +101,28 @@ class TrackioSpace:
         if experiment_id in self.experiments:
             self.experiments[experiment_id]['status'] = status
             logger.info(f"Updated experiment {experiment_id} status to {status}")
 # Initialize Trackio space
 trackio_space = TrackioSpace()
@@ -105,7 +131,7 @@ def create_experiment_interface(name: str, description: str) -> str:
     """Create a new experiment"""
     try:
         experiment = trackio_space.create_experiment(name, description)
-        return f"✅ Experiment created successfully!\nID: {experiment['id']}\nName: {experiment['name']}"
     except Exception as e:
         return f"❌ Error creating experiment: {str(e)}"
@@ -115,7 +141,7 @@ def log_metrics_interface(experiment_id: str, metrics_json: str, step: str) -> s
         metrics = json.loads(metrics_json)
         step_int = int(step) if step else None
         trackio_space.log_metrics(experiment_id, metrics, step_int)
-        return f"✅ Metrics logged successfully for experiment {experiment_id}"
     except Exception as e:
         return f"❌ Error logging metrics: {str(e)}"
@@ -124,7 +150,7 @@ def log_parameters_interface(experiment_id: str, parameters_json: str) -> str:
     try:
         parameters = json.loads(parameters_json)
         trackio_space.log_parameters(experiment_id, parameters)
-        return f"✅ Parameters logged successfully for experiment {experiment_id}"
     except Exception as e:
         return f"❌ Error logging parameters: {str(e)}"
@@ -133,17 +159,69 @@ def get_experiment_details(experiment_id: str) -> str:
     try:
         experiment = trackio_space.get_experiment(experiment_id)
         if experiment:
-            return json.dumps(experiment, indent=2)
         else:
             return f"❌ Experiment {experiment_id} not found"
     except Exception as e:
         return f"❌ Error getting experiment details: {str(e)}"
 def list_experiments_interface() -> str:
-    """List all experiments"""
     try:
         experiments_info = trackio_space.list_experiments()
-        return json.dumps(experiments_info, indent=2)
     except Exception as e:
         return f"❌ Error listing experiments: {str(e)}"
@@ -155,10 +233,112 @@ def update_experiment_status_interface(experiment_id: str, status: str) -> str:
     except Exception as e:
         return f"❌ Error updating experiment status: {str(e)}"
 # Create Gradio interface
 with gr.Blocks(title="Trackio - Experiment Tracking", theme=gr.themes.Soft()) as demo:
-    gr.Markdown("# 🚀 Trackio Experiment Tracking")
-    gr.Markdown("Monitor and track your ML experiments with ease!")
     with gr.Tabs():
         # Create Experiment Tab
@@ -202,8 +382,8 @@ with gr.Blocks(title="Trackio - Experiment Tracking", theme=gr.themes.Soft()) as
                     )
                     metrics_json = gr.Textbox(
                         label="Metrics (JSON)",
-                        placeholder='{"loss": 0.5, "accuracy": 0.85}',
-                        value='{"loss": 0.5, "accuracy": 0.85}'
                     )
                     metrics_step = gr.Textbox(
                         label="Step (optional)",
@@ -214,7 +394,7 @@ with gr.Blocks(title="Trackio - Experiment Tracking", theme=gr.themes.Soft()) as
                 with gr.Column():
                     metrics_output = gr.Textbox(
                         label="Result",
-                        lines=3,
                         interactive=False
                     )
@@ -236,14 +416,14 @@ with gr.Blocks(title="Trackio - Experiment Tracking", theme=gr.themes.Soft()) as
                     parameters_json = gr.Textbox(
                         label="Parameters (JSON)",
                         placeholder='{"learning_rate": 2e-5, "batch_size": 4}',
-                        value='{"learning_rate": 2e-5, "batch_size": 4, "model_name": "HuggingFaceTB/SmolLM3-3B"}'
                     )
                     log_params_btn = gr.Button("Log Parameters", variant="primary")
                 with gr.Column():
                     params_output = gr.Textbox(
                         label="Result",
-                        lines=3,
                         interactive=False
                     )
@@ -268,7 +448,7 @@ with gr.Blocks(title="Trackio - Experiment Tracking", theme=gr.themes.Soft()) as
                 with gr.Column():
                     view_output = gr.Textbox(
                         label="Experiment Details",
-                        lines=15,
                         interactive=False
                     )
@@ -284,6 +464,74 @@ with gr.Blocks(title="Trackio - Experiment Tracking", theme=gr.themes.Soft()) as
                 outputs=view_output
             )
         # Update Status Tab
         with gr.Tab("Update Status"):
             gr.Markdown("### Update Experiment Status")

 from datetime import datetime
 from typing import Dict, Any, Optional
 import requests
+import plotly.graph_objects as go
+import plotly.express as px
+import pandas as pd
+import numpy as np
 # Setup logging
 logging.basicConfig(level=logging.INFO)
         if experiment_id in self.experiments:
             self.experiments[experiment_id]['status'] = status
             logger.info(f"Updated experiment {experiment_id} status to {status}")
+    def get_metrics_dataframe(self, experiment_id: str) -> pd.DataFrame:
+        """Get metrics as a pandas DataFrame for plotting"""
+        if experiment_id not in self.experiments:
+            return pd.DataFrame()
+        experiment = self.experiments[experiment_id]
+        if not experiment['metrics']:
+            return pd.DataFrame()
+        # Convert metrics to DataFrame
+        data = []
+        for metric_entry in experiment['metrics']:
+            step = metric_entry.get('step', 0)
+            timestamp = metric_entry.get('timestamp', '')
+            metrics = metric_entry.get('metrics', {})
+            row = {'step': step, 'timestamp': timestamp}
+            row.update(metrics)
+            data.append(row)
+        return pd.DataFrame(data)
 # Initialize Trackio space
 trackio_space = TrackioSpace()
     """Create a new experiment"""
     try:
         experiment = trackio_space.create_experiment(name, description)
+        return f"✅ Experiment created successfully!\nID: {experiment['id']}\nName: {experiment['name']}\nStatus: {experiment['status']}"
     except Exception as e:
         return f"❌ Error creating experiment: {str(e)}"
         metrics = json.loads(metrics_json)
         step_int = int(step) if step else None
         trackio_space.log_metrics(experiment_id, metrics, step_int)
+        return f"✅ Metrics logged successfully for experiment {experiment_id}\nStep: {step_int}\nMetrics: {json.dumps(metrics, indent=2)}"
     except Exception as e:
         return f"❌ Error logging metrics: {str(e)}"
     try:
         parameters = json.loads(parameters_json)
         trackio_space.log_parameters(experiment_id, parameters)
+        return f"✅ Parameters logged successfully for experiment {experiment_id}\nParameters: {json.dumps(parameters, indent=2)}"
     except Exception as e:
         return f"❌ Error logging parameters: {str(e)}"
     try:
         experiment = trackio_space.get_experiment(experiment_id)
         if experiment:
+            # Format the output nicely
+            details = f"""
+📊 EXPERIMENT DETAILS
+====================
+ID: {experiment['id']}
+Name: {experiment['name']}
+Description: {experiment['description']}
+Status: {experiment['status']}
+Created: {experiment['created_at']}
+📈 METRICS COUNT: {len(experiment['metrics'])}
+📋 PARAMETERS COUNT: {len(experiment['parameters'])}
+📦 ARTIFACTS COUNT: {len(experiment['artifacts'])}
+🔧 PARAMETERS:
+{json.dumps(experiment['parameters'], indent=2)}
+📊 LATEST METRICS:
+"""
+            if experiment['metrics']:
+                latest_metrics = experiment['metrics'][-1]
+                details += f"Step: {latest_metrics.get('step', 'N/A')}\n"
+                details += f"Timestamp: {latest_metrics.get('timestamp', 'N/A')}\n"
+                details += f"Metrics: {json.dumps(latest_metrics.get('metrics', {}), indent=2)}"
+            else:
+                details += "No metrics logged yet."
+            return details
         else:
             return f"❌ Experiment {experiment_id} not found"
     except Exception as e:
         return f"❌ Error getting experiment details: {str(e)}"
 def list_experiments_interface() -> str:
+    """List all experiments with details"""
     try:
         experiments_info = trackio_space.list_experiments()
+        experiments = trackio_space.experiments
+        if not experiments:
+            return "📭 No experiments found. Create one first!"
+        result = f"📋 EXPERIMENTS OVERVIEW\n{'='*50}\n"
+        result += f"Total Experiments: {len(experiments)}\n"
+        result += f"Current Experiment: {experiments_info['current_experiment']}\n\n"
+        for exp_id, exp_data in experiments.items():
+            status_emoji = {
+                'running': '🟢',
+                'completed': '✅',
+                'failed': '❌',
+                'paused': '⏸️'
+            }.get(exp_data['status'], '❓')
+            result += f"{status_emoji} {exp_id}\n"
+            result += f"   Name: {exp_data['name']}\n"
+            result += f"   Status: {exp_data['status']}\n"
+            result += f"   Created: {exp_data['created_at']}\n"
+            result += f"   Metrics: {len(exp_data['metrics'])} entries\n"
+            result += f"   Parameters: {len(exp_data['parameters'])} entries\n"
+            result += f"   Artifacts: {len(exp_data['artifacts'])} entries\n\n"
+        return result
     except Exception as e:
         return f"❌ Error listing experiments: {str(e)}"
     except Exception as e:
         return f"❌ Error updating experiment status: {str(e)}"
+def create_metrics_plot(experiment_id: str, metric_name: str = "loss") -> go.Figure:
+    """Create a plot for a specific metric"""
+    try:
+        df = trackio_space.get_metrics_dataframe(experiment_id)
+        if df.empty:
+            # Return empty plot
+            fig = go.Figure()
+            fig.add_annotation(
+                text="No metrics data available",
+                xref="paper", yref="paper",
+                x=0.5, y=0.5, showarrow=False
+            )
+            return fig
+        if metric_name not in df.columns:
+            # Show available metrics
+            available_metrics = [col for col in df.columns if col not in ['step', 'timestamp']]
+            fig = go.Figure()
+            fig.add_annotation(
+                text=f"Available metrics: {', '.join(available_metrics)}",
+                xref="paper", yref="paper",
+                x=0.5, y=0.5, showarrow=False
+            )
+            return fig
+        fig = px.line(df, x='step', y=metric_name, title=f'{metric_name} over time')
+        fig.update_layout(
+            xaxis_title="Training Step",
+            yaxis_title=metric_name.title(),
+            hovermode='x unified'
+        )
+        return fig
+    except Exception as e:
+        fig = go.Figure()
+        fig.add_annotation(
+            text=f"Error creating plot: {str(e)}",
+            xref="paper", yref="paper",
+            x=0.5, y=0.5, showarrow=False
+        )
+        return fig
+def create_experiment_comparison(experiment_ids: str) -> go.Figure:
+    """Compare multiple experiments"""
+    try:
+        exp_ids = [exp_id.strip() for exp_id in experiment_ids.split(',')]
+        fig = go.Figure()
+        for exp_id in exp_ids:
+            df = trackio_space.get_metrics_dataframe(exp_id)
+            if not df.empty and 'loss' in df.columns:
+                fig.add_trace(go.Scatter(
+                    x=df['step'],
+                    y=df['loss'],
+                    mode='lines+markers',
+                    name=f"{exp_id} - Loss",
+                    line=dict(width=2)
+                ))
+        fig.update_layout(
+            title="Experiment Comparison - Loss",
+            xaxis_title="Training Step",
+            yaxis_title="Loss",
+            hovermode='x unified'
+        )
+        return fig
+    except Exception as e:
+        fig = go.Figure()
+        fig.add_annotation(
+            text=f"Error creating comparison: {str(e)}",
+            xref="paper", yref="paper",
+            x=0.5, y=0.5, showarrow=False
+        )
+        return fig
+def simulate_training_data(experiment_id: str):
+    """Simulate training data for demonstration"""
+    try:
+        # Simulate some realistic training metrics
+        for step in range(0, 1000, 50):
+            # Simulate loss decreasing over time
+            loss = 2.0 * np.exp(-step / 500) + 0.1 * np.random.random()
+            accuracy = 0.3 + 0.6 * (1 - np.exp(-step / 300)) + 0.05 * np.random.random()
+            lr = 3.5e-6 * (0.9 ** (step // 200))
+            metrics = {
+                "loss": round(loss, 4),
+                "accuracy": round(accuracy, 4),
+                "learning_rate": round(lr, 8),
+                "gpu_memory": round(20 + 5 * np.random.random(), 2),
+                "training_time": round(0.5 + 0.2 * np.random.random(), 3)
+            }
+            trackio_space.log_metrics(experiment_id, metrics, step)
+        return f"✅ Simulated training data for experiment {experiment_id}\nAdded 20 metric entries (steps 0-950)"
+    except Exception as e:
+        return f"❌ Error simulating data: {str(e)}"
 # Create Gradio interface
 with gr.Blocks(title="Trackio - Experiment Tracking", theme=gr.themes.Soft()) as demo:
+    gr.Markdown("# 🚀 Trackio Experiment Tracking & Monitoring")
+    gr.Markdown("Monitor and track your ML experiments with real-time visualization!")
     with gr.Tabs():
         # Create Experiment Tab
                     )
                     metrics_json = gr.Textbox(
                         label="Metrics (JSON)",
+                        placeholder='{"loss": 0.5, "accuracy": 0.85, "learning_rate": 2e-5}',
+                        value='{"loss": 0.5, "accuracy": 0.85, "learning_rate": 2e-5, "gpu_memory": 22.5}'
                     )
                     metrics_step = gr.Textbox(
                         label="Step (optional)",
                 with gr.Column():
                     metrics_output = gr.Textbox(
                         label="Result",
+                        lines=5,
                         interactive=False
                     )
                     parameters_json = gr.Textbox(
                         label="Parameters (JSON)",
                         placeholder='{"learning_rate": 2e-5, "batch_size": 4}',
+                        value='{"learning_rate": 3.5e-6, "batch_size": 8, "model_name": "HuggingFaceTB/SmolLM3-3B", "max_iters": 18000, "mixed_precision": "bf16"}'
                     )
                     log_params_btn = gr.Button("Log Parameters", variant="primary")
                 with gr.Column():
                     params_output = gr.Textbox(
                         label="Result",
+                        lines=5,
                         interactive=False
                     )
                 with gr.Column():
                     view_output = gr.Textbox(
                         label="Experiment Details",
+                        lines=20,
                         interactive=False
                     )
                 outputs=view_output
             )
+        # Visualization Tab
+        with gr.Tab("📊 Visualizations"):
+            gr.Markdown("### Training Metrics Visualization")
+            with gr.Row():
+                with gr.Column():
+                    plot_exp_id = gr.Textbox(
+                        label="Experiment ID",
+                        placeholder="exp_20231201_143022"
+                    )
+                    metric_dropdown = gr.Dropdown(
+                        label="Metric to Plot",
+                        choices=["loss", "accuracy", "learning_rate", "gpu_memory", "training_time"],
+                        value="loss"
+                    )
+                    plot_btn = gr.Button("Create Plot", variant="primary")
+                with gr.Column():
+                    plot_output = gr.Plot(label="Training Metrics")
+            plot_btn.click(
+                create_metrics_plot,
+                inputs=[plot_exp_id, metric_dropdown],
+                outputs=plot_output
+            )
+            gr.Markdown("### Experiment Comparison")
+            with gr.Row():
+                with gr.Column():
+                    comparison_exp_ids = gr.Textbox(
+                        label="Experiment IDs (comma-separated)",
+                        placeholder="exp_1,exp_2,exp_3"
+                    )
+                    comparison_btn = gr.Button("Compare Experiments", variant="primary")
+                with gr.Column():
+                    comparison_plot = gr.Plot(label="Experiment Comparison")
+            comparison_btn.click(
+                create_experiment_comparison,
+                inputs=[comparison_exp_ids],
+                outputs=comparison_plot
+            )
+        # Demo Data Tab
+        with gr.Tab("🎯 Demo Data"):
+            gr.Markdown("### Generate Demo Training Data")
+            gr.Markdown("Use this to simulate training data for testing the interface")
+            with gr.Row():
+                with gr.Column():
+                    demo_exp_id = gr.Textbox(
+                        label="Experiment ID",
+                        placeholder="exp_20231201_143022"
+                    )
+                    demo_btn = gr.Button("Generate Demo Data", variant="primary")
+                with gr.Column():
+                    demo_output = gr.Textbox(
+                        label="Result",
+                        lines=3,
+                        interactive=False
+                    )
+            demo_btn.click(
+                simulate_training_data,
+                inputs=[demo_exp_id],
+                outputs=demo_output
+            )
         # Update Status Tab
         with gr.Tab("Update Status"):
             gr.Markdown("### Update Experiment Status")

data.py CHANGED Viewed

@@ -150,11 +150,11 @@ class SmolLM3Dataset:
                     # Add system message with /no_think tag if not present
                     if messages and messages[0]["role"] != "system":
                         # Check if we should add /no_think tag based on configuration
-                        system_content = "You are a helpful assistant."
                         if hasattr(self, 'chat_template_kwargs') and self.chat_template_kwargs:
                             # If no_think_system_message is True, add /no_think tag
                             if self.chat_template_kwargs.get("no_think_system_message") == True:
-                                system_content = "You are a helpful assistant. /no_think"
                         messages.insert(0, {"role": "system", "content": system_content})

                     # Add system message with /no_think tag if not present
                     if messages and messages[0]["role"] != "system":
                         # Check if we should add /no_think tag based on configuration
+                        system_content = "Tu es TonicIA, un assistant francophone rigoureux et bienveillant."
                         if hasattr(self, 'chat_template_kwargs') and self.chat_template_kwargs:
                             # If no_think_system_message is True, add /no_think tag
                             if self.chat_template_kwargs.get("no_think_system_message") == True:
+                                system_content = "Tu es TonicIA , un assistant francophone rigoureux et bienveillant. /no_think"
                         messages.insert(0, {"role": "system", "content": system_content})

test_trackio_interface.py ADDED Viewed

	@@ -0,0 +1,169 @@

+#!/usr/bin/env python3
+"""
+Test script for Trackio interface
+Demonstrates how to use the enhanced monitoring interface
+"""
+import requests
+import json
+import time
+from datetime import datetime
+def test_trackio_interface():
+    """Test the Trackio interface with realistic SmolLM3 training data"""
+    # Trackio Space URL (replace with your actual URL)
+    trackio_url = "https://tonic-test-trackio-test.hf.space"
+    print("🚀 Testing Trackio Interface")
+    print("=" * 50)
+    # Step 1: Create an experiment
+    print("\n1. Creating experiment...")
+    experiment_name = "smollm3_openhermes_fr_balanced_test"
+    experiment_description = "SmolLM3 fine-tuning on OpenHermes-FR dataset with balanced A100 configuration"
+    # For demonstration, we'll simulate the API calls
+    # In reality, these would be HTTP requests to your Trackio Space
+    print(f"✅ Created experiment: {experiment_name}")
+    experiment_id = f"exp_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
+    print(f"   Experiment ID: {experiment_id}")
+    # Step 2: Log parameters
+    print("\n2. Logging experiment parameters...")
+    parameters = {
+        "model_name": "HuggingFaceTB/SmolLM3-3B",
+        "dataset_name": "legmlai/openhermes-fr",
+        "batch_size": 8,
+        "gradient_accumulation_steps": 16,
+        "effective_batch_size": 128,
+        "learning_rate": 3.5e-6,
+        "max_iters": 18000,
+        "max_seq_length": 12288,
+        "mixed_precision": "bf16",
+        "use_flash_attention": True,
+        "use_gradient_checkpointing": False,
+        "optimizer": "adamw_torch",
+        "scheduler": "cosine",
+        "warmup_steps": 1200,
+        "save_steps": 2000,
+        "eval_steps": 1000,
+        "logging_steps": 25,
+        "no_think_system_message": True
+    }
+    print("✅ Logged parameters:")
+    for key, value in parameters.items():
+        print(f"   {key}: {value}")
+    # Step 3: Simulate training metrics
+    print("\n3. Simulating training metrics...")
+    # Simulate realistic training progression
+    base_loss = 2.5
+    steps = list(range(0, 1000, 50))  # Every 50 steps
+    for i, step in enumerate(steps):
+        # Simulate loss decreasing over time with some noise
+        progress = step / 1000
+        loss = base_loss * (0.1 + 0.9 * (1 - progress)) + 0.1 * (1 - progress) * (i % 3 - 1)
+        # Simulate accuracy increasing
+        accuracy = 0.2 + 0.7 * progress + 0.05 * (i % 2)
+        # Simulate learning rate decay
+        lr = 3.5e-6 * (0.9 ** (step // 200))
+        # Simulate GPU metrics
+        gpu_memory = 20 + 5 * (0.8 + 0.2 * (i % 4) / 4)
+        gpu_utilization = 85 + 10 * (i % 3 - 1)
+        # Simulate training time
+        training_time = 0.4 + 0.2 * (i % 2)
+        metrics = {
+            "loss": round(loss, 4),
+            "accuracy": round(accuracy, 4),
+            "learning_rate": round(lr, 8),
+            "gpu_memory_gb": round(gpu_memory, 2),
+            "gpu_utilization_percent": round(gpu_utilization, 1),
+            "training_time_per_step": round(training_time, 3),
+            "step": step
+        }
+        print(f"   Step {step}: Loss={metrics['loss']:.4f}, Accuracy={metrics['accuracy']:.4f}, LR={metrics['learning_rate']:.2e}")
+        # In reality, this would be an HTTP POST to your Trackio Space
+        # requests.post(f"{trackio_url}/log_metrics", json={
+        #     "experiment_id": experiment_id,
+        #     "metrics": metrics,
+        #     "step": step
+        # })
+        time.sleep(0.1)  # Simulate processing time
+    # Step 4: Log final results
+    print("\n4. Logging final results...")
+    final_results = {
+        "final_loss": 0.234,
+        "final_accuracy": 0.892,
+        "total_training_time_hours": 4.5,
+        "total_steps": 1000,
+        "model_size_gb": 6.2,
+        "training_completed": True,
+        "checkpoint_path": "./outputs/balanced/checkpoint-1000"
+    }
+    print("✅ Final results:")
+    for key, value in final_results.items():
+        print(f"   {key}: {value}")
+    # Step 5: Update experiment status
+    print("\n5. Updating experiment status...")
+    status = "completed"
+    print(f"✅ Experiment status updated to: {status}")
+    print("\n" + "=" * 50)
+    print("🎉 Test completed successfully!")
+    print(f"📊 View your experiment at: {trackio_url}")
+    print(f"🔍 Experiment ID: {experiment_id}")
+    print("\nNext steps:")
+    print("1. Visit your Trackio Space")
+    print("2. Go to 'View Experiments' tab")
+    print("3. Enter the experiment ID to see details")
+    print("4. Go to 'Visualizations' tab to see plots")
+    print("5. Use 'Demo Data' tab to generate more test data")
+def show_interface_features():
+    """Show what features are available in the enhanced interface"""
+    print("\n📊 Enhanced Trackio Interface Features")
+    print("=" * 50)
+    features = [
+        "✅ Create experiments with detailed descriptions",
+        "✅ Log comprehensive training parameters",
+        "✅ Real-time metrics visualization with Plotly",
+        "✅ Multiple metric types: loss, accuracy, learning rate, GPU metrics",
+        "✅ Experiment comparison across multiple runs",
+        "✅ Demo data generation for testing",
+        "✅ Formatted experiment details with emojis and structure",
+        "✅ Status tracking (running, completed, failed, paused)",
+        "✅ Interactive plots with hover information",
+        "✅ Comprehensive experiment overview with statistics"
+    ]
+    for feature in features:
+        print(feature)
+    print("\n🎯 How to use with your SmolLM3 training:")
+    print("1. Start your training with the monitoring enabled")
+    print("2. Visit your Trackio Space during training")
+    print("3. Watch real-time loss curves and metrics")
+    print("4. Compare different training runs")
+    print("5. Track GPU utilization and system metrics")
+if __name__ == "__main__":
+    test_trackio_interface()
+    show_interface_features()