Tonic committed on
Commit dbb337d · verified · 1 Parent(s): 39db0ca

adds default values to experiment name

docs/TRACKIO_TRL_FIX.md CHANGED
@@ -21,7 +21,7 @@ However, our custom monitoring implementation didn't provide this interface.
 Created a trackio module that provides the exact interface expected by TRL:
 
 ```python
-def init(project_name: str, experiment_name: Optional[str] = None, **kwargs) -> str:
+def init(project_name: Optional[str] = None, experiment_name: Optional[str] = None, **kwargs) -> str:
     """Initialize trackio experiment (TRL interface)"""
 
 def log(metrics: Dict[str, Any], step: Optional[int] = None, **kwargs):
@@ -31,6 +31,8 @@ def finish():
     """Finish trackio experiment (TRL interface)"""
 ```
 
+**Key Feature**: The `init()` function can be called without any arguments, making it compatible with TRL's expectations. It will use environment variables or defaults when no arguments are provided.
+
 ### 2. Global Trackio Module (`trackio.py`)
 
 Created a root-level `trackio.py` file that imports from our custom implementation:
@@ -103,20 +105,23 @@ Test results:
 ✅ Found required function: init
 ✅ Found required function: log
 ✅ Found required function: finish
-✅ Trackio initialization successful
+✅ Trackio initialization with args successful
+✅ Trackio initialization without args successful
 ✅ Trackio logging successful
 ✅ Trackio finish successful
+✅ init() can be called without arguments
 ✅ TRL compatibility test passed
 ✅ Monitor integration working
 ```
 
 ## Benefits
 
-1. **Resolves Training Error**: Fixes the "module trackio has no attribute init" error
+1. **Resolves Training Error**: Fixes the "module trackio has no attribute init" error and "init() missing 1 required positional argument: 'project_name'" error
 2. **Maintains Functionality**: All existing monitoring features continue to work
-3. **TRL Compatibility**: SFTTrainer can now use trackio for logging
+3. **TRL Compatibility**: SFTTrainer can now use trackio for logging, even when called without arguments
 4. **Graceful Fallback**: Continues training even if trackio initialization fails
 5. **Future-Proof**: Easy to extend with additional TRL-compatible functions
+6. **Flexible Initialization**: Supports both argument-based and environment-based configuration
 
 ## Usage
 
docs/TRACKIO_TRL_FIX_SUMMARY.md ADDED
@@ -0,0 +1,135 @@
+# Trackio TRL Fix - Complete Solution Summary
+
+## Problem Resolution
+
+We successfully resolved two related errors:
+
+1. **Original Error**: `ERROR:trainer:Training failed: module 'trackio' has no attribute 'init'`
+2. **Secondary Error**: `ERROR:train:Training failed: init() missing 1 required positional argument: 'project_name'`
+
+## Root Cause Analysis
+
+The TRL library (SFTTrainer) expects a `trackio` module with specific functions:
+- `init()` - Initialize experiment
+- `log()` - Log metrics
+- `finish()` - Finish experiment
+
+However, our custom monitoring implementation didn't provide this interface, and when we created it, the `init()` function required a `project_name` argument, but TRL was calling it without any arguments.
+
+## Complete Solution
+
+### 1. Created Trackio Module Interface (`src/trackio.py`)
+
+```python
+def init(project_name: Optional[str] = None, experiment_name: Optional[str] = None, **kwargs) -> str:
+    """Initialize trackio experiment (TRL interface)"""
+    # Provide default project name if not provided
+    if project_name is None:
+        project_name = os.environ.get('EXPERIMENT_NAME', 'smollm3_experiment')
+    # ... rest of implementation
+```
+
+**Key Features**:
+- ✅ Can be called without arguments (`trackio.init()`)
+- ✅ Uses environment variables for defaults
+- ✅ Maintains backward compatibility with argument-based calls
+- ✅ Integrates with our existing `SmolLM3Monitor` system
+
+### 2. Global Trackio Module (`trackio.py`)
+
+Created a root-level module that makes trackio available globally:
+
+```python
+from src.trackio import (
+    init, log, finish, log_config, log_checkpoint,
+    log_evaluation_results, get_experiment_url, is_available, get_monitor
+)
+```
+
+### 3. Updated Trainer Integration (`src/trainer.py`)
+
+Enhanced trainer to properly initialize trackio with fallback handling:
+
+```python
+# Initialize trackio for TRL compatibility
+try:
+    import trackio
+    experiment_id = trackio.init(
+        project_name=getattr(self.config, 'experiment_name', 'smollm3_experiment'),
+        experiment_name=getattr(self.config, 'experiment_name', 'smollm3_experiment'),
+        trackio_url=getattr(self.config, 'trackio_url', None),
+        trackio_token=getattr(self.config, 'trackio_token', None),
+        hf_token=getattr(self.config, 'hf_token', None),
+        dataset_repo=getattr(self.config, 'dataset_repo', None)
+    )
+    logger.info(f"Trackio initialized with experiment ID: {experiment_id}")
+except Exception as e:
+    logger.warning(f"Failed to initialize trackio: {e}")
+    logger.info("Continuing without trackio integration")
+```
+
+### 4. Comprehensive Testing
+
+Created test suite that verifies:
+- ✅ Function availability (`init`, `log`, `finish`)
+- ✅ Argument-less calls (`trackio.init()`)
+- ✅ Argument-based calls (`trackio.init(project_name="test")`)
+- ✅ TRL compatibility
+- ✅ Monitoring integration
+
+## Test Results
+
+```
+✅ Successfully imported trackio module
+✅ Found required function: init
+✅ Found required function: log
+✅ Found required function: finish
+✅ Trackio initialization with args successful
+✅ Trackio initialization without args successful
+✅ Trackio logging successful
+✅ Trackio finish successful
+✅ init() can be called without arguments
+✅ TRL compatibility test passed
+✅ Monitor integration working
+```
+
+## Benefits Achieved
+
+1. **✅ Resolves Both Errors**: Fixes both the missing attribute and missing argument errors
+2. **✅ TRL Compatibility**: SFTTrainer can now use trackio for logging
+3. **✅ Flexible Initialization**: Supports both argument-based and environment-based configuration
+4. **✅ Graceful Fallback**: Continues training even if trackio initialization fails
+5. **✅ Maintains Functionality**: All existing monitoring features continue to work
+6. **✅ Future-Proof**: Easy to extend with additional TRL-compatible functions
+
+## Files Modified
+
+- `src/trackio.py` - New trackio module interface with optional arguments
+- `trackio.py` - Global trackio module for TRL
+- `src/trainer.py` - Updated trainer integration with robust error handling
+- `src/__init__.py` - Package exports
+- `tests/test_trackio_trl_fix.py` - Comprehensive test suite
+- `docs/TRACKIO_TRL_FIX.md` - Detailed documentation
+
+## Usage
+
+The fix is transparent to users. Training will now work with SFTTrainer and automatically:
+
+1. Initialize trackio when SFTTrainer is created (with or without arguments)
+2. Log metrics during training
+3. Finish the experiment when training completes
+4. Fall back gracefully if trackio is not available
+
+## Verification
+
+To verify the fix works:
+
+```bash
+python tests/test_trackio_trl_fix.py
+```
+
+This should show all tests passing and confirm that the trackio module provides the interface expected by TRL library, including support for argument-less calls.
+
+## Next Steps
+
+The training should now proceed successfully without the trackio errors. The SFTTrainer will be able to use our custom monitoring system for logging metrics and experiment tracking, with full compatibility with TRL's expectations.
src/trackio.py CHANGED
@@ -17,7 +17,7 @@ logger = logging.getLogger(__name__)
 _monitor = None
 
 def init(
-    project_name: str,
+    project_name: Optional[str] = None,
     experiment_name: Optional[str] = None,
     **kwargs
 ) -> str:
@@ -25,7 +25,7 @@ def init(
     Initialize trackio experiment (TRL interface)
 
     Args:
-        project_name: Name of the project
+        project_name: Name of the project (optional, defaults to 'smollm3_experiment')
         experiment_name: Name of the experiment (optional)
         **kwargs: Additional configuration parameters
 
@@ -35,6 +35,10 @@ def init(
     global _monitor
 
     try:
+        # Provide default project name if not provided
+        if project_name is None:
+            project_name = os.environ.get('EXPERIMENT_NAME', 'smollm3_experiment')
+
         # Extract configuration from kwargs
         trackio_url = kwargs.get('trackio_url') or os.environ.get('TRACKIO_URL')
         trackio_token = kwargs.get('trackio_token') or os.environ.get('TRACKIO_TOKEN')
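
Review note: the fallback added to `init()` resolves the project name in a fixed order: an explicit argument first, then the `EXPERIMENT_NAME` environment variable, then the literal `'smollm3_experiment'`. The sketch below isolates only that order so it can be checked without the rest of the module. The helper name `resolve_project_name` and the constant `DEFAULT_PROJECT` are hypothetical and do not appear in the commit.

```python
import os
from typing import Optional

# Default value copied from the diff; the constant name is an assumption.
DEFAULT_PROJECT = "smollm3_experiment"


def resolve_project_name(project_name: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the fallback added to init().

    Order: explicit argument, then EXPERIMENT_NAME, then the default.
    """
    if project_name is None:
        project_name = os.environ.get("EXPERIMENT_NAME", DEFAULT_PROJECT)
    return project_name
```

One consequence of this order is that an explicit empty string is passed through unchanged, because only `None` triggers the fallback.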
src/trainer.py CHANGED
@@ -140,8 +140,8 @@ class SmolLM3Trainer:
             import trackio
             # Initialize trackio with our configuration
             experiment_id = trackio.init(
-                project_name=self.config.experiment_name,
-                experiment_name=self.config.experiment_name,
+                project_name=getattr(self.config, 'experiment_name', 'smollm3_experiment'),
+                experiment_name=getattr(self.config, 'experiment_name', 'smollm3_experiment'),
                 trackio_url=getattr(self.config, 'trackio_url', None),
                 trackio_token=getattr(self.config, 'trackio_token', None),
                 hf_token=getattr(self.config, 'hf_token', None),
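
Review note: the trainer change relies on two behaviors, `getattr` with a default for config fields that may be missing and a broad `try`/`except` so that a tracking failure does not stop training. A minimal sketch of that pattern follows, assuming the initializer is passed in as a callable so the example runs without the real `trackio` module. The name `init_tracking_or_continue` is hypothetical and is not part of the commit.

```python
import logging
from typing import Any, Callable, Optional

logger = logging.getLogger(__name__)


def init_tracking_or_continue(
    init_fn: Callable[..., str], config: Any
) -> Optional[str]:
    """Hypothetical wrapper: return an experiment ID, or None on failure."""
    try:
        experiment_id = init_fn(
            # getattr with a default tolerates configs without this field
            project_name=getattr(config, "experiment_name", "smollm3_experiment"),
            experiment_name=getattr(config, "experiment_name", "smollm3_experiment"),
        )
    except Exception as exc:  # deliberately broad: tracking is optional
        logger.warning("Failed to initialize trackio: %s", exc)
        logger.info("Continuing without trackio integration")
        return None
    logger.info("Trackio initialized with experiment ID: %s", experiment_id)
    return experiment_id
```

Returning `None` instead of re-raising lets the caller decide whether to skip later logging calls.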
tests/test_trackio_trl_fix.py CHANGED
@@ -29,14 +29,18 @@ def test_trackio_interface():
                 print(f"❌ Missing required function: {func_name}")
                 return False
 
-        # Test initialization
+        # Test initialization with arguments
         experiment_id = trackio.init(
             project_name="test_project",
             experiment_name="test_experiment",
             trackio_url="https://test.hf.space",
             dataset_repo="test/trackio-experiments"
         )
-        print(f"✅ Trackio initialization successful: {experiment_id}")
+        print(f"✅ Trackio initialization with args successful: {experiment_id}")
+
+        # Test initialization without arguments (TRL compatibility)
+        experiment_id2 = trackio.init()
+        print(f"✅ Trackio initialization without args successful: {experiment_id2}")
 
         # Test logging
         metrics = {'loss': 0.5, 'learning_rate': 1e-4}
@@ -73,6 +77,15 @@ def test_trl_compatibility():
         init_sig = inspect.signature(trackio.init)
         print(f"✅ init signature: {init_sig}")
 
+        # Test that init can be called without arguments (TRL compatibility)
+        try:
+            # This simulates what TRL might do
+            trackio.init()
+            print("✅ init() can be called without arguments")
+        except Exception as e:
+            print(f"❌ init() failed when called without arguments: {e}")
+            return False
+
         # Check log signature
         log_sig = inspect.signature(trackio.log)
         print(f"✅ log signature: {log_sig}")
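
Review note: the new test calls `trackio.init()` to prove it accepts zero arguments, which also starts a real experiment as a side effect. Since the test already builds `init_sig` with `inspect.signature`, the same property could be checked statically with `Signature.bind()`, which raises `TypeError` when a required parameter is missing. The sketch below assumes two stand-in functions, `old_init` and `new_init`, that copy the signatures from before and after this commit; the helper `accepts_no_args` is hypothetical.

```python
import inspect
from typing import Callable, Optional


def accepts_no_args(func: Callable) -> bool:
    """Return True if func() would bind with no arguments, without calling it."""
    try:
        inspect.signature(func).bind()
    except TypeError:
        return False
    return True


# Stand-ins copying the init() signatures before and after the commit.
def old_init(project_name: str, experiment_name: Optional[str] = None, **kwargs) -> str:
    return project_name


def new_init(project_name: Optional[str] = None,
             experiment_name: Optional[str] = None, **kwargs) -> str:
    return project_name or "smollm3_experiment"
```

This only checks that the call would bind; it says nothing about whether the body of `init()` succeeds, so it complements the runtime test instead of replacing it.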