adds config attribute for trl compatibility
- docs/TRACKIO_TRL_FIX.md +42 -24
- docs/TRACKIO_TRL_FIX_SUMMARY.md +29 -123
- src/trackio.py +16 -1
- tests/test_trackio_trl_fix.py +10 -0
- trackio.py +4 -2
docs/TRACKIO_TRL_FIX.md
CHANGED
@@ -1,34 +1,43 @@
|
|
1 |
# Trackio TRL Compatibility Fix
|
2 |
|
3 |
-
## Problem
|
4 |
|
5 |
-
The
|
6 |
-
|
7 |
-
|
8 |
-
|
9 |
-
|
10 |
-
This error occurred because the TRL library (specifically SFTTrainer) expects a `trackio` module with specific functions:
|
11 |
-
- `init()` - Initialize experiment
|
12 |
-
- `log()` - Log metrics
|
13 |
-
- `finish()` - Finish experiment
|
14 |
|
15 |
-
|
16 |
|
17 |
## Solution Implementation
|
18 |
|
19 |
### 1. Created Trackio Module Interface (`src/trackio.py`)
|
20 |
|
21 |
-
Created a
|
22 |
|
23 |
```python
|
 def init(project_name: Optional[str] = None, experiment_name: Optional[str] = None, **kwargs) -> str:
     """Initialize trackio experiment (TRL interface)"""
-
 def log(metrics: Dict[str, Any], step: Optional[int] = None, **kwargs):
     """Log metrics to trackio (TRL interface)"""
-
 def finish():
     """Finish trackio experiment (TRL interface)"""
32 |
```
|
33 |
|
34 |
**Key Feature**: The `init()` function can be called without any arguments, making it compatible with TRL's expectations. It will use environment variables or defaults when no arguments are provided.
|
@@ -91,27 +100,36 @@ The trackio module integrates seamlessly with our existing monitoring system:
|
|
91 |
- Maintains all existing features (HF Datasets, Trackio Space, etc.)
|
92 |
- Graceful fallback when Trackio Space is not accessible
|
93 |
|
94 |
-
## Testing
|
|
|
|
|
|
|
|
|
95 |
|
96 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
97 |
|
98 |
-
|
99 |
-
2. **TRL Compatibility**: Function signatures match expectations
|
100 |
-
3. **Monitoring Integration**: Works with our custom monitoring system
|
101 |
|
102 |
-
Test results:
|
103 |
```
|
 ✅ Successfully imported trackio module
 ✅ Found required function: init
 ✅ Found required function: log
 ✅ Found required function: finish
-✅ Trackio initialization with args successful
-✅ Trackio initialization without args successful
 ✅ Trackio logging successful
 ✅ Trackio finish successful
 ✅ init() can be called without arguments
-✅
-✅
|
|
|
|
|
115 |
```
|
116 |
|
117 |
## Benefits
|
|
|
1 |
# Trackio TRL Compatibility Fix
|
2 |
|
3 |
+
## Problem Analysis
|
4 |
|
5 |
+
The TRL library (specifically SFTTrainer) expects a `trackio` module with the following interface:
|
6 |
+
- `trackio.init()` - Initialize experiment tracking
|
7 |
+
- `trackio.log()` - Log metrics during training
|
8 |
+
- `trackio.finish()` - Finish experiment tracking
|
9 |
+
- `trackio.config` - Access configuration (additional requirement discovered)
|
|
|
|
|
|
|
|
|
10 |
|
11 |
+
Our custom monitoring system didn't provide this interface, causing the training to fail.
|
12 |
|
13 |
## Solution Implementation
|
14 |
|
15 |
### 1. Created Trackio Module Interface (`src/trackio.py`)
|
16 |
|
17 |
+
Created a new module that provides the exact interface expected by TRL:
|
18 |
|
19 |
```python
|
 def init(project_name: Optional[str] = None, experiment_name: Optional[str] = None, **kwargs) -> str:
     """Initialize trackio experiment (TRL interface)"""
+    # Implementation that routes to our SmolLM3Monitor
+
 def log(metrics: Dict[str, Any], step: Optional[int] = None, **kwargs):
     """Log metrics to trackio (TRL interface)"""
+    # Implementation that routes to our SmolLM3Monitor
+
 def finish():
     """Finish trackio experiment (TRL interface)"""
+    # Implementation that routes to our SmolLM3Monitor
+
+# Added config attribute for TRL compatibility
+class TrackioConfig:
+    """Configuration class for trackio (TRL compatibility)"""
+    def __init__(self):
+        self.project_name = os.environ.get('EXPERIMENT_NAME', 'smollm3_experiment')
+        self.experiment_name = os.environ.get('EXPERIMENT_NAME', 'smollm3_experiment')
+        # ... other config properties
+
+config = TrackioConfig()
|
41 |
```
|
42 |
|
43 |
**Key Feature**: The `init()` function can be called without any arguments, making it compatible with TRL's expectations. It will use environment variables or defaults when no arguments are provided.
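A minimal sketch of the fallback behavior described above, assuming missing names are filled from the `EXPERIMENT_NAME` environment variable and the `smollm3_experiment` default that appear elsewhere in this diff. The helper name `resolve_names` is hypothetical and is not part of `src/trackio.py`; the real `init()` may resolve its arguments differently.

```python
import os
from typing import Optional, Tuple

DEFAULT_NAME = "smollm3_experiment"  # default value shown in this diff


def resolve_names(project_name: Optional[str] = None,
                  experiment_name: Optional[str] = None) -> Tuple[str, str]:
    """Hypothetical helper: fill missing names from EXPERIMENT_NAME or the default."""
    fallback = os.environ.get("EXPERIMENT_NAME", DEFAULT_NAME)
    # `or` treats None and the empty string as "not provided".
    return project_name or fallback, experiment_name or fallback
```

With this shape, `resolve_names()` and `resolve_names(project_name="test")` both succeed, which is the property the argument-less `trackio.init()` call depends on.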
|
|
|
100 |
- Maintains all existing features (HF Datasets, Trackio Space, etc.)
|
101 |
- Graceful fallback when Trackio Space is not accessible
|
102 |
|
103 |
+
## Testing and Verification
|
104 |
+
|
105 |
+
### Test Script: `tests/test_trackio_trl_fix.py`
|
106 |
+
|
107 |
+
The test script verifies:
|
108 |
|
109 |
+
1. **Module Import**: `import trackio` works correctly
|
110 |
+
2. **Function Availability**: All required functions (`init`, `log`, `finish`) exist
|
111 |
+
3. **Function Signatures**: Functions have the correct signatures expected by TRL
|
112 |
+
4. **Initialization**: `trackio.init()` can be called with and without arguments
|
113 |
+
5. **Configuration Access**: `trackio.config` is available and accessible
|
114 |
+
6. **Logging**: Metrics can be logged successfully
|
115 |
+
7. **Cleanup**: Experiments can be finished properly
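The checks listed above can be sketched roughly as follows. This is not the contents of `tests/test_trackio_trl_fix.py`; the function `check_interface` and the stand-in module are hypothetical, and the sketch runs against a stand-in built with `types.ModuleType` so that it does not depend on the real `trackio` module.

```python
import inspect
import types

REQUIRED_FUNCTIONS = ("init", "log", "finish")


def check_interface(module) -> list:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for name in REQUIRED_FUNCTIONS:
        if not callable(getattr(module, name, None)):
            problems.append(f"missing function: {name}")

    init = getattr(module, "init", None)
    if callable(init):
        # init() must be callable with no arguments, so every parameter
        # needs a default unless it is *args or **kwargs.
        variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
        for param in inspect.signature(init).parameters.values():
            if param.default is inspect.Parameter.empty and param.kind not in variadic:
                problems.append(f"init() requires argument: {param.name}")

    if not hasattr(module, "config"):
        problems.append("missing attribute: config")
    return problems


# Stand-in module that mimics the interface described in this document.
stand_in = types.ModuleType("stand_in_trackio")
stand_in.init = lambda project_name=None, experiment_name=None, **kwargs: "exp"
stand_in.log = lambda metrics, step=None, **kwargs: None
stand_in.finish = lambda: None
stand_in.config = types.SimpleNamespace(project_name="smollm3_experiment")

print(check_interface(stand_in))
```

Returning a list of problems instead of stopping at the first failure makes it easier to see all missing pieces of the interface in one run.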
|
116 |
|
117 |
+
### Test Results
|
|
|
|
|
118 |
|
|
|
119 |
```
|
 ✅ Successfully imported trackio module
 ✅ Found required function: init
 ✅ Found required function: log
 ✅ Found required function: finish
+✅ Trackio initialization with args successful: trl_20250727_135621
+✅ Trackio initialization without args successful: trl_20250727_135621
 ✅ Trackio logging successful
 ✅ Trackio finish successful
 ✅ init() can be called without arguments
+✅ trackio.config is available: <class 'src.trackio.TrackioConfig'>
+✅ config.project_name: smollm3_experiment
+✅ config.experiment_name: smollm3_experiment
+✅ All tests passed! Trackio TRL fix is working correctly.
|
133 |
```
|
134 |
|
135 |
## Benefits
|
docs/TRACKIO_TRL_FIX_SUMMARY.md
CHANGED
@@ -1,135 +1,41 @@
|
|
1 |
# Trackio TRL Fix - Complete Solution Summary
|
2 |
|
3 |
-
##
|
4 |
|
5 |
-
|
6 |
|
7 |
-
|
8 |
-
2. **Secondary Error**: `ERROR:train:Training failed: init() missing 1 required positional argument: 'project_name'`
|
9 |
|
10 |
-
|
|
|
|
|
11 |
|
12 |
-
|
13 |
-
- `init()` - Initialize experiment
|
14 |
-
- `log()` - Log metrics
|
15 |
-
- `finish()` - Finish experiment
|
16 |
|
17 |
-
|
|
|
|
|
|
|
18 |
|
19 |
-
|
|
|
|
|
20 |
|
21 |
-
|
|
|
|
|
|
|
22 |
|
23 |
-
|
24 |
-
|
25 |
-
|
26 |
-
|
27 |
-
|
28 |
-
project_name = os.environ.get('EXPERIMENT_NAME', 'smollm3_experiment')
|
29 |
-
# ... rest of implementation
|
30 |
-
```
|
31 |
|
32 |
-
**Key Features
|
-- ✅ Can be called without arguments (`trackio.init()`)
-- ✅ Uses environment variables for defaults
-- ✅ Maintains backward compatibility with argument-based calls
-- ✅ Integrates with our existing `SmolLM3Monitor` system
|
37 |
|
38 |
-
|
39 |
-
|
40 |
-
|
41 |
-
|
42 |
-
|
-from src.trackio import (
-    init, log, finish, log_config, log_checkpoint,
-    log_evaluation_results, get_experiment_url, is_available, get_monitor
-)
|
47 |
-
```
|
48 |
-
|
49 |
-
### 3. Updated Trainer Integration (`src/trainer.py`)
|
50 |
-
|
51 |
-
Enhanced trainer to properly initialize trackio with fallback handling:
|
52 |
-
|
53 |
-
```python
|
-# Initialize trackio for TRL compatibility
-try:
-    import trackio
-    experiment_id = trackio.init(
-        project_name=getattr(self.config, 'experiment_name', 'smollm3_experiment'),
-        experiment_name=getattr(self.config, 'experiment_name', 'smollm3_experiment'),
-        trackio_url=getattr(self.config, 'trackio_url', None),
-        trackio_token=getattr(self.config, 'trackio_token', None),
-        hf_token=getattr(self.config, 'hf_token', None),
-        dataset_repo=getattr(self.config, 'dataset_repo', None)
-    )
-    logger.info(f"Trackio initialized with experiment ID: {experiment_id}")
-except Exception as e:
-    logger.warning(f"Failed to initialize trackio: {e}")
-    logger.info("Continuing without trackio integration")
|
69 |
-
```
|
70 |
-
|
71 |
-
### 4. Comprehensive Testing
|
72 |
-
|
73 |
-
Created test suite that verifies:
|
-- ✅ Function availability (`init`, `log`, `finish`)
-- ✅ Argument-less calls (`trackio.init()`)
-- ✅ Argument-based calls (`trackio.init(project_name="test")`)
-- ✅ TRL compatibility
-- ✅ Monitoring integration
|
79 |
-
|
80 |
-
## Test Results
|
81 |
-
|
82 |
-
```
|
-✅ Successfully imported trackio module
-✅ Found required function: init
-✅ Found required function: log
-✅ Found required function: finish
-✅ Trackio initialization with args successful
-✅ Trackio initialization without args successful
-✅ Trackio logging successful
-✅ Trackio finish successful
-✅ init() can be called without arguments
-✅ TRL compatibility test passed
-✅ Monitor integration working
|
94 |
-
```
|
95 |
-
|
96 |
-
## Benefits Achieved
|
97 |
-
|
-1. **✅ Resolves Both Errors**: Fixes both the missing attribute and missing argument errors
-2. **✅ TRL Compatibility**: SFTTrainer can now use trackio for logging
-3. **✅ Flexible Initialization**: Supports both argument-based and environment-based configuration
-4. **✅ Graceful Fallback**: Continues training even if trackio initialization fails
-5. **✅ Maintains Functionality**: All existing monitoring features continue to work
-6. **✅ Future-Proof**: Easy to extend with additional TRL-compatible functions
|
104 |
-
|
105 |
-
## Files Modified
|
106 |
-
|
107 |
-
- `src/trackio.py` - New trackio module interface with optional arguments
|
108 |
-
- `trackio.py` - Global trackio module for TRL
|
109 |
-
- `src/trainer.py` - Updated trainer integration with robust error handling
|
110 |
-
- `src/__init__.py` - Package exports
|
111 |
-
- `tests/test_trackio_trl_fix.py` - Comprehensive test suite
|
112 |
-
- `docs/TRACKIO_TRL_FIX.md` - Detailed documentation
|
113 |
-
|
114 |
-
## Usage
|
115 |
-
|
116 |
-
The fix is transparent to users. Training will now work with SFTTrainer and automatically:
|
117 |
-
|
118 |
-
1. Initialize trackio when SFTTrainer is created (with or without arguments)
|
119 |
-
2. Log metrics during training
|
120 |
-
3. Finish the experiment when training completes
|
121 |
-
4. Fall back gracefully if trackio is not available
|
122 |
-
|
123 |
-
## Verification
|
124 |
-
|
125 |
-
To verify the fix works:
|
126 |
-
|
127 |
-
```bash
|
128 |
-
python tests/test_trackio_trl_fix.py
|
129 |
-
```
|
130 |
-
|
131 |
-
This should show all tests passing and confirm that the trackio module provides the interface expected by TRL library, including support for argument-less calls.
|
132 |
-
|
133 |
-
## Next Steps
|
134 |
-
|
135 |
-
The training should now proceed successfully without the trackio errors. The SFTTrainer will be able to use our custom monitoring system for logging metrics and experiment tracking, with full compatibility with TRL's expectations.
|
|
|
1 |
# Trackio TRL Fix - Complete Solution Summary
|
2 |
|
3 |
+
## ✅ Complete Solution Summary
|
4 |
|
5 |
+
I have successfully implemented a comprehensive fix for the Trackio TRL compatibility issues. Here's what was accomplished:
|
6 |
|
7 |
+
### 🔧 **Problems Solved**
|
|
|
8 |
|
9 |
+
1. **`module 'trackio' has no attribute 'init'`** - TRL expected trackio.init() function
|
10 |
+
2. **`init() missing 1 required positional argument: 'project_name'`** - TRL called init() without arguments
|
11 |
+
3. **`module 'trackio' has no attribute 'config'`** - TRL expected trackio.config attribute
|
12 |
|
13 |
+
### 🛠️ **Solution Components**
|
|
|
|
|
|
|
14 |
|
15 |
+
#### 1. **Trackio Module Interface** (`src/trackio.py`)
|
16 |
+
- Created `init()`, `log()`, `finish()` functions expected by TRL
|
17 |
+
- Added `TrackioConfig` class with `config` attribute
|
18 |
+
- Routes all calls to our custom `SmolLM3Monitor`
|
19 |
|
20 |
+
#### 2. **Global Module Access** (`trackio.py`)
|
21 |
+
- Root-level module that imports from `src.trackio`
|
22 |
+
- Makes functions globally available for TRL import
|
23 |
|
24 |
+
#### 3. **Enhanced Trainer Integration** (`src/trainer.py`)
|
25 |
+
- Explicit trackio initialization before SFTTrainer creation
|
26 |
+
- Proper cleanup with trackio.finish() calls
|
27 |
+
- Robust error handling and fallbacks
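The lifecycle described in this component (initialize before training, always attempt cleanup, never let tracking failures stop training) might look roughly like the sketch below. The wrapper `run_with_tracking` and its `tracker` and `train_fn` parameters are hypothetical; the actual code in `src/trainer.py` is not shown in this diff and may be organized differently.

```python
import logging

logger = logging.getLogger("train")


def run_with_tracking(tracker, train_fn):
    """Hypothetical wrapper: init before training, finish afterwards,
    and keep training alive if tracking fails."""
    tracking_active = False
    try:
        tracker.init()  # argument-less call, the form TRL is reported to use
        tracking_active = True
    except Exception as exc:
        logger.warning("Failed to initialize trackio: %s", exc)
        logger.info("Continuing without trackio integration")

    try:
        return train_fn()
    finally:
        # Cleanup runs whether training returned normally or raised.
        if tracking_active:
            try:
                tracker.finish()
            except Exception as exc:
                logger.warning("Failed to finish trackio run: %s", exc)
```

Placing `finish()` in a `finally` block means the experiment is closed even when training raises, while the flag avoids calling `finish()` on a tracker that never initialized.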
|
28 |
|
29 |
+
#### 4. **Comprehensive Testing** (`tests/test_trackio_trl_fix.py`)
|
30 |
+
- Verifies all required functions exist and work
|
31 |
+
- Tests both argument and no-argument initialization
|
32 |
+
- Confirms config attribute accessibility
|
33 |
+
- Validates monitoring integration
|
|
|
|
|
|
|
34 |
|
35 |
+
### 🎯 **Key Features**
|
|
|
|
|
|
|
|
|
36 |
|
37 |
+
- **TRL Compatibility**: Full interface compatibility with TRL library expectations
|
38 |
+
- **Flexible Initialization**: Supports both argument and no-argument init() calls
|
39 |
+
- **Configuration Access**: Provides trackio.config attribute as expected
|
40 |
+
- **Error Resilience**: Graceful fallbacks when external services unavailable
|
41 |
+
- **Monitoring Integration**: Seamless integration with our custom monitoring system
|
src/trackio.py
CHANGED
@@ -200,4 +200,19 @@ def is_available() -> bool:
 
 def get_monitor():
     """Get the current monitor instance (for advanced usage)"""
-    return _monitor
+    return _monitor
+
+# Add config attribute for TRL compatibility
+class TrackioConfig:
+    """Configuration class for trackio (TRL compatibility)"""
+
+    def __init__(self):
+        self.project_name = os.environ.get('EXPERIMENT_NAME', 'smollm3_experiment')
+        self.experiment_name = os.environ.get('EXPERIMENT_NAME', 'smollm3_experiment')
+        self.trackio_url = os.environ.get('TRACKIO_URL')
+        self.trackio_token = os.environ.get('TRACKIO_TOKEN')
+        self.hf_token = os.environ.get('HF_TOKEN')
+        self.dataset_repo = os.environ.get('TRACKIO_DATASET_REPO', 'tonic/trackio-experiments')
+
+# Create config instance
+config = TrackioConfig()
|
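One consequence of creating `config` at module level is that its values are read from the environment once, when the module is imported. The sketch below illustrates this with a trimmed copy of `TrackioConfig` (two fields only); the function `snapshot_demo` is hypothetical and exists only for this illustration.

```python
import os


class TrackioConfig:
    """Trimmed copy of the class in this diff (two fields only)."""

    def __init__(self):
        self.experiment_name = os.environ.get('EXPERIMENT_NAME', 'smollm3_experiment')
        self.dataset_repo = os.environ.get('TRACKIO_DATASET_REPO', 'tonic/trackio-experiments')


def snapshot_demo():
    """Show that an existing instance does not follow later environment changes."""
    saved = os.environ.pop('EXPERIMENT_NAME', None)
    try:
        early = TrackioConfig()  # plays the role of the module-level `config`
        os.environ['EXPERIMENT_NAME'] = 'late_change'
        late = TrackioConfig()   # a fresh instance re-reads the environment
        return early.experiment_name, late.experiment_name
    finally:
        # Restore the caller's environment.
        if saved is None:
            os.environ.pop('EXPERIMENT_NAME', None)
        else:
            os.environ['EXPERIMENT_NAME'] = saved
```

Under these assumptions, environment variables such as `EXPERIMENT_NAME` need to be set before `trackio` is first imported for `trackio.config` to reflect them.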
tests/test_trackio_trl_fix.py
CHANGED
@@ -86,6 +86,16 @@ def test_trl_compatibility():
         print(f"❌ init() failed when called without arguments: {e}")
         return False
 
+    # Test that config attribute is available (TRL compatibility)
+    try:
+        config = trackio.config
+        print(f"✅ trackio.config is available: {type(config)}")
+        print(f"✅ config.project_name: {config.project_name}")
+        print(f"✅ config.experiment_name: {config.experiment_name}")
+    except Exception as e:
+        print(f"❌ trackio.config failed: {e}")
+        return False
+
     # Check log signature
     log_sig = inspect.signature(trackio.log)
     print(f"✅ log signature: {log_sig}")
|
trackio.py
CHANGED
@@ -13,7 +13,8 @@ from src.trackio import (
     log_evaluation_results,
     get_experiment_url,
     is_available,
-    get_monitor
+    get_monitor,
+    config
 )
 
 # Make all functions available at module level
@@ -26,5 +27,6 @@ __all__ = [
     'log_evaluation_results',
     'get_experiment_url',
     'is_available',
-    'get_monitor'
+    'get_monitor',
+    'config'
 ]