# DiffRhythm Project Guide

## Commands
- Run web interface: `python app.py`
- Install dependencies: `pip install -r requirements.txt`

## Code Style Guidelines
- **Imports:** Standard lib first, third-party second, local imports last
- **Type Annotations:** Use tensor dimension annotation style
  - `float["b n d"]` where b=batch, n=sequence length, d=dimension
- **Naming:** Functions/variables: `snake_case`, Classes: `PascalCase`, Constants: `UPPER_CASE`
- **Utils:** Use existing helper functions (`exists()`, `default()`)
- **Error Handling:** Wrap risky calls in try/except, catch specific exception types, and raise descriptive error messages
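
The guidelines above can be sketched in one short module. This is an illustration, not code from the repository: the bodies of `exists()` and `default()` assume the common "not None" convention, `DEFAULT_SAMPLE_STEPS`, `scale_features` and `load_step_count` are hypothetical names, nested lists stand in for tensors, and the dimension annotations are quoted so that Python does not try to evaluate them.

```python
# Standard library imports come first; third-party and local imports would follow.
from typing import Optional, TypeVar

T = TypeVar("T")

DEFAULT_SAMPLE_STEPS = 32  # hypothetical constant, shown only for UPPER_CASE naming


def exists(value: object) -> bool:
    """Return True when value is not None (assumed behavior of the helper)."""
    return value is not None


def default(value: Optional[T], fallback: T) -> T:
    """Return value unless it is None, in which case return fallback."""
    return value if exists(value) else fallback


def scale_features(
    features: 'float["b n d"]',
    scale: Optional[float] = None,
) -> 'float["b n d"]':
    """Multiply every element of a nested [batch][sequence][dimension] list."""
    scale = default(scale, 1.0)
    return [[[x * scale for x in frame] for frame in seq] for seq in features]


def load_step_count(raw_value: Optional[str]) -> int:
    """Parse a step count, raising a specific error on bad input."""
    text = default(raw_value, str(DEFAULT_SAMPLE_STEPS))
    try:
        return int(text)
    except ValueError as err:
        raise ValueError(f"step count must be an integer, got {text!r}") from err
```

Note that `default(0, 3)` returns `0`, because the check is for `None` rather than truthiness.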

## Project Structure
- `app.py` - Main entry point with Gradio web interface
- `diffrhythm/` - Core library components:
  - `config/` - Model configs in JSON/INI format
  - `g2p/` - Grapheme-to-phoneme for multiple languages
  - `infer/` - Inference code for music generation
  - `model/` - DiT and CFM model architecture

## Architecture Notes
- Models take lyrics with timestamps (LRC) as input
- Uses audio/text style prompts to guide music generation
- Generates music with diffusion models, keeping the output aligned with the lyrics
- Maintain dimension consistency when working with tensors
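
For readers unfamiliar with LRC, the timestamped lyric input mentioned above consists of lines such as `[00:10.00]Hello`. The following is a minimal sketch of reading that format into `(seconds, lyric)` pairs. It is not the project's own parser: `parse_lrc` and `LRC_LINE` are hypothetical names, and the sketch assumes one `[mm:ss.xx]` timestamp per line.

```python
import re
from typing import List, Tuple

# Matches "[mm:ss]" or "[mm:ss.xx]" followed by the lyric text.
LRC_LINE = re.compile(r"^\[(\d+):(\d+(?:\.\d+)?)\](.*)$")


def parse_lrc(lrc_text: str) -> List[Tuple[float, str]]:
    """Convert LRC text into a list of (start time in seconds, lyric) pairs."""
    entries: List[Tuple[float, str]] = []
    for line in lrc_text.splitlines():
        match = LRC_LINE.match(line.strip())
        if match is None:
            # Skip blank lines and non-timestamp tags such as [ar:Artist].
            continue
        minutes, seconds, lyric = match.groups()
        entries.append((int(minutes) * 60 + float(seconds), lyric.strip()))
    return entries
```

Calling `parse_lrc("[00:10.00]Hello\n[01:05.50]World")` yields two entries, one per timestamped line.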