littleworth committed
Commit: b7edfc6
Parent(s): 4524379
Update README.md

README.md CHANGED
@@ -9,7 +9,7 @@ tags:
 
 
 ### Model Description
-This model card describes the distilled version of ProtGPT2, referred to as `protgpt2-distilled-tiny`. The distillation process for this model follows the methodology of knowledge distillation from a larger teacher model to a smaller, more efficient student model. The process combines both "Soft Loss" (Knowledge Distillation Loss) and "Hard Loss" (Cross-Entropy Loss) to ensure the student model not only generalizes like its teacher but also retains practical prediction capabilities.
+This model card describes the distilled version of [ProtGPT2](https://huggingface.co/nferruz/ProtGPT2), referred to as `protgpt2-distilled-tiny`. The distillation process for this model follows the methodology of knowledge distillation from a larger teacher model to a smaller, more efficient student model. The process combines both "Soft Loss" (Knowledge Distillation Loss) and "Hard Loss" (Cross-Entropy Loss) to ensure the student model not only generalizes like its teacher but also retains practical prediction capabilities.
 
 ### Technical Details
 **Distillation Parameters:**
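
The updated paragraph describes the training objective as a combination of a soft (teacher-matching) term and a hard (ground-truth label) term. A minimal PyTorch sketch of that combined loss, assuming the standard knowledge-distillation formulation; the function name and the `temperature` and `alpha` hyperparameters are illustrative placeholders, not values taken from this commit:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Combine the "Soft Loss" (KL divergence against the teacher's softened
    distribution) with the "Hard Loss" (cross-entropy against true labels)."""
    # Soft loss: KL divergence between softened student and teacher
    # distributions, scaled by T^2 as is conventional in distillation.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # Hard loss: standard cross-entropy on the ground-truth token labels.
    hard_loss = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)),
        labels.view(-1),
    )

    # Weighted combination of the two terms.
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

The `temperature**2` factor keeps the gradient magnitude of the soft term comparable to the hard term when logits are softened, which is the usual convention in knowledge distillation.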