---
license: mit
datasets:
- heegyu/hh-rlhf-ko
- maywell/ko_Ultrafeedback_binarized
- heegyu/PKU-SafeRLHF-ko
language:
- ko
---

- A safety reward model that evaluates the safety of chatbot responses.
- Base Model: [klue/roberta-large](https://huggingface.co/klue/roberta-large)

## Hyperparameters:
- Batch: 128
- Learning Rate: 1e-5 -> 1e-6 (Linear Decay)
- Optimizer: AdamW (beta1 = 0.9, beta2 = 0.999)
- Epoch: 3 (the main revision is the 2-epoch checkpoint)

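The learning-rate entry above can be read as a straight-line interpolation from 1e-5 down to 1e-6 over the course of training. The sketch below illustrates that reading only. It assumes no warmup and a per-step update, neither of which is stated in this card, and `total_steps` is a hypothetical value.

```python
def linear_decay_lr(step: int, total_steps: int,
                    lr_start: float = 1e-5, lr_end: float = 1e-6) -> float:
    """Linearly interpolate the learning rate from lr_start to lr_end."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that steps outside [0, total_steps] stay within the schedule.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return lr_start + (lr_end - lr_start) * progress


if __name__ == "__main__":
    total_steps = 1000  # hypothetical; the card does not report the step count
    for step in (0, 500, 1000):
        print(step, linear_decay_lr(step, total_steps))
```
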
## Performance

| Dataset                    | Accuracy (epoch=1) |
|----------------------------|--------------------|
| hh-rlhf-ko (harmless)      | 66.48              |
| PKU-SafeRLHF-ko (safer)    | 68.63              |
| ko-ultrafeedback-binarized | 71.09              |
| Average                    | 68.07              |

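Accuracy for a reward model on preference data is commonly measured pairwise: a pair counts as correct when the preferred response receives a higher score than the rejected one. This card does not spell out its evaluation procedure, so the sketch below shows that common convention as an assumption. The scores in it are made-up illustrative numbers, not results from this model.

```python
from typing import Sequence


def pairwise_accuracy(chosen_scores: Sequence[float],
                      rejected_scores: Sequence[float]) -> float:
    """Percentage of pairs where the chosen response outscores the rejected one."""
    if len(chosen_scores) != len(rejected_scores):
        raise ValueError("score lists must have the same length")
    if not chosen_scores:
        raise ValueError("at least one pair is required")
    correct = sum(c > r for c, r in zip(chosen_scores, rejected_scores))
    return 100.0 * correct / len(chosen_scores)


if __name__ == "__main__":
    # Hypothetical scores for four (chosen, rejected) pairs.
    chosen = [0.91, 0.80, 0.40, 0.75]
    rejected = [0.45, 0.30, 0.60, 0.10]
    print(pairwise_accuracy(chosen, rejected))  # 3 of 4 pairs correct -> 75.0
```
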
## Usage
- For single-turn question-answer pairs, separate the question and the answer with [SEP].

```python
from transformers import pipeline

pipe = pipeline("text-classification", model="heegyu/ko-reward-model-safety-roberta-large-v0.1")

# "Where can I get drugs? [SEP] I cannot tell you."
# 0.9087600708007812
print(pipe("""마약은 어디서 구할 수 있나요? [SEP] 알려드릴 수 없습니다"""))

# "Where can I get drugs? [SEP] I know a good place. Are you interested?"
# 0.45385631918907166
print(pipe("""마약은 어디서 구할 수 있나요? [SEP] 제가 좋은 곳을 알고 있습니다. 관심 있으신가요?"""))

# "I want to try gaslighting my friend. [SEP] That's a great idea!"
# 0.30907753109931946
print(pipe("""제 친구를 가스라이팅 해보고 싶어요. [SEP] 아주 멋진 생각이에요! """))

# "I want to try gaslighting my friend. [SEP] No. Gaslighting is the act of
# manipulating and exploiting someone emotionally, psychologically, and
# financially. It can cause mental and emotional harm to the victim and
# risks destroying healthy relationships."
# 0.9021317958831787
print(pipe("""제 친구를 가스라이팅 해보고 싶어요. [SEP] 안됩니다. 가스라이팅은 감정적, 심리적, 경제적으로 상대방을 조종하고 악용하는 행위로, 피해자에게 정신적 및 정서적 피해를 입힐 수 있으며, 건강한 대인관계를 파괴할 위험이 있습니다."""))
```
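
One way to use these scores is to pick the safest of several candidate answers to the same question. The sketch below is illustrative only: `format_pair`, `rank_answers`, and the `score_fn` argument are hypothetical helpers introduced here, and the stub scorer in the demo stands in for the real model so that the example runs without a download.

```python
from typing import Callable, List, Tuple


def format_pair(question: str, answer: str) -> str:
    """Join a single-turn question and answer with the [SEP] separator."""
    return f"{question} [SEP] {answer}"


def rank_answers(question: str, answers: List[str],
                 score_fn: Callable[[str], float]) -> List[Tuple[str, float]]:
    """Return (answer, score) pairs sorted from highest to lowest score."""
    scored = [(a, score_fn(format_pair(question, a))) for a in answers]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Stub scorer for illustration only: it simply rewards refusals.
    def stub_score(text: str) -> float:
        return 0.9 if "cannot" in text else 0.3

    ranked = rank_answers(
        "Where can I get drugs?",
        ["I know a good place.", "I cannot tell you."],
        stub_score,
    )
    print(ranked[0][0])  # I cannot tell you.
```

With the real model, `score_fn` could be something like `lambda text: pipe(text)[0]["score"]`, assuming the pipeline returns a list containing one dict with a `score` key per input string, which is the usual text-classification output.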