---
library_name: transformers
tags:
- musr
- question-answering
- reasoning
- multi-source
- qwen
- enhanced-ensemble
language:
- en
license: apache-2.0
metrics:
- accuracy: 1.0
- confidence: 1.1167
- source_usage: 0.9972
datasets:
- allenai/qasc
---
# Model Card for ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3
## Model Details
### Model Description
This model is a heavily optimized version of Qwen-0.5B, designed specifically to excel at multi-source reasoning (MUSR). It is the third iteration of our enhanced ensemble architecture and achieves exceptional performance on the MUSR benchmark.
- **Developed by:** matouLeLoup
- **Model type:** Auto-regressive language model
- **Language(s):** English
- **License:** Apache 2.0
- **Finetuned from model:** Qwen/Qwen2-0.5B
## Training and Evaluation
### Training Data
- Base model: Qwen-0.5B
- Fine-tuning dataset: allenai/qasc
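The fine-tuning data is the public allenai/qasc dataset, which can be inspected with the `datasets` library; a minimal sketch is shown below (the exact preprocessing used for fine-tuning is not described in this card).

```python
from datasets import load_dataset

# QASC: 8-way multiple-choice science questions, each paired with two supporting facts.
qasc = load_dataset("allenai/qasc")
example = qasc["train"][0]
print(example["question"], example["choices"]["label"], example["answerKey"])
print(example["fact1"], example["fact2"])
```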
### Evaluation Results
Tested on 500 samples from the QASC validation set:
- Accuracy: 100%
- Confidence: 1.1167 (±0.0171)
- Source Usage: 99.72%
- Response Length: 170.5 words (±22.8)
- Reasoning Steps: 1.36 average
Confidence Distribution:
- Above 1.1: 95.8%
- 1.0 to 1.1: 4.2%
- Below 1.0: 0%
## Uses
### Direct Use
This model is optimized for:
- Multi-source question answering
- Logical reasoning
- Document analysis and synthesis
- Decision-support systems
- Educational applications
### How to Get Started
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Pick a device and load the fine-tuned model and tokenizer
device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained("matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3").to(device)
tokenizer = AutoTokenizer.from_pretrained("matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3")
# Optimal prompt format. The facts, question, and choices below are
# illustrative placeholders: substitute your own inputs.
fact1 = "Pesticides are poisonous to insects."
fact2 = "Insecticides are a type of pesticide."
question = "What are insecticides poisonous to?"
choices = "A) insects\nB) plants\nC) rocks\nD) water"

prompt = f"""Context:
Fact 1: {fact1}
Fact 2: {fact2}
Question: {question}
Choices:
{choices}
Instructions:
1. Analyze both facts carefully
2. Connect the information
3. Choose the letter (A-H) that best answers the question
4. Explain your reasoning
Reasoned Answer:"""
# Generation
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(
**inputs,
max_new_tokens=150,
num_beams=5,
temperature=0.6,
no_repeat_ngram_size=3
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
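# --- Illustrative addition, not part of the original card ---
# Since the model answers with a letter A-H, a simple regex can recover the
# predicted choice from the generated text.
import re

predicted = re.search(r"\b([A-H])\b", response.split("Reasoned Answer:")[-1])
print(response)
print("Predicted choice:", predicted.group(1) if predicted else None)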
```

## Training Procedure
### Training Hyperparameters
- Learning rate: 2e-5
- Batch size: 32
- Weight decay: 0.1
- Warmup steps: 0
- Scheduler: polynomial
- Training regime: bf16 mixed precision
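The full training script is not published with this card; the sketch below simply maps the reported hyperparameters onto a `transformers.TrainingArguments` object. The output directory, epoch count, and the use of `Trainer` itself are assumptions, not reported details.

```python
from transformers import TrainingArguments

# Sketch only: reported hyperparameters expressed as TrainingArguments.
training_args = TrainingArguments(
    output_dir="./prymmal-musr-v3",     # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    weight_decay=0.1,
    warmup_steps=0,
    lr_scheduler_type="polynomial",
    bf16=True,                          # bf16 mixed precision
    num_train_epochs=3,                 # placeholder, not reported
)
```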
## Evaluation Procedure
- Tested on 500 random samples from the QASC validation set
- Evaluated for accuracy, confidence, and source usage
- Detailed analysis of reasoning steps and response quality
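The exact evaluation harness is not published; the sketch below shows one way the accuracy figure could be reproduced on QASC, reusing the `model`, `tokenizer`, and `device` objects from the How to Get Started section. The `extract_letter` helper and the use of the first 500 validation examples (rather than a random sample) are assumptions, and the custom confidence and source-usage metrics are not reproduced here.

```python
import re
from datasets import load_dataset

def extract_letter(text):
    # Pull the predicted choice letter out of the generated continuation.
    match = re.search(r"\b([A-H])\b", text.split("Reasoned Answer:")[-1])
    return match.group(1) if match else None

dataset = load_dataset("allenai/qasc", split="validation").select(range(500))

correct = 0
for example in dataset:
    choices = "\n".join(
        f"{label}) {text}"
        for label, text in zip(example["choices"]["label"], example["choices"]["text"])
    )
    prompt = (
        f"Context:\nFact 1: {example['fact1']}\nFact 2: {example['fact2']}\n"
        f"Question: {example['question']}\nChoices:\n{choices}\n"
        "Instructions:\n"
        "1. Analyze both facts carefully\n"
        "2. Connect the information\n"
        "3. Choose the letter (A-H) that best answers the question\n"
        "4. Explain your reasoning\n"
        "Reasoned Answer:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    outputs = model.generate(**inputs, max_new_tokens=150, num_beams=5, no_repeat_ngram_size=3)
    prediction = extract_letter(tokenizer.decode(outputs[0], skip_special_tokens=True))
    correct += int(prediction == example["answerKey"])

print(f"Accuracy: {correct / len(dataset):.2%}")
```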
## Limitations and Bias
- Optimized specifically for the MUSR format
- Requires precisely structured prompts
- Designed for multiple-choice questions with reasoning
## Technical Specifications
- Base model: Qwen-0.5B
- Enhanced with optimized generation parameters
- Uses letter-based answer format (A-H)
### Generation Config
```python
generation_config = {
    "max_new_tokens": 150,
    "num_beams": 5,
    "temperature": 0.6,
    "do_sample": False,
    "length_penalty": 1.0,
    "no_repeat_ngram_size": 3
}
```
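If you prefer a typed object over a plain dictionary, the same settings can be wrapped in a `transformers.GenerationConfig` and passed to `generate` (a minimal sketch, not part of the original card). Note that with `do_sample=False` beam search is deterministic, so the `temperature` value has no effect on the output.

```python
from transformers import GenerationConfig

# Wrap the recommended settings and reuse them across generate() calls.
gen_config = GenerationConfig(**generation_config)
outputs = model.generate(**inputs, generation_config=gen_config)
```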
## Citation
```bibtex
@misc{PRYMMAL-EnhancedMUSREnsembleV3,
  author = {matouLeLoup},
  title = {ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3},
  year = {2024},
  publisher = {Hugging Face},
  journal = {Hugging Face Hub},
  howpublished = {\url{https://huggingface.co/matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3}}
}
```