---
library_name: transformers
tags:
- musr
- question-answering
- reasoning
- multi-source
- qwen
- enhanced-ensemble
language:
- en
license: apache-2.0
metrics:
- accuracy: 1.0
- confidence: 1.1167
- source_usage: 0.9972
datasets:
- allenai/qasc
---

# Model Card for ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3

## Model Details

### Model Description

This model is a heavily optimized version of Qwen-0.5B, designed specifically to excel at multi-source reasoning (MUSR). It is the third version of our enhanced ensemble architecture and achieves strong results on the MUSR benchmark.

- **Developed by:** matouLeLoup
- **Model type:** Auto-regressive language model
- **Language(s):** English
- **License:** Apache 2.0
- **Finetuned from model:** Qwen/Qwen2-0.5B

## Training and Evaluation

### Training Data

- Base model: Qwen-0.5B
- Fine-tuning dataset: allenai/qasc

### Evaluation Results

Tested on 500 samples from the QASC validation set:

- Accuracy: 100%
- Confidence: 1.1167 (±0.0171)
- Source usage: 99.72%
- Response length: 170.5 words (±22.8)
- Reasoning steps: 1.36 on average

Confidence distribution:

- Above 1.1: 95.8%
- 1.0 to 1.1: 4.2%
- Below 1.0: 0%

## Uses

### Direct Use

This model is optimized for:

- Multi-source question answering
- Logical reasoning
- Document analysis and synthesis
- Decision-support systems
- Educational applications

### How to Get Started

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3")
tokenizer = AutoTokenizer.from_pretrained("matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3")

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Fill these in from your data (e.g. one QASC example)
fact1 = "..."
fact2 = "..."
question = "..."
choices = "..."

# Recommended prompt format
prompt = f"""Context:
Fact 1: {fact1}
Fact 2: {fact2}

Question: {question}

Choices:
{choices}

Instructions:
1. Analyze both facts carefully
2. Connect the information
3. Choose the letter (A-H) that best answers the question
4. Explain your reasoning

Reasoned Answer:"""

# Generation
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    num_beams=5,
    temperature=0.6,
    no_repeat_ngram_size=3
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```

## Training Details

### Training Procedure

Training hyperparameters:

- Learning rate: 2e-5
- Batch size: 32
- Weight decay: 0.1
- Warmup steps: 0
- Scheduler: polynomial
- Training regime: bf16 mixed precision

A hedged fine-tuning sketch based on these hyperparameters is provided at the end of this card.

## Evaluation Procedure

- Tested on 500 random samples from the QASC validation set
- Evaluated for accuracy, confidence, and source usage
- Detailed analysis of reasoning steps and response quality

A hedged evaluation sketch is provided at the end of this card.

## Limitations and Bias

- Optimized specifically for the MUSR format
- Requires precisely structured prompts
- Designed for multiple-choice questions with reasoning

## Technical Specifications

- Base model: Qwen-0.5B
- Enhanced with optimized generation parameters
- Uses a letter-based answer format (A-H)

### Generation Config

```python
generation_config = {
    "max_new_tokens": 150,
    "num_beams": 5,
    "temperature": 0.6,
    "do_sample": False,
    "length_penalty": 1.0,
    "no_repeat_ngram_size": 3
}
```

## Citation

```bibtex
@misc{PRYMMAL-EnhancedMUSREnsembleV3,
  author = {matouLeLoup},
  title = {ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3},
  year = {2024},
  publisher = {Hugging Face},
  journal = {Hugging Face Hub},
  howpublished = {\url{https://huggingface.co/matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3}}
}
```
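The hyperparameters listed under Training Details can be translated into a standard `transformers` `Trainer` setup. The snippet below is a minimal sketch, not the original training script: the prompt layout, sequence length, epoch count, and output directory are assumptions, and the QASC column names (`fact1`, `fact2`, `question`, `answerKey`) follow the public dataset schema.

```python
# Minimal fine-tuning sketch (assumption: not the original training script).
# Hyperparameters mirror the "Training Details" section above.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "Qwen/Qwen2-0.5B"
tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

train_data = load_dataset("allenai/qasc", split="train")

def build_example(example):
    # Assumed prompt layout, mirroring the inference prompt in "How to Get Started".
    text = (
        f"Context:\nFact 1: {example['fact1']}\nFact 2: {example['fact2']}\n\n"
        f"Question: {example['question']}\n\n"
        f"Reasoned Answer: {example['answerKey']}"
    )
    return tokenizer(text, truncation=True, max_length=512)

tokenized = train_data.map(build_example, remove_columns=train_data.column_names)

args = TrainingArguments(
    output_dir="./musr-ensemble-v3",    # assumption
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    weight_decay=0.1,
    warmup_steps=0,
    lr_scheduler_type="polynomial",
    bf16=True,                          # bf16 mixed precision
    num_train_epochs=3,                 # assumption: not reported on the card
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```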
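Likewise, the evaluation procedure above (accuracy over 500 random QASC validation samples) could be approximated with the loop below. This is a sketch under assumptions: the exact scoring script is not published with this card, and the letter-extraction heuristic and random seed are illustrative choices.

```python
# Evaluation sketch (assumption: not the card's original scoring script).
import re
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device).eval()

# 500 random validation samples, as described under "Evaluation Procedure".
samples = load_dataset("allenai/qasc", split="validation").shuffle(seed=0).select(range(500))

correct = 0
for ex in samples:
    choice_block = "\n".join(
        f"{label}. {text}"
        for label, text in zip(ex["choices"]["label"], ex["choices"]["text"])
    )
    prompt = (
        f"Context:\nFact 1: {ex['fact1']}\nFact 2: {ex['fact2']}\n\n"
        f"Question: {ex['question']}\n\nChoices:\n{choice_block}\n\nReasoned Answer:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=150,
            num_beams=5,
            do_sample=False,            # temperature is ignored when sampling is off
            length_penalty=1.0,
            no_repeat_ngram_size=3,
        )
    answer = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    match = re.search(r"\b([A-H])\b", answer)   # heuristic: first standalone letter A-H
    if match and match.group(1) == ex["answerKey"]:
        correct += 1

print(f"Accuracy: {correct / len(samples):.2%}")
```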