---
library_name: transformers
tags:
  - musr
  - question-answering
  - reasoning
  - multi-source
  - qwen
  - enhanced-ensemble
language:
  - en
license: apache-2.0
metrics:
  - accuracy: 1
  - confidence: 1.1167
  - source_usage: 0.9972
datasets:
  - allenai/qasc
---

# Model Card for ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3

## Model Details

### Model Description

This model is a highly optimized version of Qwen-0.5B, designed specifically to excel at multi-source reasoning (MUSR). It is the third version of our enhanced ensemble architecture and achieves exceptional performance on the MUSR benchmark.

  • Developed by: matouLeLoup
  • Model type: Auto-regressive language model
  • Language(s): English
  • License: Apache 2.0
  • Finetuned from model: Qwen/Qwen2-0.5B

## Training and Evaluation

### Training Data

  • Base model: Qwen-0.5B
  • Fine-tuning dataset: allenai/qasc
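The two supporting facts used in each prompt come directly from QASC. A minimal sketch of loading the data with the `datasets` library (field names as published in the allenai/qasc release; the exact preprocessing used for fine-tuning is not part of this card):

```python
from datasets import load_dataset

# Load the QASC splits used for fine-tuning and evaluation.
qasc = load_dataset("allenai/qasc")

# Each example carries a question, eight answer choices, an answer key,
# and the two supporting facts referenced in the prompt format below.
example = qasc["validation"][0]
print(example["question"])
print(example["fact1"], "|", example["fact2"])
print(example["answerKey"])
```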

### Evaluation Results

Tested on 500 samples from the QASC validation set:

  • Accuracy: 100%
  • Confidence: 1.1167 (±0.0171)
  • Source Usage: 99.72%
  • Response Length: 170.5 words (±22.8)
  • Reasoning Steps: 1.36 average

Confidence Distribution:

  • > 1.1 : 95.8%
  • 1.0-1.1 : 4.2%
  • < 1.0 : 0%
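For reference, a minimal sketch of how per-sample confidence scores could be bucketed into this distribution; the `confidences` list below is a hypothetical stand-in for the scores produced by the evaluation run, whose exact computation is not described in this card:

```python
from collections import Counter

def bucket(score):
    # Map a confidence score onto the buckets reported above.
    if score > 1.1:
        return "> 1.1"
    if score >= 1.0:
        return "1.0-1.1"
    return "< 1.0"

confidences = [1.12, 1.13, 1.05, 1.11]  # hypothetical per-sample scores
counts = Counter(bucket(c) for c in confidences)
for name in ("> 1.1", "1.0-1.1", "< 1.0"):
    print(f"{name}: {100 * counts[name] / len(confidences):.1f}%")
```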

## Uses

### Direct Use

This model is optimized for:

  • Multi-source question answering
  • Logical reasoning
  • Document analysis and synthesis
  • Decision-support systems
  • Educational applications

## How to Get Started

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3")
tokenizer = AutoTokenizer.from_pretrained("matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3")

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Placeholder inputs: replace with your own facts, question, and answer choices
fact1 = "First supporting fact."
fact2 = "Second supporting fact."
question = "Your question?"
choices = "A) ...\nB) ...\nC) ...\nD) ...\nE) ...\nF) ...\nG) ...\nH) ..."

# Optimal prompt format
prompt = f"""Context:
Fact 1: {fact1}
Fact 2: {fact2}

Question: {question}

Choices:
{choices}

Instructions:
1. Analyze both facts carefully
2. Connect the information
3. Choose the letter (A-H) that best answers the question
4. Explain your reasoning

Reasoned Answer:"""

# Generation
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    num_beams=5,
    temperature=0.6,
    no_repeat_ngram_size=3
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```
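Because the model is trained to answer with a single letter from A-H after the `Reasoned Answer:` marker, downstream code usually needs to extract that letter from the generated text. A small, hypothetical post-processing sketch (the `extract_answer` helper and its regex are illustrative, not part of the model):

```python
import re

def extract_answer(response):
    # Look only at the text generated after the "Reasoned Answer:" marker
    # and return the first standalone letter A-H, if any.
    answer_part = response.split("Reasoned Answer:")[-1]
    match = re.search(r"\b([A-H])\b", answer_part)
    return match.group(1) if match else None

# `response` is the decoded output from the example above.
print(extract_answer(response))
```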

## Training Details

### Training Procedure

#### Training Hyperparameters

  • Learning rate: 2e-5
  • Batch size: 32
  • Weight decay: 0.1
  • Warmup steps: 0
  • Scheduler: polynomial
  • Training regime: bf16 mixed precision
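These values map directly onto the Hugging Face `TrainingArguments` API. A minimal sketch, assuming the standard `Trainer` was used (the actual training script and output directory are not published with this card):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ece-prymmal-0.5b-ft-musr",  # hypothetical output directory
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    weight_decay=0.1,
    warmup_steps=0,
    lr_scheduler_type="polynomial",
    bf16=True,                              # bf16 mixed precision
)
```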

## Evaluation Procedure

  • Tested on 500 random samples from the QASC validation set
  • Evaluated for accuracy, confidence, and source usage
  • Detailed analysis of reasoning steps and response quality
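A minimal sketch of such an accuracy evaluation, reusing `model`, `tokenizer`, `device`, and `extract_answer` from the snippets above and assuming the prompt format shown in "How to Get Started" (the confidence and source-usage metrics are not specified in enough detail to reproduce here):

```python
import random
from datasets import load_dataset

qasc_val = load_dataset("allenai/qasc", split="validation")
indices = random.sample(range(len(qasc_val)), k=500)

correct = 0
for i in indices:
    ex = qasc_val[i]
    choices = "\n".join(
        f"{label}) {text}"
        for label, text in zip(ex["choices"]["label"], ex["choices"]["text"])
    )
    prompt = (
        f"Context:\nFact 1: {ex['fact1']}\nFact 2: {ex['fact2']}\n\n"
        f"Question: {ex['question']}\n\nChoices:\n{choices}\n\n"
        "Instructions:\n"
        "1. Analyze both facts carefully\n"
        "2. Connect the information\n"
        "3. Choose the letter (A-H) that best answers the question\n"
        "4. Explain your reasoning\n\n"
        "Reasoned Answer:"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    outputs = model.generate(**inputs, max_new_tokens=150, num_beams=5, no_repeat_ngram_size=3)
    prediction = extract_answer(tokenizer.decode(outputs[0], skip_special_tokens=True))
    correct += int(prediction == ex["answerKey"])

print(f"Accuracy: {correct / len(indices):.2%}")
```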

## Limitations and Bias

  • Optimized specifically for the MUSR format
  • Requires precisely structured prompts
  • Designed for multiple-choice questions with reasoning

## Technical Specifications

  • Base model: Qwen-0.5B
  • Enhanced with optimized generation parameters
  • Uses letter-based answer format (A-H)

## Generation Config

```python
generation_config = {
    "max_new_tokens": 150,
    "num_beams": 5,
    "temperature": 0.6,
    "do_sample": False,
    "length_penalty": 1.0,
    "no_repeat_ngram_size": 3
}
```
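This dictionary can be unpacked straight into `model.generate`; a short usage example, reusing `model`, `tokenizer`, and `inputs` from the snippet above:

```python
# With do_sample=False, decoding is deterministic beam search,
# so the temperature value has no effect on the output.
outputs = model.generate(**inputs, **generation_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```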

## Citation

```bibtex
@misc{PRYMMAL-EnhancedMUSREnsembleV3,
  author = {matouLeLoup},
  title = {ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3},
  year = {2024},
  publisher = {Hugging Face},
  journal = {Hugging Face Hub},
  howpublished = {\url{https://huggingface.co/matouLeLoup/ECE-PRYMMAL-0.5B-FT-EnhancedMUSREnsembleV3}}
}
```