Decepticore/LLMEVAL1


Decepticore committed on Oct 18

Commit 51ee2ef • 1 Parent(s): d57e07c

Create README.md

Files changed (1)

README.md +115 -0

README.md ADDED

	@@ -0,0 +1,115 @@

+---
+language:
+- en
+license: apache-2.0
+tags:
+- pytorch
+- causal-lm
+- pythia
+datasets:
+- hellaswag
+metrics:
+- accuracy
+---
+# Model Card for EleutherAI/pythia-160m HellaSwag Evaluation
+This model card presents the evaluation results of the EleutherAI/pythia-160m model on the HellaSwag task.
+## Model Details
+### Model Description
+- **Developed by:** EleutherAI
+- **Model type:** Causal Language Model
+- **Language(s):** English
+- **License:** Apache 2.0
+- **Evaluated model:** EleutherAI/pythia-160m
+### Model Sources
+- **Repository:** [EleutherAI/pythia-160m](https://huggingface.co/EleutherAI/pythia-160m)
+- **Paper:** [More Information Needed]
+## Uses
+### Direct Use
+This evaluation demonstrates the model's performance on the HellaSwag task, which tests for commonsense reasoning in AI systems.
+### Out-of-Scope Use
+This evaluation is specific to the HellaSwag task and may not be indicative of the model's performance on other tasks or in real-world applications.
+## Bias, Risks, and Limitations
+The evaluation results should be interpreted within the context of the HellaSwag task. The model may exhibit biases present in its training data or the evaluation dataset.
+### Recommendations
+Users should be aware of the model's limitations and consider additional evaluation on task-specific datasets before deployment in real-world applications.
+## How to Get Started with the Model
+To use this model for the HellaSwag task:
+```python
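+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+# Load the step100000 checkpoint of pythia-160m and its matching tokenizer.
+model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-160m", revision="step100000")
+tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m", revision="step100000")
+model.eval()  # inference only: disables dropout
+# Score each candidate HellaSwag ending with the model and pick the most likely one.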
+```
+## Training Details
+### Training Data
+This card covers evaluation rather than training: the model was evaluated on the HellaSwag dataset. For more information, visit [the HellaSwag dataset page](https://huggingface.co/datasets/hellaswag).
+### Training Procedure
+#### Training Hyperparameters
+- **Training regime:** float32
+## Evaluation
+### Testing Data, Factors & Metrics
+#### Testing Data
+The model was evaluated on the HellaSwag validation split, which consists of 10,042 samples.
+#### Metrics
+- **Accuracy (acc):** Measures the proportion of correctly predicted answers.
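+- **Normalized Accuracy (acc_norm):** A variant of accuracy in which each candidate ending's log-likelihood is divided by the ending's length before the highest-scoring ending is chosen, which reduces the bias toward short endings.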
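As a rough illustration of how the two metrics can differ, the sketch below scores multiple-choice examples from per-ending log-likelihoods. It assumes that acc_norm divides each ending's log-likelihood by the length of its text, which is one common convention; this card does not state which tool produced the reported numbers. The function names and the toy log-likelihoods are hypothetical and are not outputs of the evaluated model.

```python
def pick_ending(loglikes, endings, normalize=False):
    """Return the index of the highest-scoring candidate ending."""
    scores = []
    for loglike, ending in zip(loglikes, endings):
        if normalize:
            # Assumed convention: divide by the ending's character length.
            loglike = loglike / len(ending)
        scores.append(loglike)
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(examples, normalize=False):
    """Fraction of examples where the chosen ending matches the label."""
    correct = sum(
        pick_ending(ex["loglikes"], ex["endings"], normalize) == ex["label"]
        for ex in examples
    )
    return correct / len(examples)


# Hypothetical toy data: summed log-likelihoods for each candidate ending.
examples = [
    {
        "endings": ["sits.", "walks slowly to the door and opens it."],
        "loglikes": [-6.0, -20.0],
        "label": 1,  # the long ending is correct but has a lower raw score
    },
    {
        "endings": ["runs away.", "eats."],
        "loglikes": [-4.0, -9.0],
        "label": 0,
    },
]

print("acc:     ", accuracy(examples))
print("acc_norm:", accuracy(examples, normalize=True))
```

In the first toy example the correct ending is long, so its raw log-likelihood is lower than that of the short distractor; dividing by length reverses the ranking. This is the kind of length effect that the normalized metric is meant to reduce.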
+### Results
+| Metric | Value | Standard Error |
+|--------|-------|----------------|
+| Accuracy | 0.28719 | 0.00452 |
+| Normalized Accuracy | 0.30821 | 0.00461 |
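The reported standard errors can be sanity-checked against the binomial formula sqrt(p(1 - p) / n), using the 10,042 samples stated in this card. This is a minimal sketch under that assumption; the card does not say how the standard errors were actually computed, and `binomial_stderr` is a hypothetical helper name.

```python
import math


def binomial_stderr(p, n):
    """Standard error of a proportion p estimated from n independent samples."""
    return math.sqrt(p * (1.0 - p) / n)


N_SAMPLES = 10042  # sample count stated in this card

# (metric name, reported value, reported standard error) from the results table
reported = [
    ("Accuracy", 0.28719, 0.00452),
    ("Normalized Accuracy", 0.30821, 0.00461),
]

for name, value, stderr in reported:
    estimate = binomial_stderr(value, N_SAMPLES)
    print(f"{name}: reported {stderr:.5f}, binomial estimate {estimate:.5f}")
```

Both estimates agree with the reported values to within rounding, which is consistent with the standard errors reflecting sampling noise over the 10,042 evaluation samples.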
+## Environmental Impact
+- **Hardware Type:** Tesla T4 GPU
+- **Hours used:** Approximately 0.095 hours (341.39 seconds)
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications
+### Model Architecture and Objective
+EleutherAI/pythia-160m is a causal language model with approximately 162 million parameters.
+### Compute Infrastructure
+- **Hardware:** Tesla T4 GPU
+- **Software:** PyTorch 2.4.1+cu121, Transformers 4.44.2
+- **Date of Evaluation:** October 18, 2024