Decepticore committed
Commit 51ee2ef
1 Parent(s): d57e07c

Create README.md

Files changed (1)
  1. README.md +115 -0
README.md ADDED
---
language:
- en
license: apache-2.0
tags:
- pytorch
- causal-lm
- pythia
datasets:
- hellaswag
metrics:
- accuracy
---

# Model Card for EleutherAI/pythia-160m HellaSwag Evaluation

This model card presents the evaluation results of the EleutherAI/pythia-160m model on the HellaSwag task.

## Model Details

### Model Description

- **Developed by:** EleutherAI
- **Model type:** Causal Language Model
- **Language(s):** English
- **License:** Apache 2.0
- **Evaluated model:** EleutherAI/pythia-160m

### Model Sources

- **Repository:** [EleutherAI/pythia-160m](https://huggingface.co/EleutherAI/pythia-160m)
- **Paper:** [More Information Needed]

## Uses

### Direct Use

This evaluation demonstrates the model's performance on the HellaSwag task, which tests for commonsense reasoning in AI systems.

### Out-of-Scope Use

This evaluation is specific to the HellaSwag task and may not be indicative of the model's performance on other tasks or in real-world applications.

## Bias, Risks, and Limitations

The evaluation results should be interpreted within the context of the HellaSwag task. The model may exhibit biases present in its training data or the evaluation dataset.

### Recommendations

Users should be aware of the model's limitations and consider additional evaluation on task-specific datasets before deployment in real-world applications.

## How to Get Started with the Model

To use this model for the HellaSwag task:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-160m", revision="step100000")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m", revision="step100000")

# Minimal check: average per-token negative log-likelihood of a candidate sentence
# (lower means the model finds the continuation more plausible)
inputs = tokenizer("A man is sitting on a roof. He starts pulling up roofing on a roof.", return_tensors="pt")
with torch.no_grad():
    loss = model(**inputs, labels=inputs["input_ids"]).loss
print(f"Average NLL: {loss.item():.3f}")
```

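The `acc` and `acc_norm` values reported below use the metric names of EleutherAI's lm-evaluation-harness. The card does not state which tool produced them, but assuming that harness (or an equivalent setup) was used, a run along the following lines should produce a comparable evaluation; the `simple_evaluate` call and result keys follow harness v0.4.x and may differ in other versions:

```python
# Hedged reproduction sketch using lm-evaluation-harness (pip install lm-eval);
# not confirmed to be the exact tool or settings behind this card.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=EleutherAI/pythia-160m,revision=step100000,dtype=float32",
    tasks=["hellaswag"],
    batch_size=8,
)
print(results["results"]["hellaswag"])  # reports acc and acc_norm with standard errors
```
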
## Training Details

### Training Data

The model was evaluated on the HellaSwag dataset. For more information, visit [the HellaSwag dataset page](https://huggingface.co/datasets/hellaswag).

### Training Procedure

#### Training Hyperparameters

- **Training regime:** float32 (see the loading snippet below)

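A minimal sketch of loading the checkpoint at the stated float32 precision; the `torch_dtype` argument is standard `transformers` usage rather than anything this card specifies:

```python
import torch
from transformers import AutoModelForCausalLM

# Keep the weights in float32, matching the precision listed above
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/pythia-160m",
    revision="step100000",
    torch_dtype=torch.float32,
)
```
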
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated on the validation split of the HellaSwag dataset, which consists of 10,042 samples.

#### Metrics

- **Accuracy (acc):** The proportion of items for which the model assigns the highest log-likelihood to the correct ending.
- **Normalized Accuracy (acc_norm):** The same selection rule, except each ending's log-likelihood is normalized by its length, removing the advantage shorter endings would otherwise have (see the scoring sketch below).

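To make the difference concrete, here is a hedged sketch of scoring a single HellaSwag-style item with this model. The item is illustrative, the `ending_loglikelihood` helper is not part of any library, and the evaluation harness normalizes by the byte length of each ending rather than the character length used here:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-160m", revision="step100000")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-160m", revision="step100000")
model.eval()

def ending_loglikelihood(context: str, ending: str):
    """Return (sum of token log-probs for `ending` given `context`, length of `ending`)."""
    ctx_len = tokenizer(context, return_tensors="pt").input_ids.shape[1]
    ids = tokenizer(context + ending, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)        # distribution over each next token
    token_logprobs = logprobs[torch.arange(ids.shape[1] - 1), ids[0, 1:]]
    return token_logprobs[ctx_len - 1:].sum().item(), len(ending)

# Illustrative item: four candidate endings, one of which is the plausible continuation
context = "A man is sitting on a roof. He"
endings = [" is using wrap to wrap a pair of skis.",
           " is ripping level tiles off.",
           " is holding a rubik's cube.",
           " starts pulling up roofing on a roof."]

scores = [ending_loglikelihood(context, e) for e in endings]
pred_acc = max(range(len(endings)), key=lambda i: scores[i][0])                      # acc rule
pred_acc_norm = max(range(len(endings)), key=lambda i: scores[i][0] / scores[i][1])  # acc_norm rule
print("acc choice:", pred_acc, "acc_norm choice:", pred_acc_norm)
```

Aggregating this per-item decision over all 10,042 samples yields the acc and acc_norm values shown in the Results table below.
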
### Results

| Metric | Value | Standard Error |
|--------|-------|----------------|
| Accuracy | 0.28719 | 0.00452 |
| Normalized Accuracy | 0.30821 | 0.00461 |

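As a sanity check, the reported standard errors match the usual binomial approximation sqrt(p * (1 - p) / n) over the 10,042 evaluated samples:

```python
from math import sqrt

n = 10042  # HellaSwag samples evaluated
for name, p in [("acc", 0.28719), ("acc_norm", 0.30821)]:
    print(name, round(sqrt(p * (1 - p) / n), 5))  # 0.00452 and 0.00461, matching the table
```
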
## Environmental Impact

- **Hardware Type:** Tesla T4 GPU
- **Hours used:** Approximately 0.095 hours (341.39 seconds)
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications

### Model Architecture and Objective

EleutherAI/pythia-160m is a decoder-only causal language model with approximately 162 million parameters, trained with an autoregressive next-token prediction objective.

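The parameter count is easy to confirm from the checkpoint itself; this snippet is a quick check rather than anything specified by the card:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-160m")
total = sum(p.numel() for p in model.parameters())
print(f"{total / 1e6:.1f}M parameters")  # roughly 162M, embedding parameters included
```
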
### Compute Infrastructure

- **Hardware:** Tesla T4 GPU
- **Software:** PyTorch 2.4.1+cu121, Transformers 4.44.2
- **Date of Evaluation:** October 18, 2024