PEFT
Safetensors

dudleymax committed on
Commit 28b1658 · verified · 1 Parent(s): 4d72828

Update README.md

Files changed (1)
  1. README.md +64 -64

README.md CHANGED
@@ -23,29 +23,31 @@ license: mit

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

-
- - **Developed by:** Kevin Geejo, Aniket Yadav, Rishab Pandey
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]

## Uses

@@ -55,37 +57,48 @@ license: mit

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

- [More Information Needed]

### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

- [More Information Needed]

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

- [More Information Needed]

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

- [More Information Needed]

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

## How to Get Started with the Model

- Use the code below to get started with the model.

- [More Information Needed]

## Training Details

@@ -93,26 +106,18 @@ Use the code below to get started with the model.

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

- [More Information Needed]

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

- #### Preprocessing [optional]
-
- [More Information Needed]
-
- #### Training Hyperparameters
-
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

#### Speeds, Sizes, Times [optional]

- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]

## Evaluation

@@ -122,95 +127,90 @@ Use the code below to get started with the model.

#### Testing Data

- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]

#### Factors

- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]

#### Metrics

- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]

### Results

- [More Information Needed]

#### Summary

-

## Model Examination [optional]

<!-- Relevant interpretability work for the model goes here -->

- [More Information Needed]

## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

- [More Information Needed]

### Compute Infrastructure

- [More Information Needed]

#### Hardware

- [More Information Needed]

#### Software

- [More Information Needed]
-
- **BibTeX:**

- [More Information Needed]

- **APA:**
-
- [More Information Needed]

## Glossary [optional]

<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

- [More Information Needed]

## More Information [optional]

- [More Information Needed]

## Model Card Authors [optional]

- [More Information Needed]

## Model Card Contact

- [More Information Needed]
-

### Framework versions

- - PEFT 0.7.1


+
+
## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

+ This model fine-tunes the Llama-2 7B large language model for Python code generation. The project uses Ludwig, an open-source declarative ML toolkit, and a dataset of 500k Python code samples from Hugging Face. It applies techniques such as prompt templating, zero-shot inference, and few-shot learning to improve the model's performance at generating Python code snippets efficiently; a sketch of such a prompt appears below.
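+
+ The exact prompt template the authors used is not given in this card; the sketch below shows what an instruction-style, few-shot prompt for Python code generation might look like (the header format and the examples are assumptions):
+
+ ```python
+ # Hypothetical few-shot prompt template; the actual template used during
+ # fine-tuning is not specified in this card.
+ FEW_SHOT_PROMPT = """### Instruction:
+ Write a Python function that returns the factorial of n.
+
+ ### Response:
+ def factorial(n):
+     return 1 if n <= 1 else n * factorial(n - 1)
+
+ ### Instruction:
+ {instruction}
+
+ ### Response:
+ """
+
+ prompt = FEW_SHOT_PROMPT.format(
+     instruction="Write a Python function that reverses a string."
+ )
+ ```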

+ - **Developed by:** Kevin Geejo, Aniket Yadav, Rishab Pandey
+ - **Funded by [optional]:** No specific funding agency identified
+ - **Shared by [optional]:** No additional sharing information provided
+ - **Model type:** Fine-tuned Llama-2 7B for Python code generation
+ - **Language(s) (NLP):** Python (for code generation tasks)
+ - **License:** The card metadata declares MIT; Llama-2 derivatives are also governed by Meta's Llama 2 Community License
+ - **Finetuned from model [optional]:** Llama-2 7B (Meta AI, 2023)

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

+ - **Repository:** Hugging Face (trained models uploaded, no specific link provided)
+ - **Paper [optional]:** Not explicitly mentioned
+ - **Demo [optional]:** No demo link provided

## Uses


<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

+ - Python code generation for software development
+ - Automation of coding tasks
+ - Developer productivity enhancement

### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

+ - Code completion, bug fixing, and Python code translation

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

+ - Non-Python programming tasks
+ - Generation of sensitive, legal, or medical content

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

+ - Limited to Python programming tasks
+ - Dataset biases from Hugging Face's Python Code Dataset
+ - Environmental impact from computational costs during fine-tuning

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

+ Users should be aware of computational efficiency trade-offs and potential limitations in generalizing to new Python tasks.

## How to Get Started with the Model

+ Use the code below to get started with the model:
+
+ ```python
+ # Load the base Llama-2 7B model, then attach this repo's PEFT (LoRA) adapter.
+ # "llama-2-7b-python" is the adapter id used in the card; substitute the
+ # actual repository path if it differs.
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from peft import PeftModel
+
+ base_id = "meta-llama/Llama-2-7b-hf"
+ tokenizer = AutoTokenizer.from_pretrained(base_id)
+ base_model = AutoModelForCausalLM.from_pretrained(base_id)
+ model = PeftModel.from_pretrained(base_model, "llama-2-7b-python")
+ ```
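+
+ Once the adapter is loaded, generation works like any other `transformers` causal language model. The prompt and decoding settings below are illustrative, not values from the card:
+
+ ```python
+ # Generate a Python snippet from a simple instruction prompt (example values).
+ inputs = tokenizer(
+     "Write a Python function that checks whether a number is prime.",
+     return_tensors="pt",
+ )
+ outputs = model.generate(**inputs, max_new_tokens=128)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```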

## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

+ - 500k Python code samples sourced from Hugging Face

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

+ - **Preprocessing [optional]:** Prompt templating applied to the Hugging Face Python Code Dataset
+ - **Training regime:** Parameter-Efficient Fine-Tuning (PEFT) with Low-Rank Adaptation (LoRA); see the configuration sketch below
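+
+ The card names Ludwig 0.8 with PEFT/LoRA but does not include the actual training configuration. The following is a minimal sketch of a Ludwig LLM fine-tuning setup with a LoRA adapter; the feature names, dataset file, and hyperparameters are assumptions, not values from the card:
+
+ ```python
+ # Hypothetical Ludwig 0.8 configuration for LoRA fine-tuning of Llama-2 7B.
+ from ludwig.api import LudwigModel
+
+ config = {
+     "model_type": "llm",
+     "base_model": "meta-llama/Llama-2-7b-hf",
+     "input_features": [{"name": "instruction", "type": "text"}],
+     "output_features": [{"name": "output", "type": "text"}],
+     "adapter": {"type": "lora"},  # train only the low-rank update matrices
+     "trainer": {
+         "type": "finetune",
+         "epochs": 1,              # illustrative value
+         "learning_rate": 0.0001,  # illustrative value
+     },
+ }
+
+ model = LudwigModel(config=config)
+ train_stats = model.train(dataset="python_code_samples.csv")  # hypothetical file
+ ```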

#### Speeds, Sizes, Times [optional]

+ - Not explicitly documented

  ## Evaluation

#### Testing Data

+ - Derived from Python code datasets on Hugging Face

#### Factors

+ - Python code generation tasks

#### Metrics

+ - Code correctness and efficiency (an illustrative check is sketched below)
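+
+ The card does not describe how correctness or efficiency was measured. A common functional-correctness check, shown here purely as an illustration and not as the authors' protocol, is to execute a generated function against a few test assertions:
+
+ ```python
+ # Illustrative correctness check: execute generated code, then test it.
+ # The generated snippet and the test case are made-up examples.
+ generated_code = """
+ def reverse_string(s):
+     return s[::-1]
+ """
+
+ namespace = {}
+ exec(generated_code, namespace)  # defines reverse_string in `namespace`
+ assert namespace["reverse_string"]("abc") == "cba"
+ print("generated function passed the test case")
+ ```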
### Results

+ - Fine-tuning improved Python code generation performance

#### Summary

+ The fine-tuned model showed enhanced proficiency in generating Python code snippets, reflecting its adaptability to specific coding tasks.

## Model Examination [optional]

<!-- Relevant interpretability work for the model goes here -->

+ [More Information Needed]

## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute).

+ - **Hardware Type:** Not specified
+ - **Hours used:** Not specified
+ - **Cloud Provider:** Not specified
+ - **Compute Region:** Not specified
+ - **Carbon Emitted:** Not specified

## Technical Specifications [optional]

### Model Architecture and Objective

+ - Llama-2 7B model architecture fine-tuned for Python code generation

### Compute Infrastructure

+ - Not explicitly mentioned

#### Hardware

+ - Not specified

#### Software

+ - Ludwig toolkit and Hugging Face integration

+ **BibTeX:**

+ [More Information Needed]

+ **APA:**

+ [More Information Needed]

  ## Glossary [optional]

<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

+ - **Llama-2:** Openly available large language model family by Meta AI
+ - **LoRA (Low-Rank Adaptation):** Efficient fine-tuning method that trains a small number of additional low-rank parameters instead of the full model; see the formulation below
+ - **PEFT:** Parameter-efficient fine-tuning, a family of techniques (including LoRA) that adapt large models by updating only a small subset of parameters
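+
+ For reference, the standard LoRA formulation (general background, not specific to this card): a pretrained weight matrix $W_0 \in \mathbb{R}^{d \times k}$ is frozen and a low-rank update is learned,
+
+ $$
+ W = W_0 + \Delta W = W_0 + BA, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k),
+ $$
+
+ so each adapted matrix trains only $r(d + k)$ parameters instead of $dk$.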

## More Information [optional]

+ [More Information Needed]

## Model Card Authors [optional]

+ Kevin Geejo, Aniket Yadav, Rishab Pandey

## Model Card Contact

### Framework versions

+ - **Llama-2 model size:** 7B
+ - **Ludwig version:** 0.8
+ - **Hugging Face integration:** Latest