Update README.md
README.md
CHANGED
@@ -192,13 +192,13 @@ Meta developed and released the Meta Llama 3 family of large language models (LL
|
|
192 |
|
193 |
**Model developers** Meta
|
194 |
|
195 |
-
**Variations**
|
196 |
|
197 |
**Input** Models input text only.
|
198 |
|
199 |
**Output** Models generate text and code only.
|
200 |
|
201 |
-
**Model Architecture**
|
202 |
|
203 |
|
204 |
<table>
|
@@ -219,7 +219,7 @@ Meta developed and released the Meta Llama 3 family of large language models (LL
|
|
219 |
</td>
|
220 |
</tr>
|
221 |
<tr>
|
222 |
-
<td rowspan="2" >
|
223 |
</td>
|
224 |
<td rowspan="2" >A new mix of publicly available online data.
|
225 |
</td>
|
@@ -247,11 +247,11 @@ Meta developed and released the Meta Llama 3 family of large language models (LL
|
|
247 |
</table>
|
248 |
|
249 |
|
250 |
-
**Llama 3 family of models**. Token counts refer to pretraining data only.
|
251 |
|
252 |
**Model Release Date** April 18, 2024.
|
253 |
|
254 |
-
**Status** This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback.
|
255 |
|
256 |
**License** A custom commercial license is available at: [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)
|
257 |
|
@@ -283,18 +283,6 @@ See the snippet below for usage with Transformers:
|
|
283 |
>>> pipeline("Hey how are you doing today?")
|
284 |
```
|
285 |
|
286 |
-
### Use with `llama3`
|
287 |
-
|
288 |
-
Please, follow the instructions in the [repository](https://github.com/meta-llama/llama3).
|
289 |
-
|
290 |
-
To download Original checkpoints, see the example command below leveraging `huggingface-cli`:
|
291 |
-
|
292 |
-
```
|
293 |
-
huggingface-cli download meta-llama/Meta-Llama-3-8B --include "original/*" --local-dir Meta-Llama-3-8B
|
294 |
-
```
|
295 |
-
|
296 |
-
For Hugging Face support, we recommend using transformers or TGI, but a similar command works.
|
297 |
-
|
298 |
## Hardware and Software
|
299 |
|
300 |
**Training Factors** We used custom training libraries and 1x NVIDIA RTX A6000 for fine-tuning and quantization. Annotation and evaluation were also performed on third-party cloud compute.
|
@@ -359,7 +347,7 @@ For Hugging Face support, we recommend using transformers or TGI, but a similar
|
|
359 |
|
360 |
## Benchmarks
|
361 |
|
362 |
-
In this section, we report the results for Llama 3 models on standard automatic benchmarks. For all the evaluations, we use our internal evaluations library. For details on the methodology see [here](https://github.com/meta-llama/llama3/blob/main/eval_methodology.md).
|
363 |
|
364 |
|
365 |
### Base pretrained models
|
@@ -640,7 +628,7 @@ In this section, we report the results for Llama 3 models on standard automatic
|
|
640 |
|
641 |
### Responsibility & Safety
|
642 |
|
643 |
-
We
|
644 |
|
645 |
Foundation models are widely capable technologies that are built to be used for a diverse range of applications. They are not designed to meet every developer preference on safety levels for all use cases, out-of-the-box, as those by their nature will differ across different applications.
|
646 |
|
@@ -671,14 +659,14 @@ In addition to responsible use considerations outlined above, we followed a rigo
|
|
671 |
|
672 |
Misuse
|
673 |
|
674 |
-
If you access or use Llama 3, you agree to the Acceptable Use Policy. The most recent copy of this policy can be found at [https://llama.meta.com/llama3/use-policy/](https://llama.meta.com/llama3/use-policy/).
|
675 |
|
676 |
|
677 |
#### Critical risks
|
678 |
|
679 |
<span style="text-decoration:underline;">CBRNE</span> (Chemical, Biological, Radiological, Nuclear, and high-yield Explosives)
|
680 |
|
681 |
-
|
682 |
|
683 |
|
684 |
|
@@ -688,7 +676,7 @@ We have conducted a two fold assessment of the safety of the model in this area:
|
|
688 |
|
689 |
### <span style="text-decoration:underline;">Cyber Security </span>
|
690 |
|
691 |
-
|
692 |
|
693 |
|
694 |
### <span style="text-decoration:underline;">Child Safety</span>
|
@@ -705,14 +693,14 @@ Finally, we put in place a set of resources including an [output reporting mecha
|
|
705 |
|
706 |
## Ethical Considerations and Limitations
|
707 |
|
708 |
-
The core values of Llama 3 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress.
|
709 |
|
710 |
-
But Llama 3 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has been in English, and has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of
|
711 |
|
712 |
Please see the Responsible Use Guide available at [http://llama.meta.com/responsible-use-guide](http://llama.meta.com/responsible-use-guide)
|
713 |
|
714 |
|
715 |
-
## Citation
|
716 |
|
717 |
@article{llama3modelcard,
|
718 |
|
@@ -725,4 +713,4 @@ Please see the Responsible Use Guide available at [http://llama.meta.com/respons
|
|
725 |
url = {https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
|
726 |
}
|
727 |
## Contributors
|
728 |
-
|
|
|
192 |
|
193 |
**Model developers** Meta
|
194 |
|
195 |
+
**Variations** R3 is a fine-tune of the Llama3-8B-Instruct base model.
|
196 |
|
197 |
**Input** Models input text only.
|
198 |
|
199 |
**Output** Models generate text and code only.
|
200 |
|
201 |
+
**Model Architecture** R3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.
|
202 |
|
203 |
|
204 |
<table>
|
|
|
219 |
</td>
|
220 |
</tr>
|
221 |
<tr>
|
222 |
+
<td rowspan="2" >R3
|
223 |
</td>
|
224 |
<td rowspan="2" >A new mix of publicly available online data.
|
225 |
</td>
|
|
|
247 |
</table>
|
248 |
|
249 |
|
250 |
+
**Llama 3 family of models**. Token counts refer to pretraining data only. R3 uses Grouped-Query Attention (GQA) for improved inference scalability.
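
To make the GQA remark above concrete, here is a minimal sketch of how grouped-query attention shares key/value heads across query heads, and why that shrinks the key/value cache at inference time. The figures used (32 layers, 32 query heads, 8 key/value heads, head dimension 128, 2 bytes per value) are assumptions chosen for illustration and are not stated in this card; the function names are hypothetical.

```python
def kv_head_for_query_head(q_head: int, n_heads: int, n_kv_heads: int) -> int:
    """Return the index of the key/value head shared by query head `q_head`."""
    if n_heads % n_kv_heads != 0:
        raise ValueError("n_heads must be a multiple of n_kv_heads")
    # Consecutive query heads form a group that reads the same K/V head.
    group_size = n_heads // n_kv_heads
    return q_head // group_size


def kv_cache_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                             bytes_per_value: int = 2) -> int:
    """Bytes of K/V cache stored for one token across all layers."""
    # The factor 2 covers one key vector and one value vector per head.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


# Illustrative (assumed) configuration, not taken from this card.
N_LAYERS, N_HEADS, N_KV_HEADS, HEAD_DIM = 32, 32, 8, 128

mha_bytes = kv_cache_bytes_per_token(N_LAYERS, N_HEADS, HEAD_DIM)     # one K/V head per query head
gqa_bytes = kv_cache_bytes_per_token(N_LAYERS, N_KV_HEADS, HEAD_DIM)  # shared K/V heads
print(f"cache per token: MHA {mha_bytes} bytes, GQA {gqa_bytes} bytes")
```

Under these assumed numbers, every group of four query heads reads one key/value head, so the cache per token is a quarter of what full multi-head attention would store.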
|
251 |
|
252 |
**Model Release Date** April 18, 2024.
|
253 |
|
254 |
+
**Status** This is a static fine-tuned model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback.
|
255 |
|
256 |
**License** A custom commercial license is available at: [https://llama.meta.com/llama3/license](https://llama.meta.com/llama3/license)
|
257 |
|
|
|
283 |
>>> pipeline("Hey how are you doing today?")
|
284 |
```
|
285 |
|
286 |
## Hardware and Software
|
287 |
|
288 |
**Training Factors** We used custom training libraries and 1x NVIDIA RTX A6000 for fine-tuning and quantization. Annotation and evaluation were also performed on third-party cloud compute.
|
|
|
347 |
|
348 |
## Benchmarks
|
349 |
|
350 |
+
In this section, we report results on standard automatic benchmarks for the Llama 3 models from which R3 was fine-tuned. For all the evaluations, we use our internal evaluations library. For details on the methodology, see [here](https://github.com/meta-llama/llama3/blob/main/eval_methodology.md).
|
351 |
|
352 |
|
353 |
### Base pretrained models
|
|
|
628 |
|
629 |
### Responsibility & Safety
|
630 |
|
631 |
+
We agree with Meta's philosophy that an open approach to AI leads to better, safer products, faster innovation, and a bigger overall market. We are committed to Responsible AI development and took a series of steps to limit misuse and harm and support the open source community.
|
632 |
|
633 |
Foundation models are widely capable technologies that are built to be used for a diverse range of applications. They are not designed to meet every developer preference on safety levels for all use cases, out-of-the-box, as those by their nature will differ across different applications.
|
634 |
|
|
|
659 |
|
660 |
Misuse
|
661 |
|
662 |
+
If you access or use R3, a fine-tuned version of Llama 3, you agree to the Acceptable Use Policy. The most recent copy of this policy can be found at [https://llama.meta.com/llama3/use-policy/](https://llama.meta.com/llama3/use-policy/).
|
663 |
|
664 |
|
665 |
#### Critical risks
|
666 |
|
667 |
<span style="text-decoration:underline;">CBRNE</span> (Chemical, Biological, Radiological, Nuclear, and high-yield Explosives)
|
668 |
|
669 |
+
Llama 3, and by extension R3, has undergone a two-fold assessment of the safety of the model in this area:
|
670 |
|
671 |
|
672 |
|
|
|
676 |
|
677 |
### <span style="text-decoration:underline;">Cyber Security </span>
|
678 |
|
679 |
+
As a fine-tuned version of Llama3-8B-Instruct, R3 has been evaluated by CyberSecEval, Meta’s cybersecurity safety eval suite, measuring Llama 3’s propensity to suggest insecure code when used as a coding assistant, and Llama 3’s propensity to comply with requests to help carry out cyber attacks, where attacks are defined by the industry standard MITRE ATT&CK cyber attack ontology. On our insecure coding and cyber attacker helpfulness tests, Llama 3 behaved in the same range as, or safer than, models of [equivalent coding capability](https://huggingface.co/spaces/facebook/CyberSecEval).
|
680 |
|
681 |
|
682 |
### <span style="text-decoration:underline;">Child Safety</span>
|
|
|
693 |
|
694 |
## Ethical Considerations and Limitations
|
695 |
|
696 |
+
The core values of Llama 3 reflected in R3 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3 addresses users and their needs as they are, without inserting unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress.
|
697 |
|
698 |
+
But Llama 3 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has been in English, and has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of R3 models, developers should perform safety testing and tuning tailored to their specific applications of the model. As outlined in the Responsible Use Guide, we recommend incorporating [Purple Llama](https://github.com/facebookresearch/PurpleLlama) solutions into your workflows and specifically [Llama Guard](https://ai.meta.com/research/publications/llama-guard-llm-based-input-output-safeguard-for-human-ai-conversations/) which provides a base model to filter input and output prompts to layer system-level safety on top of model-level safety.
|
699 |
|
700 |
Please see the Responsible Use Guide available at [http://llama.meta.com/responsible-use-guide](http://llama.meta.com/responsible-use-guide)
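
The idea of layering system-level safety on top of model-level safety, as described above, can be sketched as a wrapper that screens both the prompt and the response. This is a minimal illustration under stated assumptions, not the Llama Guard API: `guarded_generate`, `generate`, `is_safe`, and `REFUSAL` are hypothetical names, and in a real deployment `is_safe` would call a safety classifier such as Llama Guard while `generate` would call the R3 model.

```python
from typing import Callable

# Hypothetical refusal message returned when either check fails.
REFUSAL = "Sorry, I can't help with that."


def guarded_generate(
    prompt: str,
    generate: Callable[[str], str],
    is_safe: Callable[[str], bool],
    refusal: str = REFUSAL,
) -> str:
    """Run `generate` only on prompts that pass `is_safe`, then screen the output."""
    # Input filter: reject unsafe prompts before they reach the model.
    if not is_safe(prompt):
        return refusal
    response = generate(prompt)
    # Output filter: reject unsafe completions before they reach the user.
    if not is_safe(response):
        return refusal
    return response


if __name__ == "__main__":
    # Toy stand-ins for illustration only (assumptions, not real classifiers).
    def toy_is_safe(text: str) -> bool:
        return "attack" not in text.lower()

    def toy_generate(prompt: str) -> str:
        return f"Echo: {prompt}"

    print(guarded_generate("Hello there", toy_generate, toy_is_safe))
    print(guarded_generate("Plan an attack", toy_generate, toy_is_safe))
```

Passing the classifier and the generator as callables keeps the wrapper independent of any particular model or safety tool, so either layer can be swapped without touching the other.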
|
701 |
|
702 |
|
703 |
+
## Citation
|
704 |
|
705 |
@article{llama3modelcard,
|
706 |
|
|
|
713 |
url = {https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
|
714 |
}
|
715 |
## Contributors
|
716 |
+
Porter, Matt A. | Founder/CEO Qompass
|