---
base_model:
- EpistemeAI/ReasoningCore-1B-r1-0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: llama3.2
language:
- en
---

Note: This is an experimental model.

ReasoningCore‑1B-r1-0 (Zero)

# Very fast reasoning LLM

**ReasoningCore‑1B** is a multilingual, reasoning‑enhanced large language model developed by EpistemeAI. Pretrained on vast amounts of publicly available data and instruction‑tuned to excel at nuanced reasoning, dialogue management, retrieval, and summarization tasks, it often outperforms many current open‑source and proprietary conversational models on a range of industry benchmarks. It was fine‑tuned on a reasoning dataset.

---

## Model Information

- **Model Developer:** EpistemeAI
- **Model Architecture:**
  ReasoningCore‑1B is an auto‑regressive language model built on an optimized transformer architecture. It incorporates specialized reasoning pathways and has been fine‑tuned using both supervised learning and reinforcement learning from human feedback (RLHF) to align with human expectations for clarity, accuracy, and safety in complex tasks.

| | Training Data | Params | Input Modalities | Output Modalities | Context Length | GQA | Shared Embeddings | Token Count | Knowledge Cutoff |
|---|---|---|---|---|---|---|---|---|---|
| **ReasoningCore‑1B (text only)** | A new mix of publicly available online data. | 1B | Multilingual Text | Multilingual Text and code | 128k | Yes | Yes | Up to 9T tokens | December 2023 |

- **Supported Languages:**
  Officially supports English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. While the pretraining included a broader range of languages, additional languages can be fine‑tuned for in compliance with the community license and acceptable use policies.
- **Model Release Date:** Sept 25, 2024
- **Status:** Static model trained on an offline dataset. Future iterations may further enhance its reasoning capabilities and safety features.
- **License:** Use is governed by the [Llama 3.2 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE) (a custom, commercial license agreement).
- **Feedback:** For questions or comments, please refer to the [GitHub repository README](https://github.com/meta-llama/llama-models/tree/main/models/llama3_2) or follow the linked instructions.

---

## Intended Use

### Use Cases
- **Conversational AI:** Assistant‑like interactions.
- **Knowledge Retrieval & Summarization:** Dynamic extraction and condensation of information.
- **Mobile AI‑Powered Writing Assistants:** Query reformulation and natural language generation.
- **General Natural Language Generation:** Any application that benefits from advanced reasoning abilities.

### Out of Scope
- Deployments that violate applicable laws or trade compliance regulations.
- Use cases that conflict with the Acceptable Use Policy or licensing terms.
- Deployments in languages not explicitly supported (unless additional safety and performance validations are performed).

---

## How to Use

ReasoningCore‑1B can be integrated using popular machine learning frameworks. Two primary methods are provided:

### Use a system prompt

```python
SYSTEM_PROMPT = """
Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>
"""
```
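With this format, a reply can be split into its reasoning trace and final answer. A minimal sketch using Python's `re` module (the `parse_response` helper and the sample reply are illustrative, not part of the model's API):

```python
import re

def parse_response(text: str) -> dict:
    """Extract the <reasoning> and <answer> sections from a model reply."""
    reasoning = re.search(r"<reasoning>(.*?)</reasoning>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return {
        "reasoning": reasoning.group(1).strip() if reasoning else "",
        "answer": answer.group(1).strip() if answer else "",
    }

# Illustrative reply following the requested format (not real model output)
reply = """<reasoning>
2 + 2 combines two pairs, giving four.
</reasoning>
<answer>
4
</answer>"""
print(parse_response(reply))  # {'reasoning': '2 + 2 combines two pairs, giving four.', 'answer': '4'}
```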

### Use with Transformers

Ensure you have transformers version 4.43.0 or later installed:

```bash
pip install --upgrade transformers
```

```python
import torch
from transformers import pipeline

model_id = "EpistemeAI/ReasoningCore-Llama-3.2-1B-r1"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
print(pipe("The secret to effective reasoning is"))
```

### For mathematical problems

Use "Please reason step by step, and put your final answer within \boxed{}" in the system prompt.

## Responsibility & Safety

### Responsible Deployment

#### Approach
- **ReasoningCore‑1B** is a foundational technology that includes built‑in safety guardrails. Developers are encouraged to integrate additional safeguards tailored to their specific applications.

#### System‑Level Safety
- The model is designed to be deployed as part of a broader system that implements safety measures (e.g., Prompt Guard, Code Shield) to ensure outputs remain safe even under adversarial conditions.

---

### Safety Fine‑Tuning & Data Strategy

#### Objectives
- Provide a reliable tool for building secure and helpful reasoning systems.
- Mitigate adversarial misuse through advanced data selection and response optimization techniques.

#### Methodology
- Incorporate adversarial prompts during training to refine model refusals and response tone.
- Combine human‑curated data with synthetic data.
- Apply iterative fine‑tuning using supervised learning, rejection sampling, and preference optimization.

---

### Evaluations and Red Teaming

#### Scaled Evaluations
- Dedicated adversarial datasets were used to rigorously test the model's robustness. Developers should perform context‑specific evaluations.

#### Red Teaming
- Experts in cybersecurity, adversarial machine learning, and responsible AI conducted recurring red‑team exercises to identify vulnerabilities and improve both performance and safety.

---

### Critical Risk Mitigations

- **CBRNE:**
  The model has been evaluated to ensure it does not enhance capabilities for harmful activities involving chemical, biological, radiological, nuclear, or explosive materials.

- **Child Safety:**
  Expert assessments were conducted to evaluate and mitigate potential child safety risks.

- **Cyber Attacks:**
  Measures were taken to ensure the model cannot autonomously facilitate cyber‑offensive operations.

---

### Ethical Considerations and Limitations

#### Core Values
- **ReasoningCore‑1B** is built on the values of openness, inclusivity, and helpfulness. It is designed to respect user autonomy and foster free thought and expression while mitigating potential harm.

#### Testing and Limitations
- Despite extensive testing across diverse scenarios, the model may occasionally produce inaccurate, biased, or objectionable outputs. Developers must perform additional safety testing and integrate further safeguards as needed.

#### Resources for Safe Deployment (Meta safety resources)
- [Responsible Use Guide](https://llama.meta.com/responsible-use-guide)
- [Trust and Safety Resources](https://llama.meta.com/trust-and-safety)
- [Getting Started Guide](https://llama.meta.com/docs/get-started)

---

### Conclusion

**ReasoningCore‑1B** represents a significant advancement in multilingual, reasoning‑enhanced language models. Optimized for tasks requiring deep reasoning, contextual understanding, and safe, helpful interactions, it offers a powerful tool for both commercial and research applications. We invite developers and researchers to explore its capabilities and contribute to building secure, innovative AI systems.

For further details, questions, or feedback, please email [email protected].

---

# Uploaded model

- **Developed by:** EpistemeAI

This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)