prithivMLmods committed on
Commit 24bd0fe · verified · 1 Parent(s): 6439237

Update README.md

  1. README.md +50 -1
README.md CHANGED
@@ -14,4 +14,53 @@ tags:
  # **Calcium 20B**
 
  Calcium 20B, based on the Llama 3.1 collection of multilingual large language models (LLMs), is a collection of pretrained and instruction-tuned generative models optimized for multilingual dialogue use cases. These models outperform many available open-source alternatives.
- Model Architecture: Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions are fine-tuned using supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. Calcium 20B is trained on synthetic reasoning datasets for mathematical reasoning and science-based problem solving, focusing on following instructions or keywords embedded in the input.
+ Model Architecture: Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions are fine-tuned using supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety. Calcium 20B is trained on synthetic reasoning datasets for mathematical reasoning and science-based problem solving, focusing on following instructions or keywords embedded in the input.
+
+ # **Use with transformers**
+
+ With `transformers` 4.43.0 or later, you can run conversational inference using the Transformers `pipeline` abstraction or the Auto classes with the `generate()` function.
+
+ Update your installation with `pip install --upgrade transformers`.
+
+ ```python
+ import transformers
+ import torch
+
+ model_id = "prithivMLmods/Calcium-20B"
+
+ # Load the model in bfloat16 and let device_map="auto" place it on the available devices.
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model_id,
+     model_kwargs={"torch_dtype": torch.bfloat16},
+     device_map="auto",
+ )
+
+ # Chat history as a list of role/content messages.
+ messages = [
+     {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
+     {"role": "user", "content": "Who are you?"},
+ ]
+
+ outputs = pipeline(
+     messages,
+     max_new_tokens=256,
+ )
+ # The last entry of `generated_text` is the assistant's reply message.
+ print(outputs[0]["generated_text"][-1])
+ ```
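
The added README text mentions the Auto classes with `generate()`, but the commit only shows the `pipeline` route. The following is a minimal sketch of the second route, not the author's code. It assumes the repository ships a chat template that `tokenizer.apply_chat_template` can use, which is not confirmed by this commit. The helper names `build_messages` and `generate_reply` are hypothetical.

```python
MODEL_ID = "prithivMLmods/Calcium-20B"  # model id taken from the README


def build_messages(system_prompt, user_prompt):
    """Return a chat history in the role/content format used by the README."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def generate_reply(messages, model_id=MODEL_ID, max_new_tokens=256):
    """Hypothetical helper: run one chat turn with the Auto classes and generate()."""
    # Imported lazily so build_messages can be used without the heavy libraries.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )

    # Assumption: the tokenizer provides a chat template for this model.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    ).to(model.device)

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, skipping the prompt tokens.
    prompt_length = inputs["input_ids"].shape[-1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)


# Example call (downloads the model weights, so it is left commented out):
# print(generate_reply(build_messages("You are a helpful assistant.", "Who are you?")))
```

Loading the tokenizer and model inside the function keeps the sketch short. In a long-running service you would likely load them once and reuse them across calls.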
+
+ # **Intended Use**
+ Calcium 20B is designed for multilingual tasks and problem solving, particularly in science, mathematics, and dialogue-based interactions. Its intended applications include:
+
+ 1. **Multilingual Dialogue**: The model is optimized for conversational AI across multiple languages, making it well suited for chatbots, virtual assistants, and customer support systems.
+ 2. **Mathematical Reasoning**: Calcium 20B is trained on synthetic reasoning datasets, enabling it to work through complex math problems and provide step-by-step explanations.
+ 3. **Scientific Problem Solving**: With fine-tuning on science-based datasets, the model can help answer scientific queries, generate explanations, and support research-related tasks.
+ 4. **Instruction Following**: Through supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF), the model is tuned to follow user instructions and generate relevant, aligned responses.
+ 5. **Content Generation**: It can generate contextually appropriate content in multiple languages, such as articles, summaries, and reports.
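
For the mathematical reasoning use case, a request can be framed with the same role/content message format as the usage example. The sketch below is illustrative only: the system prompt wording and the helper names `math_messages` and `extract_reply` are assumptions, and the `fake_outputs` value is a hand-written stand-in for real pipeline output, not a result from the model.

```python
def math_messages(problem):
    """Build a chat history asking for a step-by-step solution (prompt wording is hypothetical)."""
    return [
        {
            "role": "system",
            "content": "Solve the problem step by step and state the final answer on the last line.",
        },
        {"role": "user", "content": problem},
    ]


def extract_reply(outputs):
    """Return the assistant text from a chat-style text-generation pipeline result."""
    # For chat input, `generated_text` holds the full message list; the reply is last.
    last_message = outputs[0]["generated_text"][-1]
    return last_message["content"]


# Hand-written stand-in with the same shape as pipeline output, so the sketch
# runs without loading the model.
fake_outputs = [
    {
        "generated_text": math_messages("What is 12 * 7?")
        + [{"role": "assistant", "content": "12 * 7 = 84"}]
    }
]
print(extract_reply(fake_outputs))
```

With a loaded pipeline, `math_messages(...)` would be passed as the first argument and `extract_reply` applied to the returned value.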
+
+ # **Limitations**
+ 1. **Multilingual Accuracy**: Although the model is optimized for multilingual tasks, its performance may vary across languages, with better results in languages that have more training data and weaker results in less-represented languages.
+ 2. **Generalization to Complex Real-World Scenarios**: The model may struggle with real-world problems that differ significantly from the synthetic datasets it was trained on.
+ 3. **Mathematical Errors**: Despite being trained on math reasoning datasets, the model can produce incorrect solutions or explanations, particularly for highly complex or unconventional problems.
+ 4. **Hallucinations**: Like many large language models, Calcium 20B may generate plausible-sounding but factually incorrect or irrelevant information.
+ 5. **Computational Resource Requirements**: As a 20B-parameter model, it requires substantial computational resources for inference and fine-tuning, making it less accessible to users with limited hardware.
+ 6. **Overfitting to Synthetic Data**: Because the model relies heavily on synthetic datasets for training, its behavior may reflect biases or patterns specific to those datasets, limiting its applicability to more diverse real-world contexts.
+ 7. **Safety and Alignment Gaps**: Although fine-tuned using RLHF for helpfulness and safety, the model may still generate harmful or unsafe content in some edge cases.
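
To give the resource limitation a rough scale, the sketch below estimates the memory needed just to hold the weights at different precisions. It assumes exactly 20 billion parameters (taken from the model name, not from a published count) and ignores activations, the KV cache, and framework overhead, so real usage will be higher. The `int8` and `int4` rows are hypothetical: the README does not state that quantized variants exist.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}

# Assumption: nominal parameter count implied by the "20B" in the model name.
NOMINAL_PARAMS = 20_000_000_000


def weight_memory_gb(num_params, dtype):
    """Return the weight-only memory footprint in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gb(NOMINAL_PARAMS, dtype):.0f} GB")
# float32: ~80 GB
# bfloat16: ~40 GB
# int8: ~20 GB
# int4: ~10 GB
```

Under these assumptions, the `torch.bfloat16` setting used in the usage example needs about 40 GB for weights alone, which is why `device_map="auto"` may spread the model across several devices.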