prithivMLmods committed
Commit bf34b5d · verified · 1 Parent(s): 90d312a

Update README.md

Files changed (1): README.md +72 -56
README.md CHANGED
@@ -17,75 +17,91 @@ tags:
  - Qwen2.5
  - text-generation-inference
  ---
- ### **QwQ-4B-Instruct-Model-Files**

  **QwQ-4B-Instruct** is a lightweight, efficient language model fine-tuned for instruction-following and reasoning tasks. It is based on a quantized version of **Qwen2.5-7B**, optimized for inference speed and reduced memory consumption while retaining robust capabilities for complex tasks.

- | **File Name** | **Size** | **Description** | **Upload Status** |
- |----------------------------------|-----------|-----------------------------------------------------|-------------------|
- | `.gitattributes` | 1.57 kB | Tracks files stored with Git LFS. | Uploaded |
- | `README.md` | 271 Bytes | Basic project documentation. | Updated |
- | `added_tokens.json` | 657 Bytes | Specifies additional tokens for the tokenizer. | Uploaded |
- | `config.json` | 1.26 kB | Detailed model configuration file. | Uploaded |
- | `generation_config.json` | 281 Bytes | Configuration for text generation settings. | Uploaded |
- | `merges.txt` | 1.82 MB | Byte-pair encoding (BPE) merge rules for the tokenizer. | Uploaded |
- | `model-00001-of-00002.safetensors` | 4.46 GB | Part 1 of the model weights in safetensors format. | Uploaded (LFS) |
- | `model-00002-of-00002.safetensors` | 1.09 GB | Part 2 of the model weights in safetensors format. | Uploaded (LFS) |
- | `model.safetensors.index.json` | 124 kB | Index file for safetensors model sharding. | Uploaded |
- | `special_tokens_map.json` | 644 Bytes | Mapping of special tokens (e.g., `<pad>`, `<eos>`). | Uploaded |
- | `tokenizer.json` | 11.4 MB | Complete tokenizer configuration. | Uploaded (LFS) |
- | `tokenizer_config.json` | 7.73 kB | Settings for tokenizer integration. | Uploaded |
- | `vocab.json` | 2.78 MB | Vocabulary file containing token-to-ID mappings. | Uploaded |
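
As a quick sanity check on the sharding, the index file can be inspected without downloading the multi-GB shards. A minimal sketch; the repo id `prithivMLmods/QwQ-4B-Instruct` is assumed from the usage snippet later in this diff, so adjust it if the files live elsewhere:

```python
import json
from huggingface_hub import hf_hub_download

# Repo id assumed from the usage snippet later in this README.
repo_id = "prithivMLmods/QwQ-4B-Instruct"

# Fetch only the small index file, not the multi-GB weight shards.
index_path = hf_hub_download(repo_id=repo_id, filename="model.safetensors.index.json")
with open(index_path) as f:
    weight_map = json.load(f)["weight_map"]

# Count how many tensors each shard holds.
counts = {}
for shard in weight_map.values():
    counts[shard] = counts.get(shard, 0) + 1
for shard, n in sorted(counts.items()):
    print(f"{shard}: {n} tensors")
```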

- ### **Key Features:**

- 1. **Model Size:**
-    - **4.46B parameters.**

- 2. **Precision Support:**
-    - Available in multiple tensor types:
-      - **FP16**
-      - **F32**
-      - **U8 (quantized)**

- 3. **Model Sharding:**
-    - The model weights are stored in two parts for efficient download:
-      - `model-00001-of-00002.safetensors` (4.46 GB)
-      - `model-00002-of-00002.safetensors` (1.09 GB)
-    - Indexed with `model.safetensors.index.json`.

- 4. **Tokenizer:**
-    - Uses byte-pair encoding (BPE); see the sketch after this list.
-    - Includes:
-      - `vocab.json` (2.78 MB)
-      - `merges.txt` (1.82 MB)
-      - `tokenizer.json` (11.4 MB, pre-trained configuration)
-    - Special tokens mapped in `special_tokens_map.json` (e.g., `<pad>`, `<eos>`).

- 5. **Configuration Files:**
-    - `config.json`: defines the architecture, hyperparameters, and settings.
-    - `generation_config.json`: specifies text generation behavior (e.g., max length, temperature).
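
A minimal sketch of the tokenizer pieces described in item 4, showing how BPE splits text into subword tokens; the repo id is the same assumption as above:

```python
from transformers import AutoTokenizer

# Repo id assumed from the usage snippet later in this README.
tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/QwQ-4B-Instruct")

# BPE breaks unfamiliar words into subword pieces drawn from vocab.json/merges.txt.
print(tokenizer.tokenize("Instruction-following models reason step by step."))

# Round-trip: text -> token ids -> text.
ids = tokenizer("Hello, world!").input_ids
print(ids)
print(tokenizer.decode(ids))
```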
 
 
 
- ---

- ### **Training Dataset:**
- - **Dataset Name:** [amphora/QwQ-LongCoT-130K](https://huggingface.co/datasets/amphora/QwQ-LongCoT-130K)
- - **Size:** 133k examples.
- - **Focus:** Chain-of-Thought reasoning for detailed and logical outputs.
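
To eyeball the training data, the dataset can be streamed with the `datasets` library. This sketch makes no assumption about column names and just prints whatever fields the first record carries:

```python
from datasets import load_dataset

# Stream the dataset to avoid downloading all 133k examples up front.
ds = load_dataset("amphora/QwQ-LongCoT-130K", split="train", streaming=True)

# Print the field names and a truncated view of the first record.
first = next(iter(ds))
for key, value in first.items():
    print(key, "->", str(value)[:120])
```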
 
 
 
- ---

- ### **Use Cases:**
- 1. **Instruction-Following:**
-    - Excels at handling concise and multi-step instructions.

- 2. **Reasoning:**
-    - Well suited to tasks requiring logical deduction and detailed explanation.

- 3. **Text Generation:**
-    - Generates coherent and contextually aware responses across various domains.

- 4. **Resource-Constrained Applications:**
-    - Optimized for scenarios requiring lower computational resources, thanks to its smaller size and quantization; see the sketch after this list.
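
For the resource-constrained case in item 4, one common option is 8-bit loading via `bitsandbytes`. This is an illustrative sketch only, and not necessarily how the U8 tensors in this repo were produced:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Illustrative only: 8-bit loading via bitsandbytes (requires a CUDA GPU).
# Repo id assumed from the usage snippet later in this diff.
model_name = "prithivMLmods/QwQ-4B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Rough check of how much memory the quantized weights occupy.
print(model.get_memory_footprint() / 1e9, "GB")
```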

- ---

+ <pre align="center">
+ ________ ________ _____ ___.
+ \_____ \ __ _ __\_____ \ / | | \_ |__
+ / / \ \ \ \/ \/ / / / \ \ / | |_ | __ \
+ / \_/. \ \ / / \_/. \ / ^ / | \_\ \
+ \_____\ \_/ \/\_/ \_____\ \_/ \____ | |___ /
+ \__> \__> |__| \/
+ </pre>

  **QwQ-4B-Instruct** is a lightweight, efficient language model fine-tuned for instruction-following and reasoning tasks. It is based on a quantized version of **Qwen2.5-7B**, optimized for inference speed and reduced memory consumption while retaining robust capabilities for complex tasks.

+ With strong natural language processing capabilities, **QwQ-4B-Instruct** excels at generating step-by-step solutions, creative content, and logical analyses. It handles both structured and unstructured data, producing text that stays closely aligned with user input.

+ - Significantly **more knowledge** and greatly improved capabilities in **coding** and **mathematics**, thanks to specialized expert models in these domains.
+ - Significant improvements in **instruction following**, **generating long texts** (over 8K tokens), **understanding structured data** (e.g., tables), and **generating structured outputs**, especially JSON. More resilient to diverse system prompts, improving role-play and condition-setting for chatbots.
+ - **Long-context support** up to 128K tokens, with generation of up to 8K tokens.
+ - **Multilingual support** for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.

+ # **Demo Start**

+ Here is a code snippet showing how to load the tokenizer and model with `apply_chat_template` and generate content:

+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer

+ model_name = "prithivMLmods/QwQ-4B-Instruct"

+ # Load the model with automatic dtype selection and device placement.
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_name)

+ prompt = "Give me a short introduction to large language models."
+ messages = [
+     {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
+     {"role": "user", "content": prompt}
+ ]
+ # Render the chat as a single string, ending with the assistant turn marker.
+ text = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

+ generated_ids = model.generate(
+     **model_inputs,
+     max_new_tokens=512
+ )
+ # Drop the prompt tokens so only newly generated text is decoded.
+ generated_ids = [
+     output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
+ ]

+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ print(response)
+ ```
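
As an optional follow-up to the snippet above, `transformers` also ships a `TextStreamer` that prints tokens as they are generated instead of waiting for `generate` to finish:

```python
from transformers import TextStreamer

# Reuses `model`, `tokenizer`, and `model_inputs` from the snippet above.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(**model_inputs, max_new_tokens=512, streamer=streamer)
```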

+ # **Run with Ollama**

+ Ollama makes running machine learning models simple and efficient. Follow these steps to set up and run GGUF models quickly.

+ ## Quick Start: Step-by-Step Guide

+ 1. **Install Ollama 🦙**
+    Download Ollama from [https://ollama.com/download](https://ollama.com/download) and install it on your system.

+ 2. **Create your model file**
+    Create a file named after your model, e.g., `metallama`, and add the following line to specify the base model (ensure the base model file is in the same directory):

+    ```bash
+    FROM Llama-3.2-1B.F16.gguf
+    ```

+ 3. **Create and verify the model**
+    Run the following commands to create your model and confirm it is listed:

+    ```bash
+    ollama create metallama -f ./metallama
+    ollama list
+    ```

+ 4. **Run the model**
+    Start your model with:

+    ```bash
+    ollama run metallama
+    ```

+ 5. **Interact with the model**
+    Once the model is running, interact with it directly:

+    ```plaintext
+    >>> Tell me about Space X.
+    Space X, the private aerospace company founded by Elon Musk, is revolutionizing space exploration...
+    ```
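
Besides the interactive CLI, a running Ollama instance also exposes a local REST API (port 11434 by default). A minimal Python sketch against the `metallama` model created above:

```python
import requests

# Ollama serves a local HTTP API on port 11434 by default.
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "metallama",          # the model created in step 3 above
        "prompt": "Tell me about Space X.",
        "stream": False,               # return one JSON object instead of a token stream
    },
    timeout=300,
)
print(response.json()["response"])
```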

+ ## Conclusion

+ With Ollama, running and interacting with models is seamless. Start experimenting today!