Minami-su/Qwen1.5-0.5B-Chat_mistral

Minami-su committed on Feb 25, 2024

Commit 7d96ca0 · verified · 1 Parent(s): f13ee8a

Update README.md

Files changed (1)

README.md +14 -4

@@ -22,15 +22,25 @@ This model is converted with https://github.com/Minami-su/character_AI_open/blob
 ## special
-Before using this model, you need to modify modeling_mistral.py in transformers library
-vim /root/anaconda3/envs/train/lib/python3.9/site-packages/transformers/models/mistral/modeling_mistral.py
-find MistralAttention,
-modify q,k,v,o bias=False ----->, bias=config.attention_bias
 Before:
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62d7f90b102d144db4b4245b/AKj_fwEoLUKWZ4mViYW-q.png)
 After:
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62d7f90b102d144db4b4245b/A2gSwq9l6Zx8X1qegtgvE.png)
 Usage:
 ```python

 ## special
+1. Before using this model, you need to modify modeling_mistral.py in the transformers library.
+2. vim /root/anaconda3/envs/train/lib/python3.9/site-packages/transformers/models/mistral/modeling_mistral.py
+3. Find MistralAttention.
+4. In the q, k, v, and o projections, change bias=False to bias=config.attention_bias.
 Before:
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62d7f90b102d144db4b4245b/AKj_fwEoLUKWZ4mViYW-q.png)
 After:
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62d7f90b102d144db4b4245b/A2gSwq9l6Zx8X1qegtgvE.png)
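
The manual edit described above could also be scripted. The sketch below is a minimal illustration, assuming each projection appears in the installed file on a single line of the form `self.q_proj = nn.Linear(..., bias=False)`. The layout of `modeling_mistral.py` varies across transformers versions, so the pattern may not match every release. `patch_attention_bias` and the toy `sample` excerpt are hypothetical names introduced here and are not part of the library. The edit also presumes that the loaded config exposes an `attention_bias` field, as the replacement text implies.

```python
import re

# Hypothetical helper: matches a single-line q/k/v/o projection definition
# that passes bias=False to nn.Linear.
_PROJ_LINE = re.compile(
    r"^(\s*self\.(?:q|k|v|o)_proj\s*=\s*nn\.Linear\(.*)bias=False(.*)$"
)


def patch_attention_bias(source: str) -> str:
    """Return `source` with bias=False replaced in the four attention projections."""
    patched = [
        _PROJ_LINE.sub(r"\1bias=config.attention_bias\2", line)
        for line in source.splitlines()
    ]
    return "\n".join(patched)


# Toy excerpt standing in for the real file (an assumption, not the actual source).
sample = """class MistralAttention(nn.Module):
    def __init__(self, config):
        self.q_proj = nn.Linear(self.hidden_size, self.num_heads * self.head_dim, bias=False)
        self.k_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(self.hidden_size, self.num_key_value_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(self.num_heads * self.head_dim, self.hidden_size, bias=False)
        self.gate = nn.Linear(4, 4, bias=False)
"""

patched_sample = patch_attention_bias(sample)
print(patched_sample)
```

Only lines that assign `q_proj`, `k_proj`, `v_proj`, or `o_proj` are rewritten, so other `nn.Linear` layers keep `bias=False`. To apply this to a real installation, read the file into a string, pass it through the function, and review the result before writing it back.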
+## Differences between qwen2 mistral and qwen2 llamafy
+Compared to qwen2 llamafy, qwen2 mistral can use sliding window attention, runs faster, and handles long contexts better.
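
For readers unfamiliar with the term, sliding window attention restricts each token to attending to a fixed number of the most recent positions instead of the whole prefix. The sketch below builds such a mask in plain Python. It is only an illustration: `sliding_window_mask` is a hypothetical helper, and the boundary convention used here (the window counts the current token) is an assumption that may differ from the transformers implementation.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """mask[i][j] is True when query position i may attend to key position j."""
    return [
        # Causal (j <= i) and within the last `window` positions (i - j < window).
        [j <= i and i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


mask = sliding_window_mask(seq_len=5, window=3)
for row in mask:
    # "x" marks an allowed position, "." a masked one.
    print("".join("x" if allowed else "." for allowed in row))
```

Each row allows at most `window` positions, so the attention work per token stays bounded as the sequence grows, whereas a plain causal mask grows with the prefix length.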
 Usage:
 ```python