---
datasets:
- Omartificial-Intelligence-Space/Arabic-finanical-rag-embedding-dataset
language:
- ar
base_model:
- ybelkada/falcon-7b-sharded-bf16
pipeline_tag: text-generation
library_name: transformers
tags:
- finance
---
# Model: FalconMasr
FalconMasr is based on Falcon-7B, loaded in 4-bit quantized form for memory-efficient use and fine-tuned with LoRA (Low-Rank Adaptation) for causal language modeling in Arabic, with the goal of improving the quality of the model's Arabic responses.
## Model Configuration
- **Base Model**: `ybelkada/falcon-7b-sharded-bf16`
- **Quantization**: 4-bit with `nf4` quantization type and `float16` computation
- **LoRA Configuration**: `lora_alpha=16`, `lora_dropout=0`, `r=64`
- **Task Type**: Causal Language Modeling
- **Target Modules**: `query_key_value`, `dense`, `dense_h_to_4h`, `dense_4h_to_h`
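The configuration above maps directly onto `bitsandbytes` and `peft`. Below is a minimal sketch of how the quantized base model and the LoRA adapter might be set up; all values are taken from the list above except `bias="none"`, which is an assumption not stated in this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization with float16 compute, as listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Load the sharded Falcon-7B base model in 4-bit
base_model = AutoModelForCausalLM.from_pretrained(
    "ybelkada/falcon-7b-sharded-bf16",
    quantization_config=bnb_config,
    trust_remote_code=True,
)
base_model = prepare_model_for_kbit_training(base_model)

tokenizer = AutoTokenizer.from_pretrained(
    "ybelkada/falcon-7b-sharded-bf16", trust_remote_code=True
)
tokenizer.pad_token = tokenizer.eos_token

# LoRA adapter targeting Falcon's attention and MLP projections
lora_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0,
    r=64,
    bias="none",  # assumption, not stated in the card
    task_type="CAUSAL_LM",
    target_modules=["query_key_value", "dense", "dense_h_to_4h", "dense_4h_to_h"],
)
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()
```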
## Training
The model was fine-tuned on a custom Arabic text dataset; the training loss decreased progressively over 400 steps, as shown in the table below:
| Step | Training Loss |
|------|---------------|
| 10 | 1.459100 |
| 20 | 1.335000 |
| 30 | 1.295600 |
| 40 | 1.177000 |
| 50 | 1.144900 |
| 60 | 1.132900 |
| 70 | 1.074500 |
| 80 | 1.078600 |
| 90 | 1.121100 |
| 100 | 0.936000 |
| 110 | 1.151500 |
| 120 | 1.068000 |
| 130 | 1.056700 |
| 140 | 0.976900 |
| 150 | 0.867300 |
| 160 | 1.151100 |
| 170 | 1.023200 |
| 180 | 1.074300 |
| 190 | 1.036800 |
| 200 | 0.930700 |
| 210 | 0.960800 |
| 220 | 1.098800 |
| 230 | 0.967400 |
| 240 | 0.961700 |
| 250 | 0.871100 |
| 260 | 0.869400 |
| 270 | 0.939500 |
| 280 | 1.087600 |
| 290 | 1.080700 |
| 300 | 0.906800 |
| 310 | 0.901600 |
| 320 | 0.943200 |
| 330 | 0.968900 |
| 340 | 0.986600 |
| 350 | 1.014200 |
| 360 | 1.191700 |
| 370 | 0.992500 |
| 380 | 0.963600 |
| 390 | 0.888800 |
| 400 | 0.746000 |
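For reference, here is a rough sketch of what a QLoRA fine-tuning run with this configuration could look like, continuing from the adapter setup above and using `trl`'s `SFTTrainer` (argument names differ between `trl` versions). `max_steps=400` and `logging_steps=10` are inferred from the table; the batch size, learning rate, sequence length, and dataset text column are assumptions, not values reported in this card.

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# Dataset listed in the card metadata; the "train" split and text column are assumptions
dataset = load_dataset(
    "Omartificial-Intelligence-Space/Arabic-finanical-rag-embedding-dataset",
    split="train",
)

training_args = TrainingArguments(
    output_dir="falconmasr-qlora",
    per_device_train_batch_size=4,   # assumption
    gradient_accumulation_steps=4,   # assumption
    learning_rate=2e-4,              # assumption
    max_steps=400,                   # last step reported in the table
    logging_steps=10,                # matches the table's 10-step interval
    fp16=True,
)

trainer = SFTTrainer(
    model=peft_model,                # quantized base model + LoRA adapter from the sketch above
    train_dataset=dataset,
    args=training_args,
    tokenizer=tokenizer,             # base-model tokenizer with pad_token = eos_token
    dataset_text_field="text",       # assumption: name of the text column
    max_seq_length=512,              # assumption
)
trainer.train()
```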
## Usage
To use this model, load it with the following configuration:
```python
import torch
import warnings
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

warnings.filterwarnings("ignore", category=FutureWarning)

# Model configuration: 4-bit NF4 quantization with float16 compute
model_name = "MahmoudIbrahim/FalconMasr"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    trust_remote_code=True,
    low_cpu_mem_usage=True,
)
model.config.use_cache = False

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

# Example question (Arabic): "How does American Express's integrated payments
# platform differ from bank card networks?"
input_text = "كيف تختلف منصة المدفوعات المتكاملة لشركة أمريكان إكسبريس عن شبكات البطاقات المصرفية؟"

# Tokenize the input and move it to the same device as the model
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
# Falcon's generate() does not expect token_type_ids, so drop them if present
inputs.pop("token_type_ids", None)

# Generate and decode the output
output = model.generate(
    **inputs,
    max_length=200,
    use_cache=False,
    pad_token_id=tokenizer.eos_token_id,
)
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
print(decoded_output)
```
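Alternatively, the loaded model and tokenizer can be wrapped in a `text-generation` pipeline. The prompt below is a hypothetical example question, not one taken from the training data.

```python
from transformers import pipeline

# Reuse the quantized model and tokenizer loaded above
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Hypothetical Arabic prompt: "What are the most important financial performance indicators for companies?"
prompt = "ما هي أهم مؤشرات الأداء المالي للشركات؟"
result = generator(prompt, max_length=200, pad_token_id=tokenizer.eos_token_id)
print(result[0]["generated_text"])
```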