---
library_name: transformers
tags:
- finance
- chat
license: apache-2.0
datasets:
- sujet-ai/Sujet-Finance-Instruct-177k
language:
- en
base_model:
- HuggingFaceTB/SmolLM2-360M-Instruct
---

# FinChat-XS

FinChat-XS is a lightweight financial-domain language model designed to answer questions about finance, markets, investments, and economics in a conversational style.

## Model Overview

FinChat-XS is a fine-tuned version of [HuggingFaceTB/SmolLM2-360M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-360M-Instruct), optimized for financial-domain conversations using LoRA (Low-Rank Adaptation). With only 360M parameters, it balances performance and efficiency and can be deployed on consumer hardware.

The model combines professional financial knowledge with a conversational communication style, making it suitable for applications where users need expert financial information delivered in an approachable manner.

## Repository & Resources

For full code, training process, and additional details, visit the GitHub repository:

🔗 [FinLLMOpt Repository](https://github.com/peremartra/FinLLMOpt)

## How the Model Was Created

FinChat-XS was developed through a focused fine-tuning process designed to enhance financial domain expertise while maintaining conversational abilities:

1. **Base model selection**: Started with SmolLM2-360M-Instruct, a lightweight instruction-tuned language model
2. **Dataset preparation**:
   - Filtered the sujet-ai/Sujet-Finance-Instruct-177k dataset to focus on QA and conversational QA examples
   - Applied length filtering to keep responses below 500 characters
   - Augmented short conversational QA examples to improve conciseness
3. **Fine-tuning approach**:
   - Applied LoRA (Low-Rank Adaptation) to efficiently fine-tune the model
   - Targeted key attention modules (`q_proj`, `v_proj`)
   - Used rank `r=4` and `alpha=16`
   - Training configuration:
     - Batch size: 2 (effective batch size 16 with gradient accumulation)
     - Learning rate: 1.5e-4
     - BF16 precision
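
The dataset preparation step can be sketched in plain Python. This is an illustration, not the author's preprocessing script: the field names `task_type` and `answer` and the task labels `qa` and `qa_conversation` are assumptions about the dataset schema. Only the 500-character limit comes from the description above.

```python
# Hypothetical sketch of the filtering step.
# Field names and task labels are assumptions, not confirmed by the model card.
MAX_RESPONSE_CHARS = 500
KEPT_TASKS = {"qa", "qa_conversation"}  # assumed labels for QA and conversational QA


def keep_example(example: dict) -> bool:
    """Return True for QA-style examples whose answer is below the length limit."""
    is_qa = example.get("task_type") in KEPT_TASKS
    is_short = len(example.get("answer", "")) < MAX_RESPONSE_CHARS
    return is_qa and is_short


def filter_examples(examples: list[dict]) -> list[dict]:
    """Keep only the records that pass keep_example."""
    return [ex for ex in examples if keep_example(ex)]


sample = [
    {"task_type": "qa", "answer": "A bond is a loan to an issuer."},
    {"task_type": "sentiment_analysis", "answer": "positive"},
    {"task_type": "qa_conversation", "answer": "x" * 600},
]
filtered = filter_examples(sample)
print(len(filtered))  # only the first record passes both checks
```

With the Hugging Face `datasets` library, a predicate like `keep_example` could be passed to `Dataset.filter` to apply the same rule to the full dataset.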

## Challenges

The primary challenge encountered during the development of FinChat-XS was the lack of high-quality conversational datasets specifically focused on personal finance. While the Sujet-Finance-Instruct-177k dataset provided valuable financial QA examples, there remains a notable gap in naturalistic, multi-turn conversations about personal financial scenarios.

## Why Use This Model?

FinChat-XS offers several advantages for specific use cases:

- **Efficient deployment**: At only 362MB, it can run on devices with limited resources
- **Financial domain knowledge**: Fine-tuned specifically on financial QA data
- **Balanced communication style**: Combines professional financial knowledge with conversational delivery
- **Low deployment cost**: Requires significantly fewer computational resources than larger models
- **Customizable**: The LoRA adapter can be mixed with other adapters or further fine-tuned

Ideal for:

- Embedded financial assistants in mobile apps
- Personal financial planning tools
- Educational applications about finance and investing
- Customer service automation for financial institutions
- Quick deployment scenarios where larger models aren't practical

## How to Use the Model

### Basic Usage with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "oopere/FinChat-XS"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

# Create a conversation
messages = [
    {"role": "user", "content": "What's the difference between stocks and bonds?"}
]

# Format the prompt using the chat template.
# add_generation_prompt=True appends the assistant header so the model
# answers the user instead of continuing the user's turn.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Tokenize the prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate a response
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.2,
)

# Decode only the newly generated tokens and print the response
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)
```
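
To make the role of the chat template concrete, here is a rough pure-Python approximation of a ChatML-style prompt. It assumes the model keeps a ChatML-like format similar to its SmolLM2 base; the real template ships with the tokenizer and may differ (for example, it may insert a default system message). The helper `format_chatml` is hypothetical and exists only for illustration. It shows why passing `add_generation_prompt=True` to `apply_chat_template` matters: the prompt should end with an open assistant turn.

```python
# Hypothetical approximation of a ChatML-style chat template.
# The actual template is defined by the tokenizer and may differ.
def format_chatml(messages: list[dict], add_generation_prompt: bool = True) -> str:
    """Render messages as ChatML turns, optionally opening an assistant turn."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so generation starts the answer here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [{"role": "user", "content": "What is an index fund?"}]
print(format_chatml(messages))
```

In practice, prefer `tokenizer.apply_chat_template`, which always uses the template actually stored with the model.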

### Optimized Inference with 8-bit Quantization

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Configure 8-bit quantization.
# Requires the bitsandbytes package and, typically, a CUDA-capable GPU.
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

# Load model with quantization
model = AutoModelForCausalLM.from_pretrained(
    "oopere/FinChat-XS",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("oopere/FinChat-XS")

# Continue with the same usage pattern as above
```
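
As a back-of-the-envelope check on what quantization saves, the sketch below estimates weight memory at different precisions. It assumes the nominal 360M parameter count from the model name and counts weights only; activations, the KV cache, and any layers the quantizer keeps in higher precision are ignored, so real usage will be higher.

```python
# Rough weight-memory estimate; the parameter count is the nominal figure
# from the model name, and only weights are counted.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}
NUM_PARAMS = 360_000_000


def weight_memory_mb(num_params: int, dtype: str) -> float:
    """Approximate weight storage in megabytes (1 MB = 1e6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_mb(NUM_PARAMS, dtype):.0f} MB")
```

Under these assumptions, 8-bit loading roughly halves weight memory relative to float16.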

### Using with LoRA Adapter Only

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model and its tokenizer
base_model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct")
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct")

# Load the LoRA adapter on top of the base model
peft_model = PeftModel.from_pretrained(base_model, "oopere/qa-adapterFinChat-XS")

# Continue with the same usage pattern as above, calling generate() on peft_model
```
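
The adapter is small because LoRA trains only two low-rank matrices per targeted projection. The sketch below estimates the adapter's parameter count for the configuration described earlier (`r=4`, `alpha=16`, targeting `q_proj` and `v_proj`). The layer dimensions are assumptions about the SmolLM2-360M architecture and should be checked against the model's `config.json`; the helper `lora_params` is hypothetical.

```python
# Estimate of LoRA adapter size. R and ALPHA come from the model card;
# the architecture dimensions below are assumptions to verify in config.json.
R, ALPHA = 4, 16
HIDDEN_SIZE = 960   # assumed input width of q_proj and v_proj
NUM_LAYERS = 32     # assumed number of transformer layers
Q_OUT = 960         # assumed q_proj output width
V_OUT = 320         # assumed v_proj output width (grouped-query attention)


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """LoRA adds A (r x d_in) and B (d_out x r) per adapted weight matrix."""
    return r * (d_in + d_out)


per_layer = lora_params(HIDDEN_SIZE, Q_OUT, R) + lora_params(HIDDEN_SIZE, V_OUT, R)
total = per_layer * NUM_LAYERS
scaling = ALPHA / R  # standard LoRA scaling applied to the update B @ A

print(f"trainable adapter parameters: {total:,}")
print(f"LoRA scaling factor: {scaling}")
```

If the assumed dimensions hold, the adapter accounts for well under 1% of the base model's parameters, which is why it can be distributed and swapped independently of the base weights.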

## Limitations & Considerations

While FinChat-XS performs well in many financial conversation scenarios, users should be aware of these limitations:

1. **Knowledge limitations**: The model's knowledge is limited to its training data, and it inherits the knowledge cutoff of the base model (SmolLM2).

2. **Size trade-offs**: As a 360M parameter model, it has less capacity than larger models (7B+) and may provide less nuanced or detailed responses on complex topics.

3. **Financial advice disclaimer**: The model is not a certified financial advisor and should not be used for making investment decisions. Its responses should be considered educational, not professional financial advice.

4. **Domain boundaries**: While focused on finance, the model may struggle with highly specialized financial topics or recent developments not covered in its training data.

5. **Hallucination potential**: Like all language models, FinChat-XS may occasionally generate plausible-sounding but incorrect information, especially when asked about specific numerical data or complex financial details.

6. **Style variations**: The model balances formal financial knowledge with a conversational style, which may not be appropriate for all professional contexts.

7. **Regulatory compliance**: This model has not been specifically audited for compliance with financial regulations in various jurisdictions.

## Citation

If you use FinChat-XS in your research or applications, please consider citing it as:

```
@misc{oopere2025finchatxs,
  author = {Martra, P.},
  title = {FinChat-XS: A Lightweight Financial Domain Chat Language Model},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/oopere/FinChat-XS}}
}
```

## Acknowledgements

- [HuggingFaceTB](https://huggingface.co/HuggingFaceTB) for creating the SmolLM2 model series
- [Sujet AI](https://huggingface.co/sujet-ai) for their financial instruction dataset
- [Hugging Face](https://huggingface.co/) for providing the infrastructure and tools for model development