	---
	library_name: transformers
	tags:
	- finance
	- chat
	license: apache-2.0
	datasets:
	- sujet-ai/Sujet-Finance-Instruct-177k
	language:
	- en
	base_model:
	- HuggingFaceTB/SmolLM2-360M-Instruct
	---

	# FinChat-XS

	FinChat-XS is a lightweight financial domain language model designed to answer questions about finance, markets, investments, and economics in a conversational style.

	## Model Overview

	FinChat-XS is a fine-tuned version of [HuggingFaceTB/SmolLM2-360M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-360M-Instruct), optimized for financial domain conversations using LoRA (Low-Rank Adaptation). With only 360M parameters, it offers a balance between performance and efficiency, making it accessible for deployment on consumer hardware.

	The model combines professional financial knowledge with a conversational communication style, making it suitable for applications where users need expert financial information delivered in an approachable manner.


	## Repository & Resources

	For full code, training process, and additional details, visit the GitHub repository:

	🔗 [FinLLMOpt Repository](https://github.com/peremartra/FinLLMOpt)

	## How the Model was Created

	FinChat-XS was developed through a focused fine-tuning process designed to enhance financial domain expertise while maintaining conversational abilities:

	1. Base model selection: Started with SmolLM2-360M-Instruct, a lightweight instruction-tuned language model
	2. Dataset preparation:
	   - Filtered the sujet-ai/Sujet-Finance-Instruct-177k dataset to focus on QA and conversational QA examples
	   - Applied length filtering to keep responses below 500 characters
	   - Augmented short conversational QA examples to improve conciseness
	3. Fine-tuning approach:
	   - Applied LoRA (Low-Rank Adaptation) to efficiently fine-tune the model
	   - Targeted key attention modules (`q_proj`, `v_proj`)
	   - Used rank r=4 and alpha=16
	   - Training configuration:
	     - Batch size: 2 (effective batch size 16 with gradient accumulation)
	     - Learning rate: 1.5e-4
	     - BF16 precision
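
The dataset preparation step above (keep QA-style examples, drop long responses) can be sketched in plain Python. This is only an illustration, not the code used to build the model: the field names `task_type` and `answer` and the task labels in `KEPT_TASKS` are assumptions, so check them against the dataset card before reusing the snippet.

```python
# Hypothetical field names and task labels; verify against the dataset schema.
KEPT_TASKS = {"qa", "qa_conversation"}  # assumed labels for QA / conversational QA
MAX_ANSWER_CHARS = 500                  # "responses below 500 characters"


def keep_example(example: dict) -> bool:
    """Return True for QA-style examples whose answer is shorter than the limit."""
    is_qa = example.get("task_type") in KEPT_TASKS
    is_short = len(example.get("answer", "")) < MAX_ANSWER_CHARS
    return is_qa and is_short


def filter_examples(examples: list[dict]) -> list[dict]:
    """Apply keep_example to every record and keep the ones that pass."""
    return [ex for ex in examples if keep_example(ex)]


# Tiny made-up sample, only to show the behaviour of the filter.
sample = [
    {"task_type": "qa", "answer": "A bond is a loan to an issuer."},
    {"task_type": "sentiment_analysis", "answer": "positive"},
    {"task_type": "qa_conversation", "answer": "x" * 600},
]
kept = filter_examples(sample)
print(len(kept))
```

With the `datasets` library, the same predicate could be passed to `Dataset.filter(keep_example)`, provided the column names match.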

	## Challenges
	The primary challenge encountered during the development of FinChat-XS was the lack of high-quality conversational datasets specifically focused on personal finance. While the Sujet-Finance-Instruct-177k dataset provided valuable financial QA examples, there remains a notable gap in naturalistic, multi-turn conversations about personal financial scenarios.

	## Why Use This Model?

	FinChat-XS offers several advantages for specific use cases:

	- Efficient deployment: At only 362MB, it can run on devices with limited resources.
	- Financial domain knowledge: Fine-tuned specifically on financial QA data
	- Balanced communication style: Combines professional financial knowledge with conversational delivery
	- Low deployment cost: Requires significantly less computational resources than larger models
	- Customizable: The LoRA adapter can be mixed with other adapters or further fine-tuned

	Ideal for:
	- Embedded financial assistants in mobile apps
	- Personal financial planning tools
	- Educational applications about finance and investing
	- Customer service automation for financial institutions
	- Quick deployment scenarios where larger models aren't practical

	## How to Use the Model

	### Basic Usage with Transformers

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	# Load model and tokenizer
	model_name = "oopere/FinChat-XS"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

	# Create a conversation
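	messages = [
	    {"role": "user", "content": "What's the difference between stocks and bonds?"}
	]

	# Format the prompt using the chat template; add_generation_prompt=True appends
	# the assistant header so the model answers instead of continuing the user turn
	prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)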

	# Tokenize the prompt
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

	# Generate a response
	outputs = model.generate(
	    **inputs,
	    max_new_tokens=256,
	    temperature=0.7,
	    top_p=0.9,
	    do_sample=True,
	    repetition_penalty=1.2
	)

	# Decode and print the response
	response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
	print(response)
	```
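
To make the chat-template step less opaque, the sketch below renders a conversation by hand. It assumes a ChatML-style layout (`<|im_start|>role ... <|im_end|>`); the template shipped with the tokenizer is authoritative and may differ, for example by inserting a default system message. `render_chatml` is a hypothetical helper, not part of `transformers`. It mainly shows what the `add_generation_prompt` option of `apply_chat_template` is for: it leaves an open assistant turn at the end of the prompt.

```python
def render_chatml(messages: list[dict], add_generation_prompt: bool = True) -> str:
    """Hand-rolled approximation of a ChatML-style chat template (illustrative only)."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so generation starts with the model's reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "user", "content": "What's the difference between stocks and bonds?"}
]

with_prompt = render_chatml(messages, add_generation_prompt=True)
without_prompt = render_chatml(messages, add_generation_prompt=False)
print(with_prompt)
```

In real code, prefer `tokenizer.apply_chat_template`, which always uses the model's own template.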

	### Optimized Inference with 8-bit Quantization

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

	# Configure 8-bit quantization (requires the bitsandbytes package;
	# the bnb_4bit_* options only apply to 4-bit loading, so they are omitted)
	bnb_config = BitsAndBytesConfig(load_in_8bit=True)

	# Load model with quantization
	model = AutoModelForCausalLM.from_pretrained(
	    "oopere/FinChat-XS",
	    quantization_config=bnb_config,
	    device_map="auto"
	)
	tokenizer = AutoTokenizer.from_pretrained("oopere/FinChat-XS")

	# Continue with the same usage pattern as above
	```
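
As a rough guide to what quantization saves, the following back-of-the-envelope sketch estimates the memory taken by the weights alone at different precisions. It uses the nominal 360M parameter count from the model name and ignores activations, the KV cache, and any layers that the quantization backend keeps in higher precision, so real usage will be somewhat higher.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "bfloat16": 2.0, "int8": 1.0}


def weight_memory_mb(num_params: int, dtype: str) -> float:
    """Approximate size of the weights in megabytes (1 MB = 1e6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


NUM_PARAMS = 360_000_000  # nominal figure; the exact count may differ slightly

for dtype in ("float16", "int8"):
    print(f"{dtype}: ~{weight_memory_mb(NUM_PARAMS, dtype):.0f} MB")
# float16: ~720 MB
# int8: ~360 MB
```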

	### Using with LoRA Adapter Only

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	from peft import PeftModel

	# Load base model
	base_model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct")
	tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct")

	# Load LoRA adapter
	peft_model = PeftModel.from_pretrained(base_model, "oopere/qa-adapterFinChat-XS")

	# Continue with the same usage pattern as above
	```
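
To give a sense of how small the adapter is, the sketch below counts LoRA parameters for the configuration described under "How the Model was Created" (rank 4 on `q_proj` and `v_proj`). A LoRA pair on a linear layer of shape `d_in x d_out` adds `r * (d_in + d_out)` parameters. The layer dimensions used here (hidden size 960, a 320-wide `v_proj` output, 32 layers) are assumptions about the base model, not values stated in this card; confirm them in the base model's `config.json`.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Assumed dimensions for the base model; verify against its config.json.
HIDDEN_SIZE = 960   # input width of q_proj and v_proj
Q_OUT = 960         # output width of q_proj
V_OUT = 320         # output width of v_proj (smaller if grouped-query attention is used)
NUM_LAYERS = 32
RANK = 4            # r=4, as stated in this card

per_layer = (
    lora_param_count(HIDDEN_SIZE, Q_OUT, RANK)
    + lora_param_count(HIDDEN_SIZE, V_OUT, RANK)
)
total = per_layer * NUM_LAYERS
print(total)  # 409600 under the assumed dimensions
```

Under these assumptions the adapter holds roughly 0.4M parameters, about a thousandth of the base model.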

	## Limitations & Considerations

	While FinChat-XS performs well in many financial conversation scenarios, users should be aware of these limitations:

	1. Knowledge limitations: The model's knowledge is limited to its training data, and its knowledge cutoff is inherited from the base model (SmolLM2).

	2. Size trade-offs: As a 360M parameter model, it has less capacity than larger models (7B+) and may provide less nuanced or detailed responses on complex topics.

	3. Financial advice disclaimer: The model is not a certified financial advisor and should not be used for making investment decisions. Its responses should be considered educational, not professional financial advice.

	4. Domain boundaries: While focused on finance, the model may struggle with highly specialized financial topics or recent developments not covered in its training data.

	5. Hallucination potential: Like all language models, FinChat-XS may occasionally generate plausible-sounding but incorrect information, especially when asked about specific numerical data or complex financial details.

	6. Style variations: The model balances formal financial knowledge with a conversational style, which may not be appropriate for all professional contexts.

	7. Regulatory compliance: This model has not been specifically audited for compliance with financial regulations in various jurisdictions.

	## Citation

	If you use FinChat-XS in your research or applications, please consider citing it as:

	```
	@misc{oopere2025finchatxs,
	  author = {Martra, P.},
	  title = {FinChat-XS: A Lightweight Financial Domain Chat Language Model},
	  year = {2025},
	  publisher = {Hugging Face},
	  howpublished = {\url{https://huggingface.co/oopere/FinChat-XS}}
	}
	```

	## Acknowledgements

	- [HuggingFaceTB](https://huggingface.co/HuggingFaceTB) for creating the SmolLM2 model series
	- [Sujet AI](https://huggingface.co/sujet-ai) for their financial instruction dataset
	- [Hugging Face](https://huggingface.co/) for providing the infrastructure and tools for model development