|
--- |
|
license: apache-2.0 |
|
--- |
|
|
|
# haijian06/Yi-1.5-6B-Chat-Agent_sft |
|
|
|
## Overview |
|
|
|
The `haijian06/Yi-1.5-6B-Chat-Agent_sft` model is an advanced conversational agent built upon the Yi-1.5-6B-Chat model. This model has been fine-tuned to enhance its capabilities in handling agent tasks and function calls, making it a versatile tool for a variety of applications. |
|
|
|
## Features |
|
|
|
- **Improved Conversational Abilities**: Enhanced dialogue management and natural language understanding. |
|
- **Function Call Capability**: Supports complex function call operations, making it suitable for automation and task handling. |
|
- **High Performance**: Optimized for speed and accuracy in responses. |
|
|
|
## Installation |
|
|
|
To use this model, you need to have Python and the necessary libraries installed. You can install the required dependencies using the following commands: |
|
|
|
```bash |
|
pip install torch transformers |
|
``` |
|
|
|
## Usage |
|
|
|
Here is a basic example of how to use the `haijian06/Yi-1.5-6B-Chat-Agent_sft` model: |
|
|
|
```python |
|
import torch |
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
|
# Load the model and tokenizer |
|
model_name = "haijian06/Yi-1.5-6B-Chat-Agent_sft" |
|
tokenizer = AutoTokenizer.from_pretrained(model_name) |
|
model = AutoModelForCausalLM.from_pretrained(model_name) |
|
|
|
# Generate a response |
|
input_text = "Hello, how can I assist you today?" |
|
input_ids = tokenizer.encode(input_text, return_tensors='pt') |
|
|
|
with torch.no_grad(): |
|
output = model.generate(input_ids, max_length=50) |
|
|
|
response = tokenizer.decode(output[0], skip_special_tokens=True) |
|
print(response) |
|
``` |
|
|
|
## Fine-Tuning |
|
|
|
To fine-tune this model on your own dataset, follow these steps: |
|
|
|
1. Prepare your dataset in a suitable format. |
|
2. Use the `Trainer` class from the `transformers` library for training. |
|
|
|
Example training script: |
|
|
|
```python |
|
from transformers import Trainer, TrainingArguments |
|
|
|
training_args = TrainingArguments( |
|
output_dir='./results', |
|
num_train_epochs=3, |
|
per_device_train_batch_size=4, |
|
per_device_eval_batch_size=4, |
|
warmup_steps=500, |
|
weight_decay=0.01, |
|
logging_dir='./logs', |
|
) |
|
|
|
trainer = Trainer( |
|
model=model, |
|
args=training_args, |
|
train_dataset=train_dataset, |
|
eval_dataset=eval_dataset |
|
) |
|
|
|
trainer.train() |
|
``` |
|
|
|
## Contributing |
|
|
|
Contributions are welcome! Please fork this repository and submit a pull request with your improvements. |
|
|
|
## License |
|
|
|
This work is a derivative of Yi-1.5-6B by 01.AI, used under the Apache 2.0 License. |
|
|
|
|
|
## Acknowledgements |
|
|
|
This model is built upon the Yi-1.5-6B-Chat model. Special thanks to the developers and contributors of the original model. |
|
|
|
--- |
|
|
|
For more information, please visit our [GitHub repository](https://github.com/haijian06/Yi-1.5-6B-Chat-Agent_sft). |
|
|