language:
- vi
library_name: transformers
tags:
- LLMs
- NLP
- Vietnamese
license: mit
Model Card for Model ID
Chatbots such as this model can be used to:
- Answer questions: Chatbots can be programmed with a large knowledge base to answer users' questions on a variety of topics. They can provide facts, data, explanations, definitions, and so on.
- Complete tasks: Chatbots can be integrated with other systems and APIs to carry out actions on behalf of users.
- Make recommendations: Based on a user's preferences and past interactions, chatbots can suggest products, services, and content that may be relevant and useful to that user.
- Provide customer service: Chatbots can handle many simple customer service interactions, such as answering questions, handling complaints, and processing returns. This lets human agents focus on more complex issues.
- Generate conversational responses: Using NLP and machine learning, chatbots can understand natural language and generate conversational responses, creating fluent interactions.
Model Details
Model Description
- Model type: Mistral
- Language(s) (NLP): Vietnamese
- Finetuned from model: Viet-Mistral/Vistral-7B-Chat
Purpose
This model is an improvement over the previous version. It ships with a new tokenizer_config.json that uses <|im_start|> and <|im_end|> as additional special tokens.
Training Data
Our dataset was built from our university's student handbook. It covers majors, university regulations, and other information about the university.
hcmue_qa
Training Procedure
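# Define the LoRA configuration (requires the `peft` package)
from peft import LoraConfig

peft_config = LoraConfig(
    r=8,  # rank of the low-rank update matrices
    lora_alpha=16,  # scaling numerator; effective scale is lora_alpha / r = 2
    target_modules=[
        # attention projections
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        # MLP projections
        "gate_proj",
        "up_proj",
        "down_proj",
        # output head
        "lm_head",
    ],
    bias="none",  # bias terms are left untouched
    lora_dropout=0.05,  # conventional value
    task_type="CAUSAL_LM",
)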
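To give a rough sense of how small this adapter is, the sketch below counts the parameters LoRA would add for r=8 across the seven per-layer projections listed above. The layer shapes (hidden size 4096, MLP size 14336, key/value width 1024, 32 decoder layers) are assumptions taken from the standard Mistral-7B architecture and are not stated in this card. `lm_head` is left out because its size depends on the vocabulary, which is also not given here.

```python
# Assumed Mistral-7B shapes (not stated in this card): hidden 4096, MLP 14336, KV width 1024.
HIDDEN, MLP, KV = 4096, 14336, 1024
R, ALPHA, NUM_LAYERS = 8, 16, 32  # r and lora_alpha from the config above; 32 layers is assumed

# (in_features, out_features) for each targeted projection inside one decoder layer.
PROJ_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV),
    "v_proj": (HIDDEN, KV),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}


def lora_params(d_in, d_out, r):
    # LoRA adds A (r x d_in) and B (d_out x r) next to each frozen weight matrix.
    return r * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out, R) for d_in, d_out in PROJ_SHAPES.values())
total_without_lm_head = per_layer * NUM_LAYERS
scaling = ALPHA / R  # factor applied to the low-rank update

print(per_layer, total_without_lm_head, scaling)
```

Under these assumptions the adapter adds about 21 million trainable parameters outside `lm_head`, a small fraction of a 7B-parameter base model.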
# Updated chat template (excerpt from tokenizer_config.json)
"chat_template": "{% for message in messages %}{{'<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>' + '\n'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant\n' }}{% endif %}",
"clean_up_tokenization_spaces": false,
"eos_token": "<|im_end|>",
"legacy": true,
"model_max_length": 1000000000000000019884624838656,
"pad_token": "<unk>",
"sp_model_kwargs": {},
"spaces_between_special_tokens": false,
"tokenizer_class": "LlamaTokenizer",
"unk_token": "<unk>",
"use_default_system_prompt": false,
"use_fast": true
}
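To make the chat template concrete, here is a minimal pure-Python sketch of the string it renders. `render_chatml` is a hypothetical helper written only for illustration and the example messages are made up. With the actual tokenizer loaded, one would normally rely on `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` rather than formatting by hand.

```python
def render_chatml(messages, add_generation_prompt=False):
    # Mirror the Jinja template: each message becomes
    # <|im_start|>{role} newline {content}<|im_end|> newline
    parts = []
    for message in messages:
        parts.append(
            "<|im_start|>" + message["role"] + "\n" + message["content"] + "<|im_end|>" + "\n"
        )
    # Optionally open an assistant turn so the model continues from there.
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Made-up example conversation.
messages = [
    {"role": "system", "content": "You are a helpful assistant for university students."},
    {"role": "user", "content": "Which majors does the university offer?"},
]

prompt = render_chatml(messages, add_generation_prompt=True)
print(prompt)
```

Because `eos_token` is set to `<|im_end|>`, generation is expected to stop once the model closes its own assistant turn.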