Mpt-7B-Assistant

License: Apache v2

Mpt-7B-Assistant is an AI assistant built using Flax/JAX and trained on Cloud TPUs. The model has a context length of 5144 and 7B parameters, making it suitable for a wide range of natural language processing tasks.

Usage

Once you have installed the repository, you can start using the model by importing it into your Python code:

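from transformers import AutoTokenizer, FlaxAutoModelForCausalLM

# Repository id as listed on the model page.
repo_id = "erfanzar/Mpt-7B-Assistant"

# The repository contains custom model code, so trust_remote_code=True may be
# required when loading (an assumption; drop it if loading works without it).
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = FlaxAutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

# <|prompter|> marks the user turn; <|assistant|> opens the model's reply.
prompt = "<|endoftext|><|prompter|>Hello, how are you today?<|endoftext|><|assistant|>"
input_ids = tokenizer.encode(prompt, return_tensors="jax")

# Flax generate() returns an output object; the token ids are in .sequences.
output = model.generate(input_ids, max_length=100)
output_text = tokenizer.decode(output.sequences[0], skip_special_tokens=True)

print(output_text)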

This generates a response from the model based on the input prompt. The max_length parameter of generate() caps the total sequence length, counting the prompt tokens as well as the generated ones, so raise it to allow longer responses.
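The prompt in the example above wraps the user message in special tokens. A small helper can build that string instead of writing it by hand. This is a minimal sketch: the function name `build_prompt` and the `history` argument are hypothetical, and the way earlier turns are joined is an assumption extrapolated from the single-turn example, not documented behavior of the model.

```python
from typing import Sequence, Tuple

# Special tokens copied from the single-turn prompt shown in this card.
EOS = "<|endoftext|>"
PROMPTER = "<|prompter|>"
ASSISTANT = "<|assistant|>"


def build_prompt(user_message: str,
                 history: Sequence[Tuple[str, str]] = ()) -> str:
    """Format a conversation in the prompter/assistant layout.

    `history` holds earlier (user, assistant) turns. Joining earlier turns
    this way is an assumption, not something the model card specifies.
    """
    parts = []
    for past_user, past_assistant in history:
        # A completed turn: user message followed by the assistant's reply.
        parts.append(f"{EOS}{PROMPTER}{past_user}{EOS}{ASSISTANT}{past_assistant}")
    # The open turn ends with the assistant tag so the model writes the reply.
    parts.append(f"{EOS}{PROMPTER}{user_message}{EOS}{ASSISTANT}")
    return "".join(parts)


print(build_prompt("Hello, how are you today?"))
```

With no history, the helper reproduces the prompt string used in the example, so its result can be passed straight to tokenizer.encode().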

Training

The Mpt-7B-Assistant model was trained using the EasyDel and OST-OpenSourceTransformers libraries, which provide an easy-to-use interface for training large language models on Cloud TPUs.

A training run with these libraries trains the model on the specified dataset using Cloud TPUs and saves the trained model to the specified output directory.

License

Mpt-7B-Assistant is licensed under the Apache v2 License. See LICENSE for more information.
