---
library_name: transformers
datasets:
- web_questions
metrics:
- perplexity
---
# Model Card for Geerath/google-gemma-7b-it-finetuned-web-questions
This model card corresponds to the 7B instruction-tuned version of the Gemma model, fine-tuned on the web_questions dataset.
## Model Details
This is a general question-answering model fine-tuned on the web_questions dataset.
### Model Description
This is a general question-answering LLM, fine-tuned from Gemma on the web_questions dataset.
Gemma is a family of lightweight, state-of-the-art open models from Google,
built from the same research and technology used to create the Gemini models.
They are text-to-text, decoder-only large language models, available in English,
with open weights, pre-trained variants, and instruction-tuned variants. Gemma
models are well-suited for a variety of text generation tasks, including
question answering, summarization, and reasoning. Their relatively small size
makes it possible to deploy them in environments with limited resources such as
a laptop, desktop or your own cloud infrastructure, democratizing access to
state-of-the-art AI models and helping foster innovation for everyone.
- **Developed by:** Geerath Bhat
- **Model type:** Fine-tuned Instruct LLM.
- **Language(s) (NLP):** English
- **License:** Not specified (the base model is released under Google's Gemma Terms of Use)
- **Finetuned from model:** [google/gemma-7b-it](https://huggingface.co/google/gemma-7b-it)
### Usage
Google has shared code snippets for getting started with Gemma. First make sure to `pip install -U transformers` (the snippet below also uses `bitsandbytes` and `accelerate` for 4-bit loading), then adapt the snippet to this fine-tuned checkpoint:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

hf_model_repo = "Geerath/google-gemma-7b-it-finetuned-web-questions"

# 4-bit quantization config (assumed NF4 settings; not documented in the original card)
bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_quant_type="nf4",
                                bnb_4bit_compute_dtype=torch.float16)

# Get the tokenizer
tokenizer = AutoTokenizer.from_pretrained(hf_model_repo)

# Load the model
model = AutoModelForCausalLM.from_pretrained(hf_model_repo,
                                             quantization_config=bnb_config,
                                             device_map="auto")

prompt = ["Question: Tell me something about IISc\n\nAnswer:\n"]

# Generate a response
input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.to(model.device)
outputs = model.generate(input_ids=input_ids,
                         max_new_tokens=200,
                         do_sample=True,
                         temperature=0.2)
result = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
result = "Question:" + result.split("Question:")[1]

# Print the result
print(f"Generated response:\n{result}")
```
#### Fine-tuning the model
You can find fine-tuning scripts and a notebook under the [`examples/` directory](https://huggingface.co/google/gemma-7b/tree/main/examples) of the [`google/gemma-7b`](https://huggingface.co/google/gemma-7b) repository. To adapt them to this model, simply change the model id to `google/gemma-7b-it`.
That repository provides:
* A script to perform Supervised Fine-Tuning (SFT) on the UltraChat dataset using QLoRA (a minimal sketch of the adapter side of that setup follows this list)
* A script to perform SFT using FSDP on TPU devices
* A notebook that you can run on a free-tier Google Colab instance to perform SFT on an English quotes dataset
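For orientation, here is a hedged sketch of the adapter configuration a QLoRA setup pairs with the 4-bit `BitsAndBytesConfig` shown in the usage snippet above. The `r`, `lora_alpha`, dropout, and target-module values are illustrative assumptions, not this model's documented settings:

```python
from peft import LoraConfig

# Low-rank adapters trained on top of the frozen, 4-bit quantized base model.
# All hyperparameters below are illustrative assumptions.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```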
## How to Get Started with the Model
Use the usage snippet above, or the starter code from the [google/gemma-7b-it](https://huggingface.co/google/gemma-7b-it) model card with the model id swapped to this repository, to get started with this fine-tuned model.
## Training Details
### Training Data
The [web_questions](https://huggingface.co/datasets/web_questions) dataset.
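The exact preprocessing is not documented here. As a hedged illustration, the dataset can be loaded with the `datasets` library and flattened into the Question/Answer template that the usage snippet above assumes; the template and the choice of the first answer are assumptions:

```python
from datasets import load_dataset

# web_questions provides "question" and "answers" (a list of strings) per example.
dataset = load_dataset("web_questions")

def to_prompt(example):
    # Assumed template, mirroring the "Question: ...\n\nAnswer:" format used in
    # the usage snippet; taking the first answer is also an assumption.
    answer = example["answers"][0] if example["answers"] else ""
    return {"text": f"Question: {example['question']}\n\nAnswer:\n{answer}"}

train_data = dataset["train"].map(to_prompt)
eval_data = dataset["test"].map(to_prompt)
print(train_data[0]["text"])
```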
### Training Procedure
Trained using `SFTTrainer` with the following `TrainingArguments`:
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",            # assumed; not specified in the original card
    num_train_epochs=1,              # adjust based on the data size
    per_device_train_batch_size=4,   # use 2 if you have less GPU RAM
    per_device_eval_batch_size=4,
    optim="paged_adamw_32bit",
    # gradient_accumulation_steps=2,
    save_strategy="epoch",
    evaluation_strategy="epoch",
    learning_rate=2e-4,
    logging_steps=1,
    fp16=True,
    weight_decay=0.01,
    lr_scheduler_type="cosine",
    seed=42,
)
```
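Wiring these pieces together with trl's `SFTTrainer` might look roughly like this; argument names have shifted across trl versions (newer releases move `dataset_text_field` and `max_seq_length` into `SFTConfig`), so treat it as a sketch rather than the exact training script:

```python
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,               # 4-bit quantized Gemma base, loaded as in the usage snippet
    args=training_args,
    train_dataset=train_data,  # prompt-formatted splits from the data sketch above
    eval_dataset=eval_data,
    peft_config=lora_config,   # LoRA adapters from the QLoRA sketch above
    dataset_text_field="text",
    max_seq_length=512,        # illustrative cap, not a documented value
)
trainer.train()
```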
## Evaluation
Evaluated on the test split of the web_questions dataset.
#### Testing Data
Currently evaluated on the test split of the web_questions dataset; results on other datasets will be added later.
#### Metrics
- Perplexity
- Accuracy
- F1 Score
### Results
After 2 epochs, the training loss was 1.114500 and the validation loss was 1.592121.
Perplexity on the web_questions test split: 5.13
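Since perplexity is the exponential of the mean cross-entropy loss, it can be read directly off the evaluation loss. A minimal check, assuming the `trainer` from the sketch above:

```python
import math

# Perplexity = exp(mean cross-entropy loss) over the evaluation split.
eval_metrics = trainer.evaluate()
perplexity = math.exp(eval_metrics["eval_loss"])
print(f"Perplexity: {perplexity:.2f}")

# Sanity check: exp(1.592121) ≈ 4.91 for the validation loss above;
# the 5.13 figure is reported on the separate test split.
```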