---
library_name: transformers
datasets:
- web_questions
metrics:
- perplexity
---
# Model Card for Geerath/google-gemma-7b-it-finetuned-web-questions
This model card corresponds to the 7B instruction-tuned version of the Gemma model, fine-tuned on the web_questions dataset.
## Model Details
This is a general question-answering model fine-tuned on the web_questions dataset.
### Model Description
This is a general question-answering LLM, fine-tuned from Gemma on the web_questions dataset.
Gemma is a family of lightweight, state-of-the-art open models from Google,
built from the same research and technology used to create the Gemini models.
They are text-to-text, decoder-only large language models, available in English,
with open weights, pre-trained variants, and instruction-tuned variants. Gemma
models are well-suited for a variety of text generation tasks, including
question answering, summarization, and reasoning. Their relatively small size
makes it possible to deploy them in environments with limited resources such as
a laptop, desktop or your own cloud infrastructure, democratizing access to
state-of-the-art AI models and helping foster innovation for everyone.
- **Developed by:** Geerath Bhat
- **Model type:** Fine-tuned Instruct LLM.
- **Language(s) (NLP):** English
- **License:** Not specified (the base model is released under Google's Gemma Terms of Use)
- **Finetuned from model:** [google/gemma-7b-it](https://huggingface.co/google/gemma-7b-it)
### Usage
Google has shared code snippets for getting started with Gemma. First make sure to `pip install -U transformers` (the snippet below also uses `bitsandbytes` and `accelerate` for 4-bit loading), then adapt the snippet to this fine-tuned checkpoint:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

hf_model_repo = "Geerath/google-gemma-7b-it-finetuned-web-questions"

# 4-bit quantization config (assumed NF4 settings; not documented in the original card)
bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_quant_type="nf4",
                                bnb_4bit_compute_dtype=torch.float16)

# Get the tokenizer
tokenizer = AutoTokenizer.from_pretrained(hf_model_repo)

# Load the model
model = AutoModelForCausalLM.from_pretrained(hf_model_repo,
                                             quantization_config=bnb_config,
                                             device_map="auto")

prompt = ["Question: Tell me something about IISc\n\nAnswer:\n"]

# Generate a response
input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.to(model.device)
outputs = model.generate(input_ids=input_ids,
                         max_new_tokens=200,
                         do_sample=True,
                         temperature=0.2)
result = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
result = "Question:" + result.split("Question:")[1]

# Print the result
print(f"Generated response:\n{result}")
```
#### Fine-tuning the model
You can find fine-tuning scripts and a notebook under the [`examples/` directory](https://huggingface.co/google/gemma-7b/tree/main/examples) of the [`google/gemma-7b`](https://huggingface.co/google/gemma-7b) repository. To adapt them to this model, simply change the model id to `google/gemma-7b-it`.
That repository provides:
* A script to perform Supervised Fine-Tuning (SFT) on the UltraChat dataset using QLoRA (a minimal sketch of the adapter side of that setup follows this list)
* A script to perform SFT using FSDP on TPU devices
* A notebook that you can run on a free-tier Google Colab instance to perform SFT on an English quotes dataset
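For orientation, here is a hedged sketch of the adapter configuration a QLoRA setup pairs with the 4-bit `BitsAndBytesConfig` shown in the usage snippet above. The `r`, `lora_alpha`, dropout, and target-module values are illustrative assumptions, not this model's documented settings:

```python
from peft import LoraConfig

# Low-rank adapters trained on top of the frozen, 4-bit quantized base model.
# All hyperparameters below are illustrative assumptions.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```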
## How to Get Started with the Model
Use the usage snippet above, or the starter code from the [google/gemma-7b-it](https://huggingface.co/google/gemma-7b-it) model card with the model id swapped to this repository, to get started with this fine-tuned model.
## Training Details
### Training Data
The [web_questions](https://huggingface.co/datasets/web_questions) dataset.
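The exact preprocessing is not documented here. As a hedged illustration, the dataset can be loaded with the `datasets` library and flattened into the Question/Answer template that the usage snippet above assumes; the template and the choice of the first answer are assumptions:

```python
from datasets import load_dataset

# web_questions provides "question" and "answers" (a list of strings) per example.
dataset = load_dataset("web_questions")

def to_prompt(example):
    # Assumed template, mirroring the "Question: ...\n\nAnswer:" format used in
    # the usage snippet; taking the first answer is also an assumption.
    answer = example["answers"][0] if example["answers"] else ""
    return {"text": f"Question: {example['question']}\n\nAnswer:\n{answer}"}

train_data = dataset["train"].map(to_prompt)
eval_data = dataset["test"].map(to_prompt)
print(train_data[0]["text"])
```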
### Training Procedure
Trained using `SFTTrainer` with the following `TrainingArguments`:
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",            # assumed; not specified in the original card
    num_train_epochs=1,              # adjust based on the data size
    per_device_train_batch_size=4,   # use 2 if you have less GPU RAM
    per_device_eval_batch_size=4,
    optim="paged_adamw_32bit",
    # gradient_accumulation_steps=2,
    save_strategy="epoch",
    evaluation_strategy="epoch",
    learning_rate=2e-4,
    logging_steps=1,
    fp16=True,
    weight_decay=0.01,
    lr_scheduler_type="cosine",
    seed=42,
)
```
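Wiring these pieces together with trl's `SFTTrainer` might look roughly like this; argument names have shifted across trl versions (newer releases move `dataset_text_field` and `max_seq_length` into `SFTConfig`), so treat it as a sketch rather than the exact training script:

```python
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,               # 4-bit quantized Gemma base, loaded as in the usage snippet
    args=training_args,
    train_dataset=train_data,  # prompt-formatted splits from the data sketch above
    eval_dataset=eval_data,
    peft_config=lora_config,   # LoRA adapters from the QLoRA sketch above
    dataset_text_field="text",
    max_seq_length=512,        # illustrative cap, not a documented value
)
trainer.train()
```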
## Evaluation
Evaluated on the test split of the web_questions dataset.
#### Testing Data
Currently evaluated on the test split of the web_questions dataset; results on other datasets will be added later.
#### Metrics
- Perplexity
- Accuracy
- F1 Score
### Results
After 2 epochs, the training loss was 1.114500 and the validation loss was 1.592121.
Perplexity on the web_questions test split: 5.13
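Since perplexity is the exponential of the mean cross-entropy loss, it can be read directly off the evaluation loss. A minimal check, assuming the `trainer` from the sketch above:

```python
import math

# Perplexity = exp(mean cross-entropy loss) over the evaluation split.
eval_metrics = trainer.evaluate()
perplexity = math.exp(eval_metrics["eval_loss"])
print(f"Perplexity: {perplexity:.2f}")

# Sanity check: exp(1.592121) ≈ 4.91 for the validation loss above;
# the 5.13 figure is reported on the separate test split.
```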