	---
	language:
	- en
	library_name: transformers
	---

	# orca_mini_v3_13b

	A Llama2-13b model trained on Orca-style datasets.

	I am actively seeking sponsorship and partnership opportunities. If you're interested, please connect with me at www.linkedin.com/in/pankajam.

	## Evaluation

	We evaluated orca_mini_v3_13b on a wide range of tasks using the [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) from EleutherAI.

	Here are the results on the metrics used by the [HuggingFaceH4 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard):

	|Task|Metric|Value|Stderr|
	|:------:|:--------:|:-------:|:--------:|
	|arc_challenge|acc_norm|0.6314|0.0141|
	|hellaswag|acc_norm|0.8242|0.0038|
	|mmlu|acc_norm|0.5637|0.0351|
	|truthfulqa_mc|mc2|0.5127|0.0157|
	|Total Average|-|0.6329877193||
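
As a quick sanity check, the total can be recomputed as the unweighted mean of the four scores in the table. This is a minimal sketch: the variable names are illustrative, and because the inputs are the rounded four-decimal values, the result may differ from the reported total in the trailing digits.

```python
# Rounded scores copied from the results table (variable name is illustrative).
scores = {
    "arc_challenge": 0.6314,
    "hellaswag": 0.8242,
    "mmlu": 0.5637,
    "truthfulqa_mc": 0.5127,
}

# Unweighted mean across the four benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.4f}")
```

The recomputed mean comes to about 0.633, in line with the reported total.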


	## Example Usage

	Here is the prompt format:

	```
	### System:
	You are an AI assistant that follows instruction extremely well. Help as much as you can.

	### User:
	Tell me about Orcas.

	### Assistant:

	```
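
The template can be assembled programmatically. The sketch below is one way to do it; `build_prompt` and `DEFAULT_SYSTEM` are hypothetical names, not part of any library, and it assumes the line breaks in the template above should be reproduced exactly.

```python
# Default system message, copied from the prompt format above.
DEFAULT_SYSTEM = (
    "You are an AI assistant that follows instruction extremely well. "
    "Help as much as you can."
)


def build_prompt(instruction: str, system: str = DEFAULT_SYSTEM) -> str:
    """Assemble a prompt in the System / User / Assistant layout."""
    return (
        f"### System:\n{system}\n\n"
        f"### User:\n{instruction}\n\n"
        "### Assistant:\n"
    )


print(build_prompt("Tell me about Orcas."))
```

Keeping the template in one function makes it harder for the section markers and blank lines to drift between calls.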

	The code example below shows how to use this model:

	```python
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer

	model_id = "psmathur/orca_mini_v3_13b"
	tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False)
	# Load in half precision and let device_map="auto" place the weights on the available devices
	model = AutoModelForCausalLM.from_pretrained(
	    model_id, torch_dtype=torch.float16, low_cpu_mem_usage=True, device_map="auto"
	)
	system_prompt = "### System:\nYou are an AI assistant that follows instruction extremely well. Help as much as you can.\n\n"

	# Build the prompt in the format shown above
	instruction = "Tell me about Orcas."
	prompt = f"{system_prompt}### User:\n{instruction}\n\n### Assistant:\n"
	# Move the inputs to the model's device instead of hard-coding "cuda"
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
	# Nucleus sampling (top_p=0.95); top_k=0 disables top-k filtering.
	# Prompt plus new tokens should fit within Llama 2's 4096-token context window.
	output = model.generate(**inputs, do_sample=True, top_p=0.95, top_k=0, max_new_tokens=1024)

	print(tokenizer.decode(output[0], skip_special_tokens=True))

	```
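
Because `generate` returns the prompt tokens followed by the newly generated ones, the decoded string contains the prompt as well as the reply. The sketch below shows one way to keep only the reply by splitting on the `### Assistant:` marker; `extract_reply` is a hypothetical helper, and it assumes the marker does not reappear inside the model's answer.

```python
ASSISTANT_MARKER = "### Assistant:"


def extract_reply(decoded: str) -> str:
    """Return the text after the last assistant marker, or the whole string if it is absent."""
    _, found, reply = decoded.rpartition(ASSISTANT_MARKER)
    return reply.strip() if found else decoded.strip()


# Stand-in for a decoded generation; a real one would come from tokenizer.decode(...).
sample = "### System:\nBe helpful.\n\n### User:\nHi\n\n### Assistant:\nHello there."
print(extract_reply(sample))
```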


	## Limitations & Biases

	While this model aims for accuracy, it can occasionally produce inaccurate or misleading results.

	Despite diligent efforts in refining the pretraining data, there remains a possibility for the generation of inappropriate, biased, or offensive content.

	Exercise caution and cross-check information when necessary.



	## Citation

	Please kindly cite using the following BibTeX:

	```
	@misc{orca_mini_v3_13b,
	author = {Pankaj Mathur},
	title = {orca_mini_v3_13b: An explain tuned Llama2-13b model},
	year = {2023},
	publisher = {GitHub, HuggingFace},
	journal = {GitHub repository, HuggingFace repository},
	howpublished = {\url{https://huggingface.co/psmathur/orca_mini_v3_13b}},
	}
	```

	```
	@misc{mukherjee2023orca,
	title={Orca: Progressive Learning from Complex Explanation Traces of GPT-4},
	author={Subhabrata Mukherjee and Arindam Mitra and Ganesh Jawahar and Sahaj Agarwal and Hamid Palangi and Ahmed Awadallah},
	year={2023},
	eprint={2306.02707},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```

	```
	@article{touvron2023llama,
	title={LLaMA: Open and Efficient Foundation Language Models},
	author={Touvron, Hugo and Lavril, Thibaut and Izacard, Gautier and Martinet, Xavier and Lachaux, Marie-Anne and Lacroix, Timoth{\'e}e and Rozi{\`e}re, Baptiste and Goyal, Naman and Hambro, Eric and Azhar, Faisal and Rodriguez, Aurelien and Joulin, Armand and Grave, Edouard and Lample, Guillaume},
	journal={arXiv preprint arXiv:2302.13971},
	year={2023}
	}
	```