|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- Intel/orca_dpo_pairs |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
# DeciDPObyBB - a 7B DeciLM Fine-tune using DPO
|
|
|
Built by fine-tuning [DeciLM-7B-Instruct](https://huggingface.co/Deci/DeciLM-7B-instruct) on [Intel Orca DPO Pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs) using Direct Preference Optimization (DPO).
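DPO trains directly on preference pairs (a "chosen" and a "rejected" answer per prompt) without fitting a separate reward model. As a toy sketch only, the per-example DPO loss from Rafailov et al. (2023) can be written as below; the log-probability values here are made up for illustration and are not from this model's actual training run:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log sigmoid(beta * (policy margin - reference margin))."""
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# When the policy prefers the chosen answer more strongly than the
# reference model does, the loss falls below log(2) (the value at zero margin).
print(dpo_loss(-1.0, -3.0, -2.0, -2.5, beta=0.1))
```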
|
|
|
Created by [bhaiyabot](https://bhaiyabot.in).
|
|
|
Built for research and learning purposes!
|
|
|
Usage:
|
|
|
```python
from transformers import AutoTokenizer, pipeline

model_id = "Deci/DeciLM-7B-instruct"  # placeholder: replace with this model's repo id

messages = [
    {"role": "system", "content": "You are a very helpful assistant chatbot that thinks step by step"},
    {"role": "user", "content": user_input},  # user_input: your prompt string
]

tokenizer = AutoTokenizer.from_pretrained(model_id)
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

generator = pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    trust_remote_code=True,  # DeciLM ships custom modeling code
)

sequences = generator(
    prompt,
    do_sample=True,
    temperature=1,
    num_beams=5,
    max_length=1000,
    pad_token_id=tokenizer.eos_token_id,
)
print(sequences[0]["generated_text"])
```
|
|
|
```bibtex |
|
@misc{DeciFoundationModels,
  title = {DeciLM-7B-instruct},
  author = {DeciAI Research Team},
  year = {2023},
  url = {https://huggingface.co/Deci/DeciLM-7B-instruct},
}
|
|
|
@misc{rafailov2023direct, |
|
title={Direct Preference Optimization: Your Language Model is Secretly a Reward Model}, |
|
author={Rafael Rafailov and Archit Sharma and Eric Mitchell and Stefano Ermon and Christopher D. Manning and Chelsea Finn}, |
|
year={2023}, |
|
eprint={2305.18290}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.LG} |
|
} |
|
``` |
|
|
|
|
|
|
|
More details coming soon.