theminji
/

TinyLlama-v2ray

Text Generation

Generated from Trainer

text-generation-inference

Model card Files Files and versions Metrics Training metrics Community

TinyLlama-v2ray / README.md

Jeff man112

Update README.md

e4a5e20 over 1 year ago

|

history blame contribute delete

2.48 kB

	---
	license: apache-2.0
	base_model: TinyLlama/TinyLlama-1.1B-Chat-v0.6
	tags:
	- trl
	- sft
	- generated_from_trainer
	model-index:
	- name: TinyLlama-v2ray
	results: []
	datasets:
	- TheBossLevel123/v2ray
	library_name: transformers
	widget:
	- text: "<\|im_start\|>user\nWho are you?<\|im_end\|>\n<\|im_start\|>assistant"
	example_title: "First Example"
	- text: "<\|im_start\|>user\nhow much do you goon?<\|im_end\|>\n<\|im_start\|>assistant"
	example_title: "Second Example"
	---


	# TinyLlama-v2ray

	This model is a fine-tuned version of [TinyLlama/TinyLlama-1.1B-Chat-v0.6](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v0.6) on the [TheBossLevel123/v2ray](https://huggingface.co/datasets/TheBossLevel123/v2ray) dataset.

	## Model description
	Prompt format is as follows:
	```py
	<\|im_start\|>user
	{prompt}<\|im_end\|>
	<\|im_start\|>assistant
	```

	The model is intended to mimic the behavior of v2ray, so results will most likely be nonsensical or gibberish.

	## Example Usage
	```py
	import torch
	from transformers import pipeline, AutoTokenizer
	import re
	tokenizer = AutoTokenizer.from_pretrained("TheBossLevel123/TinyLlama-v2ray")
	pipe = pipeline("text-generation", model="TheBossLevel123/TinyLlama-v2ray", torch_dtype=torch.bfloat16, device_map="auto")

	def formatted_prompt(prompt)-> str:
	return f"<\|im_start\|>user\n{prompt}<\|im_end\|>\n<\|im_start\|>assistant"

	def extract_text(text):
	pattern = r'v2ray\n(.*?)(?=<\\|im_end\\|>)'
	match = re.search(pattern, text, re.DOTALL)
	if match:
	return f"Output: {match.group(1)}"
	else:
	return "No match found"
	prompt = 'what are your thoughts on ccp'
	outputs = pipe(formatted_prompt(prompt), max_new_tokens=50, do_sample=True, temperature=0.9)
	if outputs and "generated_text" in outputs[0]:
	text = extract_text(outputs[0]["generated_text"])
	print(f"Prompt: {prompt}")
	print("")
	print(text)
	else:
	print("No output or unexpected structure")

	#Prompt: what are ur thoughts on ccp
	#
	#Output: <Re: insaneness> you are a ccp
	```

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.002
	- train_batch_size: 1
	- eval_batch_size: 8
	- seed: 42
	- gradient_accumulation_steps: 32
	- total_train_batch_size: 32
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- training_steps: 1000
	- mixed_precision_training: Native AMP

	### Framework versions

	- Transformers 4.35.2
	- Pytorch 2.1.0+cu121
	- Datasets 2.16.0
	- Tokenizers 0.15.0