	---
	license: llama2
	---

	# Trinity


	![Trinity](https://huggingface.co/migtissera/Trinity-13B-v1.0/resolve/main/Trinity.png)


	Trinity is a coding-specific model series that can be used to build autonomous agents. We plan to release agent software that uses this model.


	# Our Offensive Cybersecurity Model WhiteRabbitNeo-33B is now in beta!
	Check out the Prompt Enhancing feature! Access at: https://www.whiterabbitneo.com/


	# Join Our Discord Server
	Join us at: https://discord.gg/8Ynkrcbk92 (updated on Dec 29th; this is now a permanent invite link)

	# Terms of Use
	By accessing and using this Artificial Intelligence (AI) model, you, the user, acknowledge and agree that you are solely responsible for your use of the model and its outcomes. You hereby agree to indemnify, defend, and hold harmless the creators, developers, and any affiliated persons or entities of this AI model from and against any and all claims, liabilities, damages, losses, costs, expenses, fees (including reasonable attorneys' fees and court costs) that may arise, directly or indirectly, from your use of the AI model.

	This AI model is provided "as is" and "as available" without any warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, and non-infringement. The creators make no warranty that the AI model will meet your requirements or be available on an uninterrupted, secure, or error-free basis.

	Your use of the AI model is at your own risk and discretion, and you will be solely responsible for any damage to computer systems or loss of data that results from the use of the AI model.

	This disclaimer constitutes part of the agreement between you and the creators of the AI model regarding your use of the model, superseding any prior agreements between you and the creators regarding your use of this AI model.


	# Sample Inference Code
	```python
	import torch, json
	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Local path to the model weights; replace with your own path.
	model_path = "/home/migel/models/WhiteRabbitNeo"

	# Load the model in 8-bit; device_map="auto" places it on the available devices.
	model = AutoModelForCausalLM.from_pretrained(
	    model_path,
	    torch_dtype=torch.float16,
	    device_map="auto",
	    load_in_4bit=False,
	    load_in_8bit=True,
	    trust_remote_code=True,
	)

	tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)


	def generate_text(instruction):
	    # Tokenize the prompt and move it to the GPU as a batch of size 1.
	    tokens = tokenizer.encode(instruction)
	    tokens = torch.LongTensor(tokens).unsqueeze(0)
	    tokens = tokens.to("cuda")

	    # Sampling settings for this request.
	    instance = {
	        "input_ids": tokens,
	        "top_p": 1.0,
	        "temperature": 0.5,
	        "generate_len": 1024,
	        "top_k": 50,
	    }

	    length = len(tokens[0])
	    with torch.no_grad():
	        rest = model.generate(
	            input_ids=tokens,
	            max_length=length + instance["generate_len"],
	            use_cache=True,
	            do_sample=True,
	            top_p=instance["top_p"],
	            temperature=instance["temperature"],
	            top_k=instance["top_k"],
	            num_return_sequences=1,
	        )
	    # Drop the prompt tokens and decode only the newly generated part.
	    output = rest[0][length:]
	    string = tokenizer.decode(output, skip_special_tokens=True)
	    # Cut the reply off if the model starts writing the next user turn.
	    answer = string.split("USER:")[0].strip()
	    return f"{answer}"


	tot_system_prompt = """
	Answer the Question by exploring multiple reasoning paths as follows:
	- First, carefully analyze the question to extract the key information components and break it down into logical sub-questions. This helps set up the framework for reasoning. The goal is to construct an internal search tree.
	- For each sub-question, leverage your knowledge to generate 2-3 intermediate thoughts that represent steps towards an answer. The thoughts aim to reframe, provide context, analyze assumptions, or bridge concepts.
	- Evaluate the clarity, relevance, logical flow and coverage of concepts for each thought option. Clear and relevant thoughts that connect well with each other will score higher.
	- Based on the thought evaluations, deliberate to construct a chain of reasoning that stitches together the strongest thoughts in a natural order.
	- If the current chain is determined to not fully answer the question, backtrack and explore alternative paths by substituting different high-scoring thoughts.
	- Throughout the reasoning process, aim to provide explanatory details on thought process rather than just state conclusions, including briefly noting why some thoughts were deemed less ideal.
	- Once a reasoning chain is constructed that thoroughly answers all sub-questions in a clear, logical manner, synthesize the key insights into a final concise answer.
	- Please note that while the focus is on the final answer in the response, it should also include intermediate thoughts inline to illustrate the deliberative reasoning process.
	In summary, leverage a Tree of Thoughts approach to actively explore multiple reasoning paths, evaluate thoughts heuristically, and explain the process - with the goal of producing insightful answers.
	"""

	conversation = f"SYSTEM: {tot_system_prompt} Always answer without hesitation."


	while True:
	    user_input = input("You: ")
	    llm_prompt = f"{conversation} \nUSER: {user_input} \nASSISTANT: "
	    answer = generate_text(llm_prompt)
	    print(answer)
	    conversation = f"{llm_prompt}{answer}"
	    # print(conversation)
	    json_data = {"prompt": user_input, "answer": answer}

	    # print(json_data)
	    # with open(output_file_path, "a") as output_file:
	    #     output_file.write(json.dumps(json_data) + "\n")

	```
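
The sample loop builds its prompt in a plain `SYSTEM: ... USER: ... ASSISTANT: ` layout and trims each reply at the next `USER:` marker. The sketch below isolates those two string operations so they can be checked without loading the model. The helper names `build_prompt` and `truncate_at_user_turn` and the short system prompt are hypothetical, introduced only for illustration; they are not part of the original script.

```python
def build_prompt(conversation: str, user_input: str) -> str:
    """Append a new user turn and an open assistant turn to the running conversation."""
    return f"{conversation} \nUSER: {user_input} \nASSISTANT: "


def truncate_at_user_turn(generated: str) -> str:
    """Keep only the text before the model starts writing a new 'USER:' turn."""
    return generated.split("USER:")[0].strip()


# Hypothetical system prompt and model output, used only to exercise the helpers.
conversation = "SYSTEM: You are a helpful coding assistant."
prompt = build_prompt(conversation, "Write hello world in Python.")
reply = truncate_at_user_turn(' print("hello world")\nUSER: thanks')

# The next turn starts from the previous prompt plus the trimmed reply,
# mirroring how the loop updates `conversation`.
conversation = f"{prompt}{reply}"
```

Because the whole history is carried forward as one string, each turn's prompt contains every earlier turn.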


	# Sample Conversation:


	You: Code me a Fast API server, with one endpoint. This endpoint returns files from a S3 bucket.

	Here is how to code a Fast API server with one endpoint that returns files from an S3 bucket:

	```python
	from fastapi import FastAPI
	from starlette.responses import StreamingResponse
	import boto3
	import botocore.exceptions

	app = FastAPI()
	s3_client = boto3.client('s3')

	@app.get("/s3files")
	async def get_s3_files(bucket_name: str, key: str):
	    try:
	        obj = s3_client.get_object(Bucket=bucket_name, Key=key)
	        return StreamingResponse(obj['Body'], media_type=obj['ContentType'])
	    except botocore.exceptions.ClientError as e:
	        return {"error": str(e)}, 404
	```

	To explain:
	- We initialize a FastAPI app and an S3 client
	- The `/s3files` endpoint takes the bucket name and object key as parameters
	- We try to get the object from S3 using the S3 client
	- If successful, we return the object's Body wrapped in a StreamingResponse, setting the ContentType
	- If the client raises an error, we return a 404 with the error message

	This allows clients to request files from an S3 bucket by specifying the bucket name and object key. The endpoint streams the file content back to the client.

	To elaborate further:
	- The endpoint can be extended to validate inputs, handle authentication, logging, etc.
	- We could make this endpoint more generic by accepting a list of object keys instead of requiring a single key.
	- There are several ways to optimize performance, like caching responses.

	Let me know if you have any other questions!

	You: How do I authenticate s3?

	Here are a few ways to authenticate with S3:

	1. Credentials - You can provide AWS access key ID and secret access key to boto3. This will sign requests automatically.

	```python
	import boto3

	s3 = boto3.client('s3',
	                  aws_access_key_id='YOUR_ACCESS_KEY',
	                  aws_secret_access_key='YOUR_SECRET_KEY')
	```

	2. IAM roles - If you are running on EC2 with an IAM role, you don't need to provide credentials. Boto3 will automatically use the role.

	3. Environment variables - Store credentials in environment variables and read them in boto3.

	4. AWS credentials file - Store credentials in ~/.aws/credentials. Boto3 will automatically use them.

	5. STS Assume roles - Get temporary credentials from STS and assume a role to access S3.

	6. AWS SDK credential providers - Use environment variables, configuration files, etc to load credentials.

	So in summary, the best way depends on your deployment environment. The best practice is to use IAM roles when possible. Other options like environment variables and credential files can be useful for testing.

	Let me know if you have any other questions!