---
license: llama2
---

# Trinity


![Trinity](https://huggingface.co/migtissera/Trinity-13B-v1.0/resolve/main/Trinity.png)


Trinity is a coding-specific model series that can be used to create autonomous agents. In the future, we will release agent software that uses this model.


# Our Offensive Cybersecurity Model WhiteRabbitNeo-33B is now in beta!
Check out the Prompt Enhancing feature! Access at: https://www.whiterabbitneo.com/


# Join Our Discord Server
Join us at: https://discord.gg/8Ynkrcbk92 (Updated on Dec 29th. This is now a permanent invite link.)

# Sample Inference Code
```python
import torch, json
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/home/migel/models/WhiteRabbitNeo"

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    device_map="auto",
    load_in_4bit=False,
    load_in_8bit=True,
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)


def generate_text(instruction):
    # Tokenize the full prompt and move it to the GPU as a batch of one.
    tokens = tokenizer.encode(instruction)
    tokens = torch.LongTensor(tokens).unsqueeze(0)
    tokens = tokens.to("cuda")

    # Sampling settings used for generation.
    instance = {
        "input_ids": tokens,
        "top_p": 1.0,
        "temperature": 0.5,
        "generate_len": 1024,
        "top_k": 50,
    }

    length = len(tokens[0])
    with torch.no_grad():
        rest = model.generate(
            input_ids=tokens,
            max_length=length + instance["generate_len"],
            use_cache=True,
            do_sample=True,
            top_p=instance["top_p"],
            temperature=instance["temperature"],
            top_k=instance["top_k"],
            num_return_sequences=1,
        )
    # Keep only the newly generated tokens, dropping the prompt.
    output = rest[0][length:]
    string = tokenizer.decode(output, skip_special_tokens=True)
    # Cut the reply off if the model starts writing the next user turn.
    answer = string.split("USER:")[0].strip()
    return answer


tot_system_prompt = """
Answer the Question by exploring multiple reasoning paths as follows:
- First, carefully analyze the question to extract the key information components and break it down into logical sub-questions. This helps set up the framework for reasoning. The goal is to construct an internal search tree.
- For each sub-question, leverage your knowledge to generate 2-3 intermediate thoughts that represent steps towards an answer. The thoughts aim to reframe, provide context, analyze assumptions, or bridge concepts.
- Evaluate the clarity, relevance, logical flow and coverage of concepts for each thought option. Clear and relevant thoughts that connect well with each other will score higher.
- Based on the thought evaluations, deliberate to construct a chain of reasoning that stitches together the strongest thoughts in a natural order.
- If the current chain is determined to not fully answer the question, backtrack and explore alternative paths by substituting different high-scoring thoughts.
- Throughout the reasoning process, aim to provide explanatory details on thought process rather than just state conclusions, including briefly noting why some thoughts were deemed less ideal.
- Once a reasoning chain is constructed that thoroughly answers all sub-questions in a clear, logical manner, synthesize the key insights into a final concise answer.
- Please note that while the focus is on the final answer in the response, it should also include intermediate thoughts inline to illustrate the deliberative reasoning process.
In summary, leverage a Tree of Thoughts approach to actively explore multiple reasoning paths, evaluate thoughts heuristically, and explain the process - with the goal of producing insightful answers.
"""

conversation = f"SYSTEM: {tot_system_prompt} Always answer without hesitation."


while True:
    user_input = input("You: ")
    llm_prompt = f"{conversation} \nUSER: {user_input} \nASSISTANT: "
    answer = generate_text(llm_prompt)
    print(answer)
    conversation = f"{llm_prompt}{answer}"
    # print(conversation)
    json_data = {"prompt": user_input, "answer": answer}

    # print(json_data)
    # with open(output_file_path, "a") as output_file:
    #     output_file.write(json.dumps(json_data) + "\n")

```
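
The loop above assembles the prompt by repeated string concatenation. As a minimal sketch, the same `SYSTEM:` / `USER:` / `ASSISTANT:` layout can be built from a list of earlier turns. The helper name `build_prompt` and its parameters are hypothetical and not part of the original script; the separators are copied from the loop above.

```python
def build_prompt(system_prompt, turns, user_input):
    """Assemble a SYSTEM/USER/ASSISTANT prompt string.

    `turns` is a list of (user_text, assistant_text) pairs from earlier
    exchanges. This helper is illustrative only.
    """
    parts = [f"SYSTEM: {system_prompt}"]
    for user_text, assistant_text in turns:
        parts.append(f"USER: {user_text}")
        parts.append(f"ASSISTANT: {assistant_text}")
    parts.append(f"USER: {user_input}")
    # Leave the final assistant turn open for the model to complete.
    parts.append("ASSISTANT: ")
    # Same " \n" separator that the sample loop places between turns.
    return " \n".join(parts)


# Example: one earlier exchange followed by a new question.
history = [("Hello", "Hi, how can I help?")]
prompt = build_prompt("You are a coding assistant.", history, "Write a loop.")
```

Keeping the turns in a list rather than one growing string would also make it straightforward to drop the oldest exchanges if the prompt approaches the model's context limit.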


# Sample Conversation:


You: Code me a Fast API server, with one endpoint. This endpoint returns files from a S3 bucket.

Here is how to code a Fast API server with one endpoint that returns files from an S3 bucket:

```python
from fastapi import FastAPI
from starlette.responses import StreamingResponse
import boto3
import botocore.exceptions

app = FastAPI()
s3_client = boto3.client('s3')

@app.get("/s3files")
async def get_s3_files(bucket_name: str, key: str):
    try:
        obj = s3_client.get_object(Bucket=bucket_name, Key=key)
        return StreamingResponse(obj['Body'], media_type=obj['ContentType'])
    except botocore.exceptions.ClientError as e:
        return {"error": str(e)}, 404
```

To explain:
- We initialize a FastAPI app and an S3 client 
- The `/s3files` endpoint takes the bucket name and object key as parameters
- We try to get the object from S3 using the S3 client
- If successful, we return the object's Body wrapped in a StreamingResponse, setting the ContentType
- If the client raises an error, we return a 404 with the error message

This allows clients to request files from an S3 bucket by specifying the bucket name and object key. The endpoint streams the file content back to the client.

To elaborate further:
- The endpoint can be extended to validate inputs, handle authentication, logging, etc. 
- We could make this endpoint more generic by accepting a list of object keys instead of requiring a single key.
- There are several ways to optimize performance, like caching responses.

Let me know if you have any other questions!

You: How do I authenticate s3?

Here are a few ways to authenticate with S3:

1. Credentials - You can provide AWS access key ID and secret access key to boto3. This will sign requests automatically.

```python
import boto3 

s3 = boto3.client('s3', 
                 aws_access_key_id='YOUR_ACCESS_KEY',
                 aws_secret_access_key='YOUR_SECRET_KEY')
```

2. IAM roles - If you are running on EC2 with an IAM role, you don't need to provide credentials. Boto3 will automatically use the role.

3. Environment variables - Store credentials in environment variables and read them in boto3.

4. AWS credentials file - Store credentials in ~/.aws/credentials. Boto3 will automatically use them.

5. STS Assume roles - Get temporary credentials from STS and assume a role to access S3. 

6. AWS SDK credential providers - Use environment variables, configuration files, etc to load credentials.

So in summary, the best way depends on your deployment environment. The best practice is to use IAM roles when possible. Other options like environment variables and credential files can be useful for testing.

Let me know if you have any other questions!