Update app.py
app.py (CHANGED)
```diff
@@ -50,9 +50,9 @@ def get_hidden_states(raw_original_prompt):
     outputs = model(**model_inputs, output_hidden_states=True, return_dict=True)
     hidden_states = torch.stack([h.squeeze(0).cpu().detach() for h in outputs.hidden_states], dim=0)
     with gr.Row() as tokens_container:
-        for token in tokens:
-            gr.Button(token)
-    return
+        # for token in tokens:
+        #     gr.Button(token)
+    return str(tokens)


 def run_model(raw_original_prompt, raw_interpretation_prompt, max_new_tokens, do_sample,
@@ -115,8 +115,6 @@ with gr.Blocks(theme=gr.themes.Default()) as demo:
     This idea was explored in the paper **Patchscopes** ([Ghandeharioun et al., 2024](https://arxiv.org/abs/2401.06102)) and was later investigated further in **SelfIE** ([Chen et al., 2024](https://arxiv.org/abs/2403.10949)).
     An honorary mention for **Speaking Probes** ([Dar, 2023](https://towardsdatascience.com/speaking-probes-self-interpreting-models-7a3dc6cb33d6) -- my post!! 🥳) which was a less mature approach but with the same idea in mind.
     We follow the SelfIE implementation in this space for concreteness. Patchscopes are so general that they encompass many other interpretation techniques too!!!
-
-

     👾 **The idea is really simple: models are able to understand their own hidden states by nature!** 👾
     If I give a model a prompt of the form ``User: [X] Assistant: Sure'll I'll repeat your message`` and replace ``[X]`` *during computation* with the hidden state we want to understand,
@@ -146,7 +144,7 @@ with gr.Blocks(theme=gr.themes.Default()) as demo:
     interpretation_prompt = gr.Text(suggested_interpretation_prompts[0], label='Interpretation Prompt')

     with gr.Group('Output'):
-        tokens_container = gr.
+        tokens_container = gr.Text()
     with gr.Column() as interpretations_container:
         pass

```
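The description text in the diff explains the core trick: run the model on an interpretation prompt containing a placeholder ``[X]`` and replace that position's representation *during computation* with the hidden state to be interpreted. As a rough illustration of the mechanism only — not the app's actual code, and with a toy embedding/linear pair standing in for a real transformer (all names here are hypothetical) — the patch can be sketched like this:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins for a real model's embedding table and first layer.
embed = nn.Embedding(10, 4)
layer = nn.Linear(4, 4)

def run_with_patch(token_ids, patch_pos, patch_vector):
    """Forward pass where the embedding at `patch_pos` is overwritten
    with `patch_vector` -- the SelfIE-style '[X]' replacement."""
    h = embed(token_ids).clone()   # (seq_len, hidden_dim)
    h[patch_pos] = patch_vector    # inject the hidden state mid-computation
    return layer(h)

ids = torch.tensor([1, 2, 3])      # a tiny "interpretation prompt"
hidden_state = torch.randn(4)      # the hidden state we want to understand
out = run_with_patch(ids, patch_pos=1, patch_vector=hidden_state)
```

In a real implementation the injection typically happens via a forward hook on a chosen layer of the transformer, but the effect is the same: every position is computed normally except the placeholder, which carries the foreign hidden state forward.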