joshnguyen committed
Commit a6f4193 · Parent(s): 3ee1192
Update README.md

README.md CHANGED
@@ -12,6 +12,7 @@ tags:
 - llama-2
 - astronomy
 - astrophysics
+- arxiv
 ---
 
 <p><h1>AstroLLaMA</h1></p>
@@ -62,3 +63,25 @@ generated_text = generator(
     max_length=512
 )
 ```
+
+## Embedding text with AstroLLaMA
+
+```
+texts = [
+    "Abstract 1",
+    "Abstract 2"
+]
+inputs = tokenizer(
+    texts,
+    return_tensors="pt",
+    return_token_type_ids=False,
+    padding=True,
+    truncation=True,
+    max_length=4096
+)
+inputs.to(model.device)
+outputs = model(**inputs, output_hidden_states=True)
+
+# Last layer of the hidden states. Get the embedding of the first token in each sequence
+embeddings = outputs["hidden_states"][-1][:, 0, ...].detach().cpu().numpy()
+```
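As a usage sketch: once `embeddings` has been computed as in the snippet above, the abstract vectors can be compared directly, for example with cosine similarity. The block below is a minimal illustration, not part of the committed README; it assumes only that `embeddings` exists as above and that NumPy is installed.

```
import numpy as np

# `embeddings` has shape (num_texts, hidden_size): one vector per input abstract.
# Hypothetical example: cosine similarity between the first two abstracts.
a, b = embeddings[0], embeddings[1]
cosine_similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(f"Cosine similarity between the two abstracts: {cosine_similarity:.4f}")
```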