|
---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
language:
- bg
- ca
- code
- cs
- cy
- da
- de
- el
- en
- es
- et
- eu
- fi
- fr
- ga
- gl
- hr
- hu
- it
- lt
- lv
- mt
- nl
- nn
- no
- oc
- pl
- pt
- ro
- ru
- sh
- sk
- sl
- sr
- sv
- uk
---
|
|
|
## How to use |
|
|
|
This instruction-tuned model uses a chat template that must be applied to the input for conversational use.
The easiest way to apply it is with the tokenizer's built-in chat template, as shown in the following snippet.
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "BSC-LT/salamandra7b_rag_prompt_ca-en-es"

prompt = "Here is a question that you should answer based on the given context. Write a response that answers the question using only information provided in the context. Provide the answer in Spanish."

context = """Water boils at 100°C (212°F) at standard atmospheric pressure, which is at sea level.
However, this boiling point can vary depending on altitude and atmospheric pressure.
At higher altitudes, where atmospheric pressure is lower, water boils at a lower temperature.
For example, at 2,000 meters (about 6,600 feet) above sea level, water boils at around 93°C (199°F).
"""

instruction = "At what temperature does water boil?"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="cuda",
    torch_dtype=torch.bfloat16,
)

# Assemble the RAG input: task prompt, retrieved context, then the question.
content = f"{prompt}\n\nContext:\n{context}\n\nQuestion:\n{instruction}"
chat = [{"role": "user", "content": content}]

chat_prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# Stop generation at either the tokenizer's EOS token or the chat turn delimiter.
eos_tokens = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|im_end|>"),
]

inputs = tokenizer.encode(chat_prompt, add_special_tokens=False, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), eos_token_id=eos_tokens, max_new_tokens=200)

# generate() returns the prompt together with the completion, so decode
# only the newly generated tokens.
answer = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(answer)
```
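The same chat template also supports multi-turn conversations: `apply_chat_template` accepts a list of `{"role", "content"}` dicts with alternating user and assistant turns. A minimal sketch of the message-list shape (plain Python data, no model required; the assistant reply below is a placeholder, not real model output):

```python
# Hypothetical multi-turn history for apply_chat_template; the assistant
# turn is a placeholder answer, not actual output from this model.
chat = [
    {"role": "user", "content": "At what temperature does water boil?"},
    {"role": "assistant", "content": "El agua hierve a 100 °C a nivel del mar."},
    {"role": "user", "content": "And at 2,000 meters above sea level?"},
]

# Most chat templates expect roles to alternate user/assistant.
roles = [turn["role"] for turn in chat]
print(roles)  # ['user', 'assistant', 'user']
```

Passing this list to `tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)` produces a single prompt string covering the whole conversation, exactly as in the single-turn snippet above.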
|
|