```python
language_id = tokenizer.lang2id["en"]  # 0
langs = torch.tensor([language_id] * input_ids.shape[1])  # torch.tensor([0, 0, ..., 0])

# We reshape it to be of size (batch_size, sequence_length)
langs = langs.view(1, -1)  # is now of shape [1, sequence_length] (we have a batch size of 1)
```
Now you can pass the `input_ids` and language embedding to the model:
```python
outputs = model(input_ids, langs=langs)
```
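Putting the steps above together, here is a minimal, self-contained sketch. It assumes the `FacebookAI/xlm-clm-enfr-1024` checkpoint (an English-French CLM checkpoint whose `lang2id` maps `"en"` to `0`); the prompt and the next-token decoding at the end are illustrative only:

```python
import torch
from transformers import XLMTokenizer, XLMWithLMHeadModel

# Load an XLM checkpoint trained with causal language modeling (CLM)
tokenizer = XLMTokenizer.from_pretrained("FacebookAI/xlm-clm-enfr-1024")
model = XLMWithLMHeadModel.from_pretrained("FacebookAI/xlm-clm-enfr-1024")

# Encode an English prompt (batch size of 1)
input_ids = torch.tensor([tokenizer.encode("Wikipedia was used to")])

# Language embedding: one language id per token, reshaped to (batch_size, sequence_length)
language_id = tokenizer.lang2id["en"]  # 0
langs = torch.tensor([language_id] * input_ids.shape[1]).view(1, -1)

# Forward pass with the language embedding; the logits can be used to predict the next token
outputs = model(input_ids, langs=langs)
next_token_id = outputs.logits[0, -1].argmax().item()
print(tokenizer.decode([next_token_id]))
```

The only XLM-specific piece is the `langs` tensor; the rest is a standard forward pass.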
The `run_generation.py` script can generate text with language embeddings using the `xlm-clm` checkpoints.