English-to-Chinese (en to zh) translation does not work
from transformers import AutoModelWithLMHead, AutoTokenizer, pipeline

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
model = AutoModelWithLMHead.from_pretrained("Helsinki-NLP/opus-mt-en-zh")
translation = pipeline("translation_en_to_zh", model=model, tokenizer=tokenizer)
#translation = pipeline("translation", model="Helsinki-NLP/opus-mt-en-zh")
text = "hello"
result = translation(text, max_length=40)[0]["translation_text"]
The result is: εεεεεεεεεεεεεεεεεεεεεεεεεεεεεεεεεεεεεε
Version: transformers 4.31.0
Check this link: https://huggingface.co/docs/transformers/model_doc/marian
from transformers import MarianMTModel, MarianTokenizer

src_text = [
    'Hello, Good to see you.',
    "It's a beautiful day!",
    'Good moods are the most important.',
]
model_name = "Helsinki-NLP/opus-mt-en-zh"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)
# Tokenize the whole batch (padded to equal length) and generate translations
translated = model.generate(**tokenizer(src_text, return_tensors="pt", padding=True))
# Decode each generated sequence back to text, dropping special tokens
res = [tokenizer.decode(t, skip_special_tokens=True) for t in translated]
print(res)
The result is:
['你好,很高兴见到你。', '这是一个美丽的一天!', '良好的情绪是最重要的。']
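To apply the same approach to the single input from the question, the steps above could be wrapped in a small helper. This is only a sketch: the names `load_marian` and `translate` are made up for illustration, `max_length=40` is carried over from the question, and whether this avoids the repeated-character output on transformers 4.31.0 has not been confirmed in this thread.

```python
from typing import List


def load_marian(model_name: str = "Helsinki-NLP/opus-mt-en-zh"):
    """Load the Marian tokenizer and model (hypothetical helper name)."""
    # Imported here so the helper below can be used without loading the library.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)
    return tokenizer, model


def translate(texts: List[str], tokenizer, model, max_length: int = 40) -> List[str]:
    """Translate a batch of strings (hypothetical helper name)."""
    # Tokenize the batch with padding, as in the snippet above.
    batch = tokenizer(texts, return_tensors="pt", padding=True)
    # max_length caps the length of each generated sequence.
    generated = model.generate(**batch, max_length=max_length)
    # Decode each generated sequence, dropping special tokens.
    return [tokenizer.decode(t, skip_special_tokens=True) for t in generated]


# Usage (downloads the model on first call):
# tokenizer, model = load_marian()
# print(translate(["hello"], tokenizer, model))
```

Loading the model once and passing it into `translate` avoids reloading the weights for every call.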
I have also encountered this problem. Have you solved it?