Llama 3.2 400M Amharic

This is a smaller version of Meta's Llama-3.2-1B decoder-only transformer model, pretrained from scratch on 274 million tokens of Amharic text. Training took 23 hours on a single A100 40GB GPU.

  • It has 400 million parameters.
  • The context size of this model is 1024 tokens.
  • It uses the same kind of tokenizer as Llama-3.2-1B, but trained from scratch on the same Amharic dataset as the model, with a vocabulary size of 32k.
  • Validation perplexity: 41.3
  • This is a base model and hasn't undergone any supervised finetuning yet.
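
To put the validation perplexity in context, the short sketch below converts between perplexity and mean per-token cross-entropy loss. It assumes the common definition (perplexity is the exponential of the mean negative log-likelihood in nats); this card does not state how the reported figure was computed, and the function names are illustrative only.

```python
import math


def perplexity_from_loss(mean_nll_nats: float) -> float:
    """Perplexity under the common definition exp(mean negative log-likelihood)."""
    return math.exp(mean_nll_nats)


def loss_from_perplexity(perplexity: float) -> float:
    """Inverse of the above: mean per-token loss in nats."""
    return math.log(perplexity)


# Value reported in this model card.
reported_perplexity = 41.3

# Assumption: the reported figure follows the definition above.
implied_loss = loss_from_perplexity(reported_perplexity)
print(f"implied validation loss: {implied_loss:.2f} nats/token")  # about 3.72
```

Under that assumption, a perplexity of 41.3 corresponds to a validation loss of roughly 3.72 nats per token.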

How to use

First, install the latest version of transformers:

pip install -Uq transformers

You can use this model directly with a pipeline for text generation:

from transformers import pipeline

llama_am = pipeline(
    "text-generation",
    model="rasyosef/Llama-3.2-400M-Amharic",
    device_map="auto",
)

# Amharic prompt ("Addis Ababa") for the model to continue
prompt = "አዲስ አበባ"
output = llama_am(
    prompt,
    max_new_tokens=128,
    temperature=0.5,
    do_sample=True,
    top_k=8,
    top_p=0.8,
    repetition_penalty=1.2,
)
print(output)

Output:

[{'generated_text': 'አዲስ አበባ፣ ታህሳስ 8 ፣2012 (ኤፍ ቢ ሲ) የኢፌዴሪ የውጭ ጉዳይ ሚኒስትር አቶ ገዱ አንዳርጋቸው ከአፍሪካ ህብረት የስራ አስፈጻሚዎች ምክር ቤት መደበኛ ስብሰባ ጎን ለጎን ከዴሞክራቲክ ሪፐብሊክ ኮንጎ አቻቸው ማሪ ቱምባ ንዜዛ እና ከሌሎች የአፍሪካ አምባሳደሮች ጋር ተወያይተዋል።በውይይታቸውም በአፍሪካ የኮሮና ቫይረስን ለመከላከል እየተከናወኑ ባሉ ስራዎች ዙሪያ መምከራቸውን በትዊተር ገጻቸው አስፍረዋል።የሁለቱን ሀገራት ግንኙነት በተመለከተም፥ ኢትዮጵያ በህብረቱ ቋሚ አምባሳደርነት ባላት ሀላፊነት ለሹመት ማቅረብዋ የሚደነቅ መሆኑንም አንስተዋል።ኢትዮጵያ የኮቪድ19 ወረርሽኝን ለመግታት እያደረገች ባለው ጥረት ለደቡብ አፍሪካ ምስጋና አቅርባም ነበር፤ ቫይረሱን ለመቆጣጠር ከኢትዮጵያ ምን እንደምትማር በዝርዝር ላቀረብንላቸው ጥያቄም ወደፊት በሚሰሩ የትብብር መስኮች ላይ ተነጋግረን መስራት እንፈልጋለን ብለዋል።በቀጣይም ሁለቱ'}]
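
As an alternative to the pipeline, the model can also be loaded with the lower-level `AutoTokenizer` and `AutoModelForCausalLM` classes. The following is a minimal sketch, not the author's method: the helper names `clamp_new_tokens` and `generate` are hypothetical, and capping the number of new tokens so that prompt plus continuation fits in the 1024-token context is an assumption about sensible usage rather than something this card prescribes. The sampling settings are copied from the pipeline example above.

```python
CONTEXT_SIZE = 1024  # context length stated in this model card
MODEL_ID = "rasyosef/Llama-3.2-400M-Amharic"


def clamp_new_tokens(prompt_len: int, requested: int,
                     context_size: int = CONTEXT_SIZE) -> int:
    """Return how many new tokens fit after a prompt of prompt_len tokens."""
    if prompt_len >= context_size:
        raise ValueError("prompt already fills the context window")
    return min(requested, context_size - prompt_len)


def generate(prompt: str, requested_new_tokens: int = 128) -> str:
    """Sample a continuation of prompt (hypothetical helper)."""
    # Imported lazily so clamp_new_tokens works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]

    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=clamp_new_tokens(prompt_len, requested_new_tokens),
        do_sample=True,
        temperature=0.5,
        top_k=8,
        top_p=0.8,
        repetition_penalty=1.2,
    )
    # Decode the full sequence (prompt plus continuation) back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("አዲስ አበባ")` downloads the weights on first use and returns the prompt followed by a sampled continuation. Because sampling is enabled, the text will differ from run to run.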

Model size: 413M params (Safetensors, tensor type F32)