RajuKandasamy
/

ponniyinselvan_1.4b_alpha

RajuKandasamy committed on May 28, 2023

Commit

30a9e3a

·

1 Parent(s): f5f37bf

Create README.md

Files changed (1)

README.md +68 -0

	@@ -0,0 +1,68 @@

+---
+license: apache-2.0
+language:
+- ta
+library_name: transformers
+pipeline_tag: text-generation
+---
+# Model Card for ponniyinselvan_1.4b_alpha
+<!-- Provide a quick summary of what the model is/does. -->
+This model is trained on the Ponniyin Selvan Tamil text corpus.
+## Model Details
+The base model is EleutherAI's Pythia 1.4B.
+### Model Description
+- **Finetuned from model:** Pythia 1.4B
+## Uses
+For educational and research purposes only. Not fit for any kind of practical use.
+## Bias, Risks, and Limitations
+The biases, risks, and limitations of the base model apply.
+## How to Get Started with the Model
+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM
+# Load the fine-tuned model and its tokenizer from the Hub
+model_path = "RajuKandasamy/ponniyinselvan_1.4b_alpha"
+device = "cuda" if torch.cuda.is_available() else "cpu"
+model = AutoModelForCausalLM.from_pretrained(model_path, load_in_8bit=False).to(device)
+tokenizer = AutoTokenizer.from_pretrained(model_path)
+model.eval()
+# Tamil prompt: "Vandiyathevan", a character in Ponniyin Selvan
+prompt = """வந்தியத்தேவன்"""
+input_ids = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
+# Attend to every prompt token (single, unpadded sequence)
+attention_mask = torch.ones_like(input_ids)
+print("Thinking ...\n   ")
+with torch.no_grad():
+    # Sample a continuation of up to 256 tokens, prompt included
+    output = model.generate(
+        input_ids=input_ids,
+        attention_mask=attention_mask,
+        max_length=256,
+        do_sample=True,
+        temperature=0.9,
+        top_p=0.9,
+        top_k=500,
+        repetition_penalty=1.2,
+        pad_token_id=tokenizer.eos_token_id,
+        eos_token_id=tokenizer.eos_token_id,
+    )
+output_str = tokenizer.decode(output[0], skip_special_tokens=False)
+print(output_str)
+```
+## Training Details
+Trained for 10 epochs.
+### Training Data
+Ponniyin Selvan text corpus
+### Training Procedure
+Causal language modelling with a custom BPE tokenizer
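
The training procedure mentions a custom BPE tokenizer but gives no details about it. As a rough illustration of what byte-pair encoding involves, the toy sketch below learns merge rules by repeatedly fusing the most frequent adjacent symbol pair, then applies them to a new word. It is character-level and pure Python; the function names (`learn_bpe_merges`, `merge_pair`, `encode`) and the tiny corpus are hypothetical, and this is not the tokenizer shipped with the model, whose vocabulary size and settings are not stated in the README.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace each left-to-right occurrence of `pair` in `symbols` with one fused symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy, character-level)."""
    # Each distinct word becomes a tuple of symbols, weighted by its frequency.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair across the corpus.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # nothing left to merge
        # Pick the most frequent pair; ties are broken by comparing the pairs themselves.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        merges.append(best)
        # Rewrite the corpus with the chosen pair fused.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Tokenize `word` by applying the learned merges in the order they were learned."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


merges = learn_bpe_merges(["abab", "abc"], num_merges=2)
print(merges)                   # [('a', 'b'), ('ab', 'c')]
print(encode("abcab", merges))  # ['abc', 'ab']
```

A production tokenizer would typically work on bytes or normalized Unicode rather than raw characters, which matters for Tamil because a single written syllable often spans several code points.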