Gerson Fabian Buenahora Ormaza committed on
Commit 2dfbb7d · verified · 1 Parent(s): a1d6acc

Update README.md

Files changed (1)
  1. README.md +29 -2
README.md CHANGED
@@ -9,6 +9,8 @@ base_model:
9
  pipeline_tag: text-generation
10
  ---
11
 
12
  # ST3: Simple Transformer 3
13
 
14
  ## Model description
@@ -22,7 +24,7 @@ ST3 (Simple Transformer 3) is a lightweight transformer-based model derived from
22
  - **Parameters:** 4 million FP32 parameters.
23
  - **Batch size:** 32.
24
  - **Training environment:** 1 epoch on a Kaggle P100 GPU.
25
- - **Tokenizer:** Custom WordPiece tokenizer "ST3" with a max input length of 2048 tokens.
26
 
27
  ## Intended use
28
  ST3 is not a highly powerful or fully functional model compared to larger transformer models but can be used for:
@@ -32,6 +34,32 @@ ST3 is not a highly powerful or fully functional model compared to larger transf
32
 
33
  This model has not been fine-tuned or evaluated with performance metrics as it’s not designed for state-of-the-art tasks.
34
 
35
  ## Limitations
36
  - **Performance:** ST3 lacks the power of larger models and may not perform well on complex language tasks.
37
  - **No evaluation:** The model hasn’t been benchmarked with metrics.
@@ -60,4 +88,3 @@ If you find this model useful and would like to support further development, ple
60
  ---
61
 
62
  *Contributions to this project are always welcome!*
63
-
 
9
  pipeline_tag: text-generation
10
  ---
11
 
12
+
13
+
14
  # ST3: Simple Transformer 3
15
 
16
  ## Model description
 
24
  - **Parameters:** 4 million FP32 parameters.
25
  - **Batch size:** 32.
26
  - **Training environment:** 1 epoch on a Kaggle P100 GPU.
27
+ - **Tokenizer:** Custom WordPiece tokenizer "ST3" that marks word-continuation subword tokens with a "##" prefix.
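As a rough illustration of what the parameter count above implies, the following sketch estimates the size of the weights alone. It assumes the figure of 4 million is approximate and that each FP32 parameter takes 4 bytes; activations, optimizer state, and file-format overhead are not included, so the real footprint may differ.

```python
# Back-of-the-envelope estimate of weight storage for the stated model size.
# Assumption: "4 million" is an approximate count, not an exact one.
num_params = 4_000_000
bytes_per_param = 4  # FP32 uses 32 bits = 4 bytes per value

weight_bytes = num_params * bytes_per_param
weight_mb = weight_bytes / 1e6       # decimal megabytes
weight_mib = weight_bytes / 2**20    # binary mebibytes

print(f"Approximate weight size: {weight_mb:.1f} MB ({weight_mib:.2f} MiB)")
```

Under these assumptions the weights come to roughly 16 MB, which is consistent with the description of ST3 as a lightweight model.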
28
 
29
  ## Intended use
30
  ST3 is not a highly powerful or fully functional model compared to larger transformer models but can be used for:
 
34
 
35
  This model has not been fine-tuned or evaluated with performance metrics as it’s not designed for state-of-the-art tasks.
36
 
37
+ ### Usage
38
+ To use the ST3 model, you can follow this example:
39
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("BueormLLC/ST3")
+ model = AutoModelForCausalLM.from_pretrained("BueormLLC/ST3")
+
+ def clean_wordpiece_tokens(text):
+     return text.replace(" ##", "").replace("##", "")
+
+ input_text = "Esto es un ejemplo"
+ inputs = tokenizer(input_text, return_tensors="pt")
+
+ outputs = model.generate(inputs.input_ids, max_length=2048, num_return_sequences=1)
+
+ generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
+ cleaned_text = clean_wordpiece_tokens(generated_text)
+
+ print(cleaned_text)
+ ```
59
+
60
+ ### Explanation
61
+ The ST3 tokenizer uses the WordPiece algorithm, which prefixes a token with "##" when it continues the preceding word rather than starting a new one. The provided `clean_wordpiece_tokens` function strips these markers together with the space before them, so the pieces rejoin into whole words in the output text.
62
+
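To make the effect of `clean_wordpiece_tokens` concrete without downloading the model, here is a minimal sketch that applies the same function to a hand-written list of pieces. The split of "ejemplo" into `ej`, `##em`, `##plo` is a hypothetical example chosen for illustration; the actual ST3 vocabulary may split the word differently or not at all.

```python
def clean_wordpiece_tokens(text: str) -> str:
    # Remove the "##" continuation marker and the space in front of it,
    # then remove any marker left at the very start of the string.
    return text.replace(" ##", "").replace("##", "")


# Hypothetical WordPiece output for "Esto es un ejemplo" (illustrative only).
pieces = ["Esto", "es", "un", "ej", "##em", "##plo"]

# Decoding without merging leaves the pieces separated by spaces.
raw_text = " ".join(pieces)
cleaned_text = clean_wordpiece_tokens(raw_text)

print(raw_text)
print(cleaned_text)
```

Because the function is a plain string replacement, it would also strip a literal "##" that legitimately appears in generated text, which may matter for outputs containing Markdown headings or similar.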
63
  ## Limitations
64
  - **Performance:** ST3 lacks the power of larger models and may not perform well on complex language tasks.
65
  - **No evaluation:** The model hasn’t been benchmarked with metrics.
 
88
  ---
89
 
90
  *Contributions to this project are always welcome!*