from pprint import pprint as pp

from transformers import pipeline


def test(text):
    # Summarization model fine-tuned on Wikipedia summaries; it reuses
    # the tokenizer from the base t5-efficient-tiny checkpoint.
    model_name = 'tarekziade/wikipedia-summaries-t5-efficient-tiny'
    summarizer = pipeline(
        "summarization",
        model=model_name,
        tokenizer='google/t5-efficient-tiny',
        clean_up_tokenization_spaces=True,
        model_kwargs={'cache_dir': './cache'},
    )

    # Cap the summary length and block repeated bigrams in the output.
    summary = summarizer(
        text,
        max_length=500,
        min_length=40,
        no_repeat_ngram_size=2,
    )
    return summary


pp(test(''' We generally recommend a DeepNarrow strategy where the model’s depth is preferentially increased before considering any other forms of uniform scaling across other dimensions. This is largely due to how much depth influences the Pareto-frontier as shown in earlier sections of the paper. Specifically, a tall small (deep and narrow) model is generally more efficient compared to the base model. Likewise, a tall base model might also generally more efficient compared to a large model. We generally find that, regardless of size, even if absolute performance might increase as we continue to stack layers, the relative gain of Pareto-efficiency diminishes as we increase the layers, converging at 32 to 36 layers. Finally, we note that our notion of efficiency here relates to any one compute dimension, i.e., params, FLOPs or throughput (speed). We report all three key efficiency metrics (number of params, FLOPS and speed) and leave this decision to the practitioner to decide which compute dimension to consider.'''))
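
# The summarization pipeline returns a list of dicts, one per input, with
# the generated text under the 'summary_text' key. A minimal sketch of
# pulling the plain text out of test()'s return value; the sample dict in
# the usage comment below mirrors the pipeline's output shape and is not
# the output of a real model run.
def summary_text(result):
    # result looks like [{'summary_text': '...'}]
    return result[0]['summary_text']

# e.g. summary_text([{'summary_text': 'A short summary.'}])
# returns 'A short summary.'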