Update app.py
app.py CHANGED
@@ -6,12 +6,14 @@ api = gr.Interface.load("models/bigscience/bloom")
 
 def complete_with_gpt(text):
     # Use the last 50 characters of the text as context
-    return text[:-50] + api(text[-50:])
+    # return text[:-50] + api(text[-50:])
+    # Use the last 100 characters of the text as context
+    return text[:-100] + api(text[-100:])
 
 
 with gr.Blocks() as demo:
     with gr.Row():
-        textbox = gr.Textbox(placeholder="Type here and press enter...", lines=
+        textbox = gr.Textbox(placeholder="Type here and press enter...", lines=14)
         with gr.Column():
             btn = gr.Button("Generate")
 
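For reference, the edited lines above fit together as the minimal app sketched below. The `btn.click` wiring and the `demo.launch()` call are assumptions added for completeness, since those lines sit outside the hunks shown in this diff; everything else mirrors the new side of the change.

import gradio as gr

# Hosted inference wrapper around the BLOOM checkpoint (from the unchanged file header above)
api = gr.Interface.load("models/bigscience/bloom")

def complete_with_gpt(text):
    # Use the last 100 characters of the text as context for the model
    # and keep everything before that tail unchanged.
    return text[:-100] + api(text[-100:])

with gr.Blocks() as demo:
    with gr.Row():
        textbox = gr.Textbox(placeholder="Type here and press enter...", lines=14)
        with gr.Column():
            btn = gr.Button("Generate")

    # Assumed wiring: run the textbox contents through the model and write
    # the completion back into the same textbox.
    btn.click(complete_with_gpt, textbox, textbox)

demo.launch()

Because Python slicing clamps at the ends, inputs shorter than 100 characters are sent to the model whole with an empty prefix, so the widened context window only changes behavior for longer prompts.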
@@ -19,9 +21,10 @@ with gr.Blocks() as demo:
 
     with gr.Row():
         gr.Markdown("""
-
+
+# Big Science and Huggingface create 176 Billion Parameter Transformer Large Language Model
 
-## Bloom Is Setting New Record for Most Performant and Efficient AI Model for Science Ever!
+## Bloom Is Setting A New Record for Most Performant and Efficient AI Model for Science Ever!
 
 Bloom stands for:
 B: Big Science
@@ -30,7 +33,7 @@ O: Open Science
 O: Open Access
 M: Multi Lingual Language Model
 
-1. Video Playlist
+1. [Video Playlist](https://www.youtube.com/playlist?list=PLHgX2IExbFouqnsIqziThlPCX_miiDq14)
 2. Summary of Important Models and Sizes:
 
 # Model Sizes to Date
@@ -54,8 +57,6 @@ DistilBERT|66 million
 
 3. Background Information on ChatGPT, Bloom from BigScience on HuggingFace Platform, and RLHF DeepRL and One to Few Shot Learning and Generators:
 
-
-
 # ChatGPT Datasets:
 1. WebText
 2. Common Crawl
@@ -64,43 +65,41 @@ DistilBERT|66 million
 5. Toronto Books Corpus
 6. OpenWebText
 
-# Comparison to BigScience Model
+# Comparison to BigScience Model - Big Science - How to get started
 
-
+Big Science is a 176B parameter ML model trained on a set of datasets for Natural Language processing, and many other tasks that are not yet explored..
+Below is the set of the papers, models, links, and datasets around big science which promises to be the best,
+most recent large model of its kind benefitting all science pursuits.
 
-
-
-# Model: https://huggingface.co/bigscience/bloom
+# [Model](https://huggingface.co/bigscience/bloom)
 
 # Papers:
-1. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model https://arxiv.org/abs/2211.05100
-2. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism https://arxiv.org/abs/1909.08053
-3. 8-bit Optimizers via Block-wise Quantization https://arxiv.org/abs/2110.02861
-4. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation https://arxiv.org/abs/2108.12409
-5. https://huggingface.co/models?other=doi:10.57967/hf/0003
-6. 217 Other Models optimizing use of bloom via specialization: https://huggingface.co/models?other=bloom
+1. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model [Paper](https://arxiv.org/abs/2211.05100)
+2. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism [Paper](https://arxiv.org/abs/1909.08053)
+3. 8-bit Optimizers via Block-wise Quantization [Paper](https://arxiv.org/abs/2110.02861)
+4. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation [Paper](https://arxiv.org/abs/2108.12409)
+5. [Paper](https://huggingface.co/models?other=doi:10.57967/hf/0003)
+6. 217 Other Models optimizing use of bloom via specialization: [Paper](https://huggingface.co/models?other=bloom)
 
 # Datasets
-1. Universal Dependencies
-2. WMT 2014
-3. The Pile
-4. HumanEval
-5. FLORES-101
-6. CrowS-Pairs
-7. WikiLingua
-8. MTEB
-9. xP3
-10. DiaBLa
+1. [Universal Dependencies](https://paperswithcode.com/dataset/universal-dependencies)
+2. [WMT 2014](https://paperswithcode.com/dataset/wmt-2014)
+3. [The Pile](https://paperswithcode.com/dataset/the-pile)
+4. [HumanEval](https://paperswithcode.com/dataset/humaneval)
+5. [FLORES-101](https://paperswithcode.com/dataset/flores-101)
+6. [CrowS-Pairs](https://paperswithcode.com/dataset/crows-pairs)
+7. [WikiLingua](https://paperswithcode.com/dataset/wikilingua)
+8. [MTEB](https://paperswithcode.com/dataset/mteb)
+9. [xP3](https://paperswithcode.com/dataset/xp3)
+10. [DiaBLa](https://paperswithcode.com/dataset/diabla)
 
 # Deep RL ML Strategy
-
 1. Language Model Preparation, Human Augmented with Supervised Fine Tuning
 2. Reward Model Training with Prompts Dataset Multi-Model Generate Data to Rank
 3. Fine Tuning with Reinforcement Reward and Distance Distribution Regret Score
 4. Proximal Policy Optimization Fine Tuning
 
 # Variations - Preference Model Pretraining
-
 1. Use Ranking Datasets Sentiment - Thumbs Up/Down, Distribution
 2. Online Version Getting Feedback
 3. OpenAI - InstructGPT - Humans generate LM Training Text
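The "# Deep RL ML Strategy" list inside the Markdown above names a four-stage RLHF pipeline (supervised fine-tuning, reward-model training on ranked generations, reward-driven fine-tuning with a distance/regret penalty, then PPO). The snippet below is only a toy sketch of that control flow; every function is a hypothetical stub invented for illustration, not a call into any real training library.

# Toy walk-through of the four stages listed under "# Deep RL ML Strategy".
# All functions are illustrative stubs; a real pipeline would use an SFT/RLHF stack.

def supervised_fine_tune(base_lm, demonstrations):
    # Stage 1: prepare the language model with human-augmented supervised fine-tuning.
    return f"{base_lm}+sft({len(demonstrations)} demos)"

def train_reward_model(policy, prompts):
    # Stage 2: generate several candidates per prompt, collect human rankings,
    # and fit a reward model on the ranked pairs.
    ranked = {p: ["better candidate", "worse candidate"] for p in prompts}
    return f"reward_model({len(ranked)} prompts)"

def rl_fine_tune(policy, reward_model):
    # Stage 3: optimize the policy against the learned reward while penalizing
    # drift from the original model (the distance/regret term in the list).
    return f"{policy}+rl({reward_model})"

def ppo_update(policy):
    # Stage 4: apply Proximal Policy Optimization updates to finish fine-tuning.
    return f"{policy}+ppo"

policy = supervised_fine_tune("pretrained-lm", ["demo 1", "demo 2"])
reward = train_reward_model(policy, ["prompt 1", "prompt 2"])
policy = rl_fine_tune(policy, reward)
policy = ppo_update(policy)
print(policy)  # shows only the ordering of the stages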
|