awacke1 committed on
Commit
d6e6796
·
1 Parent(s): 3d89f71

Update app.py

Files changed (1)
  1. app.py +53 -73
app.py CHANGED
@@ -22,84 +22,64 @@ with gr.Blocks() as demo:
  with gr.Row():
  gr.Markdown("""
 
- # Big Science and Huggingface create 176 Billion Parameter Transformer Large Language Model
- 
- ## Bloom Is Setting A New Record for Most Performant and Efficient AI Model for Science Ever!
- 
- Bloom stands for:
- B: Big Science
- L: Large Language Model
- O: Open Science
- O: Open Access
- M: Multi Lingual Language Model
- 
- 1. [Video Playlist](https://www.youtube.com/playlist?list=PLHgX2IExbFouqnsIqziThlPCX_miiDq14)
- 2. Summary of Important Models and Sizes:
- 
- # Model Sizes to Date
- 
- Model Name | Model Size (in Parameters)
- ----------------|---------------------------------
- BigScience-tr11-176B|176 billion
- GPT-3|175 billion
- OpenAI's DALL-E 2.0|500 million
- NVIDIA's Megatron|8.3 billion
- Google's BERT|340 million
- GPT-2|1.5 billion
- OpenAI's GPT-1|117 million
- ELMo|90 million
- ULMFiT|100 million
- Transformer-XL|250 million
- XLNet|210 million
- RoBERTa|125 million
- ALBERT|12 million
- DistilBERT|66 million
- 
- 3. Background Information on ChatGPT, Bloom from BigScience on HuggingFace Platform, and RLHF DeepRL and One to Few Shot Learning and Generators:
- 
- # ChatGPT Datasets:
- 1. WebText
- 2. Common Crawl
- 3. BooksCorpus
- 4. English Wikipedia
- 5. Toronto Books Corpus
- 6. OpenWebText
- 
- # Comparison to BigScience Model - Big Science - How to get started
- 
- Big Science is a 176B parameter ML model trained on a set of datasets for Natural Language processing, and many other tasks that are not yet explored..
- Below is the set of the papers, models, links, and datasets around big science which promises to be the best,
- most recent large model of its kind benefitting all science pursuits.
- 
- # [Model](https://huggingface.co/bigscience/bloom)
- 
- # Papers:
- 1. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model [Paper](https://arxiv.org/abs/2211.05100)
- 2. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism [Paper](https://arxiv.org/abs/1909.08053)
- 3. 8-bit Optimizers via Block-wise Quantization [Paper](https://arxiv.org/abs/2110.02861)
- 4. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation [Paper](https://arxiv.org/abs/2108.12409)
- 5. [Paper](https://huggingface.co/models?other=doi:10.57967/hf/0003)
- 6. 217 Other Models optimizing use of bloom via specialization: [Paper](https://huggingface.co/models?other=bloom)
- 
- # Datasets
- 1. [Universal Dependencies](https://paperswithcode.com/dataset/universal-dependencies)
- 2. [WMT 2014](https://paperswithcode.com/dataset/wmt-2014)
- 3. [The Pile](https://paperswithcode.com/dataset/the-pile)
- 4. [HumanEval](https://paperswithcode.com/dataset/humaneval)
- 5. [FLORES-101](https://paperswithcode.com/dataset/flores-101)
- 6. [CrowS-Pairs](https://paperswithcode.com/dataset/crows-pairs)
- 7. [WikiLingua](https://paperswithcode.com/dataset/wikilingua)
- 8. [MTEB](https://paperswithcode.com/dataset/mteb)
- 9. [xP3](https://paperswithcode.com/dataset/xp3)
- 10. [DiaBLa](https://paperswithcode.com/dataset/diabla)
- 
- # Deep RL ML Strategy
+ # Outline of Exciting AI Developments! 🤖💻🔬
+ 
+ Here is an outline of some of the most exciting recent developments in AI:
+ 
+ ## Language Models 🗣️
+ 
+ 🏆 Bloom sets new record for most performant and efficient AI model in science! 🌸
+ 
+ ### Comparison of Large Language Models
+ 
+ | Model Name | Model Size (in Parameters) |
+ | ----------------- | -------------------------- |
+ | BigScience-tr11-176B | 176 billion |
+ | GPT-3 | 175 billion |
+ | OpenAI's DALL-E 2.0 | 500 million |
+ | NVIDIA's Megatron | 8.3 billion |
+ | Transformer-XL | 250 million |
+ | XLNet | 210 million |
+ 
+ ## ChatGPT Datasets 📚
+ 
+ - WebText
+ - Common Crawl
+ - BooksCorpus
+ - English Wikipedia
+ - Toronto Books Corpus
+ - OpenWebText
+ 
+ ## Big Science Model 🚀
+ 
+ - 📜 Papers:
+ 1. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model [Paper](https://arxiv.org/abs/2211.05100)
+ 2. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism [Paper](https://arxiv.org/abs/1909.08053)
+ 3. 8-bit Optimizers via Block-wise Quantization [Paper](https://arxiv.org/abs/2110.02861)
+ 4. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation [Paper](https://arxiv.org/abs/2108.12409)
+ 5. [Other papers related to Big Science](https://huggingface.co/models?other=doi:10.57967/hf/0003)
+ 6. [217 other models optimized for use with Bloom](https://huggingface.co/models?other=bloom)
+ 
+ - 📚 Datasets:
+ 1. [Universal Dependencies](https://paperswithcode.com/dataset/universal-dependencies)
+ 2. [WMT 2014](https://paperswithcode.com/dataset/wmt-2014)
+ 3. [The Pile](https://paperswithcode.com/dataset/the-pile)
+ 4. [HumanEval](https://paperswithcode.com/dataset/humaneval)
+ 5. [FLORES-101](https://paperswithcode.com/dataset/flores-101)
+ 6. [CrowS-Pairs](https://paperswithcode.com/dataset/crows-pairs)
+ 7. [WikiLingua](https://paperswithcode.com/dataset/wikilingua)
+ 8. [MTEB](https://paperswithcode.com/dataset/mteb)
+ 9. [xP3](https://paperswithcode.com/dataset/xp3)
+ 10. [DiaBLa](https://paperswithcode.com/dataset/diabla)
+ 
+ ## Deep RL ML Strategy 🧠
+ 
  1. Language Model Preparation, Human Augmented with Supervised Fine Tuning
  2. Reward Model Training with Prompts Dataset Multi-Model Generate Data to Rank
  3. Fine Tuning with Reinforcement Reward and Distance Distribution Regret Score
  4. Proximal Policy Optimization Fine Tuning
 
- # Variations - Preference Model Pretraining
+ ## Variations - Preference Model Pretraining 🤔
  1. Use Ranking Datasets Sentiment - Thumbs Up/Down, Distribution
  2. Online Version Getting Feedback
  3. OpenAI - InstructGPT - Humans generate LM Training Text
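
For context, the layout touched by this commit is Gradio's Blocks API: a `gr.Markdown` component inside a `gr.Row` inside a `gr.Blocks` context. Below is a minimal, self-contained sketch of that pattern with the Markdown body abbreviated; it is an illustration of the structure shown in the diff, not the full `app.py`.

```python
# Minimal sketch of the Gradio layout used in app.py: a Markdown panel
# inside a Row inside a Blocks context (Markdown body abbreviated).
import gradio as gr

with gr.Blocks() as demo:
    with gr.Row():
        gr.Markdown("""
        # Outline of Exciting AI Developments! 🤖💻🔬

        ## Language Models 🗣️
        🏆 Bloom sets new record for most performant and efficient AI model in science! 🌸
        """)

if __name__ == "__main__":
    demo.launch()  # serves the page locally, typically at http://127.0.0.1:7860
```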
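The new section links the BLOOM model card (https://huggingface.co/bigscience/bloom). As a hedged illustration only, and not part of this commit, loading a BLOOM checkpoint with the `transformers` library follows the usual `AutoTokenizer`/`AutoModelForCausalLM` pattern; the smaller `bigscience/bloom-560m` checkpoint stands in below because the full 176B model needs far more memory than a typical machine provides.

```python
# Hedged sketch (not from this commit): generating text with a BLOOM checkpoint.
# "bigscience/bloom-560m" stands in for the full "bigscience/bloom" (176B),
# which requires hundreds of GB of memory to load.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bigscience/bloom-560m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "BLOOM is an open-access multilingual language model that"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```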