Update app.py
app.py
CHANGED
@@ -22,84 +22,64 @@ with gr.Blocks() as demo:
     with gr.Row():
         gr.Markdown("""
 
-#
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-3.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-1. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model [Paper](https://arxiv.org/abs/2211.05100)
-2. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism [Paper](https://arxiv.org/abs/1909.08053)
-3. 8-bit Optimizers via Block-wise Quantization [Paper](https://arxiv.org/abs/2110.02861)
-4. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation [Paper](https://arxiv.org/abs/2108.12409)
-5. [Paper](https://huggingface.co/models?other=doi:10.57967/hf/0003)
-6. 217 Other Models optimizing use of bloom via specialization: [Paper](https://huggingface.co/models?other=bloom)
-
-# Datasets
-1. [Universal Dependencies](https://paperswithcode.com/dataset/universal-dependencies)
-2. [WMT 2014](https://paperswithcode.com/dataset/wmt-2014)
-3. [The Pile](https://paperswithcode.com/dataset/the-pile)
-4. [HumanEval](https://paperswithcode.com/dataset/humaneval)
-5. [FLORES-101](https://paperswithcode.com/dataset/flores-101)
-6. [CrowS-Pairs](https://paperswithcode.com/dataset/crows-pairs)
-7. [WikiLingua](https://paperswithcode.com/dataset/wikilingua)
-8. [MTEB](https://paperswithcode.com/dataset/mteb)
-9. [xP3](https://paperswithcode.com/dataset/xp3)
-10. [DiaBLa](https://paperswithcode.com/dataset/diabla)
-
-# Deep RL ML Strategy
+# Outline of Exciting AI Developments! 🤖💻🔬
+
+Here is an outline of some of the most exciting recent developments in AI:
+
+## Language Models 🗣️
+
+🚀 Bloom sets new record for most performant and efficient AI model in science! 🌸
+
+### Comparison of Large Language Models
+
+| Model Name | Model Size (in Parameters) |
+| ----------------- | -------------------------- |
+| BigScience-tr11-176B | 176 billion |
+| GPT-3 | 175 billion |
+| OpenAI's DALL-E 2.0 | 500 million |
+| NVIDIA's Megatron | 8.3 billion |
+| Transformer-XL | 250 million |
+| XLNet | 210 million |
+
+## ChatGPT Datasets 📚
+
+- WebText
+- Common Crawl
+- BooksCorpus
+- English Wikipedia
+- Toronto Books Corpus
+- OpenWebText
+
+## Big Science Model 🚀
+
+- 📄 Papers:
+1. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model [Paper](https://arxiv.org/abs/2211.05100)
+2. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism [Paper](https://arxiv.org/abs/1909.08053)
+3. 8-bit Optimizers via Block-wise Quantization [Paper](https://arxiv.org/abs/2110.02861)
+4. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation [Paper](https://arxiv.org/abs/2108.12409)
+5. [Other papers related to Big Science](https://huggingface.co/models?other=doi:10.57967/hf/0003)
+6. [217 other models optimized for use with Bloom](https://huggingface.co/models?other=bloom)
+
+- 📊 Datasets:
+1. [Universal Dependencies](https://paperswithcode.com/dataset/universal-dependencies)
+2. [WMT 2014](https://paperswithcode.com/dataset/wmt-2014)
+3. [The Pile](https://paperswithcode.com/dataset/the-pile)
+4. [HumanEval](https://paperswithcode.com/dataset/humaneval)
+5. [FLORES-101](https://paperswithcode.com/dataset/flores-101)
+6. [CrowS-Pairs](https://paperswithcode.com/dataset/crows-pairs)
+7. [WikiLingua](https://paperswithcode.com/dataset/wikilingua)
+8. [MTEB](https://paperswithcode.com/dataset/mteb)
+9. [xP3](https://paperswithcode.com/dataset/xp3)
+10. [DiaBLa](https://paperswithcode.com/dataset/diabla)
+
+## Deep RL ML Strategy 🧠
+
 1. Language Model Preparation, Human Augmented with Supervised Fine Tuning
 2. Reward Model Training with Prompts Dataset Multi-Model Generate Data to Rank
 3. Fine Tuning with Reinforcement Reward and Distance Distribution Regret Score
 4. Proximal Policy Optimization Fine Tuning
 
-
+## Variations - Preference Model Pretraining 🤗
 1. Use Ranking Datasets Sentiment - Thumbs Up/Down, Distribution
 2. Online Version Getting Feedback
 3. OpenAI - InstructGPT - Humans generate LM Training Text