import gradio as gr

# Load the hosted BLOOM model from the Hugging Face Hub as a callable interface.
api = gr.Interface.load("models/bigscience/bloom")


def complete_with_bloom(text):
    # Send only the last 100 characters to the model as the prompt and
    # append its completion to the untouched earlier text.
    return text[:-100] + api(text[-100:])

with gr.Blocks() as demo:
    with gr.Row():
        textbox = gr.Textbox(placeholder="Type here and press enter...", lines=14)
        with gr.Column():
            btn = gr.Button("Generate")

    # Clicking the button routes the textbox through BLOOM and writes the
    # completed text back into the same textbox.
    btn.click(complete_with_bloom, textbox, textbox)

    with gr.Row():
        gr.Markdown("""

# Outline of Exciting AI Developments! 🤖💻🔬

Here is an outline of some of the most exciting recent developments in AI:

## Language Models 🗣️

🚀 BLOOM, BigScience's 176-billion-parameter open-access multilingual model, sets a new record as the largest open language model released to date! 🌸

### Comparison of Large Language Models

| Model Name                      | Model Size (in Parameters) |
| ------------------------------- | -------------------------- |
| BigScience-tr11-176B (BLOOM)    | 176 billion                |
| GPT-3                           | 175 billion                |
| OpenAI's DALL-E (text-to-image) | 12 billion                 |
| NVIDIA's Megatron-LM            | 8.3 billion                |
| XLNet (Large)                   | 340 million                |
| Transformer-XL (Large)          | 257 million                |
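
Parameter counts like these can be checked directly from a model's weights. Below is a minimal sketch, assuming the Hugging Face `transformers` library and using the small `gpt2` checkpoint as a stand-in (the multi-billion-parameter checkpoints above are impractical to download just for counting):

```python
from transformers import AutoModel

# Load a small model; the same counting logic applies to any checkpoint.
model = AutoModel.from_pretrained("gpt2")

# Total parameter count: sum the element counts of every weight tensor.
n_params = sum(p.numel() for p in model.parameters())
print(f"gpt2: {n_params / 1e6:.0f}M parameters")  # roughly 124M
```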

## ChatGPT Datasets 📚

- WebText
- Common Crawl
- BooksCorpus (the Toronto Books Corpus)
- English Wikipedia
- OpenWebText
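
Open re-creations of several of these corpora are browsable on the Hugging Face Hub. A minimal sketch, assuming the `datasets` library (the dataset ID is illustrative and may change):

```python
from datasets import load_dataset

# OpenWebText is an open re-creation of OpenAI's WebText corpus.
# Streaming avoids downloading the full ~40 GB dump up front.
openwebtext = load_dataset("openwebtext", split="train", streaming=True)

# Peek at the first document.
first_doc = next(iter(openwebtext))
print(first_doc["text"][:200])
```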

## BigScience Model 🚀

- 📄 Papers:
  1. BLOOM: A 176B-Parameter Open-Access Multilingual Language Model [Paper](https://arxiv.org/abs/2211.05100)
  2. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism [Paper](https://arxiv.org/abs/1909.08053)
  3. 8-bit Optimizers via Block-wise Quantization [Paper](https://arxiv.org/abs/2110.02861)
  4. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation (ALiBi; sketched below) [Paper](https://arxiv.org/abs/2108.12409)
  5. [Other papers related to BigScience](https://huggingface.co/models?other=doi:10.57967/hf/0003)
  6. [217 other models optimized for use with BLOOM](https://huggingface.co/models?other=bloom)
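
Paper 4 introduces ALiBi, which drops positional embeddings and instead adds a per-head linear penalty to attention scores, proportional to query-key distance. A minimal sketch of the bias matrix, assuming PyTorch and a head count that is a power of two (the case covered by the paper's slope formula):

```python
import torch

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    # Per-head ALiBi biases to add to attention scores before softmax.
    # Slopes form a geometric sequence: 2^(-8/n), 2^(-16/n), ... for n heads.
    start = 2.0 ** (-8.0 / n_heads)
    slopes = torch.tensor([start ** (i + 1) for i in range(n_heads)])
    # Distance j - i is 0 on the diagonal and negative for past keys, so
    # attention to distant tokens is penalized linearly per head.
    pos = torch.arange(seq_len)
    distance = pos[None, :] - pos[:, None]           # (seq_len, seq_len)
    return slopes[:, None, None] * distance[None]    # (n_heads, seq_len, seq_len)

bias = alibi_bias(n_heads=8, seq_len=5)
print(bias[0])  # entries for j > i are hidden later by the causal mask
```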

- 📊 Datasets:
  1. [Universal Dependencies](https://paperswithcode.com/dataset/universal-dependencies)
  2. [WMT 2014](https://paperswithcode.com/dataset/wmt-2014)
  3. [The Pile](https://paperswithcode.com/dataset/the-pile)
  4. [HumanEval](https://paperswithcode.com/dataset/humaneval)
  5. [FLORES-101](https://paperswithcode.com/dataset/flores-101)
  6. [CrowS-Pairs](https://paperswithcode.com/dataset/crows-pairs)
  7. [WikiLingua](https://paperswithcode.com/dataset/wikilingua)
  8. [MTEB](https://paperswithcode.com/dataset/mteb)
  9. [xP3](https://paperswithcode.com/dataset/xp3)
  10. [DiaBLa](https://paperswithcode.com/dataset/diabla)

# Deep RL ML Strategy 🧠

The AI strategies are:
- Language model preparation: supervised fine-tuning on human-written demonstration data 🤗
- Reward model training: generate multiple completions per prompt and have humans rank them 📊
- Fine-tuning with a reinforcement reward, penalized for drifting from the original model's output distribution (see the sketch after this list) 🎯
- Proximal Policy Optimization (PPO) fine-tuning 🤖
- Variations: preference model pretraining 🤖
- Ranking datasets built from sentiment signals such as thumbs up/down 👍
- Online versions that gather feedback continuously 💬
- OpenAI's InstructGPT: humans write demonstration text for LM training 📝
- DeepMind's Sparrow and GopherCite, trained with advantage actor-critic 🐦
- Reward models trained on human preference feedback 👍
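
The reinforcement step above optimizes a simple quantity: the reward model's score minus a KL-style penalty for drifting away from the reference model. Below is a minimal sketch of that reward computation; the reward model and log-probability functions are toy, hypothetical stand-ins, not real models:

```python
def reward_model_score(prompt: str, completion: str) -> float:
    # Toy stand-in: a real reward model is a neural net trained on human rankings.
    return min(len(completion) / 50.0, 2.0)

def log_prob(model: str, prompt: str, completion: str) -> float:
    # Toy stand-in for the per-sequence log-probability under each model.
    return -0.50 * len(completion) if model == "policy" else -0.55 * len(completion)

def rlhf_reward(prompt: str, completion: str, beta: float = 0.02) -> float:
    # r = RM(x, y) - beta * (log pi_policy(y|x) - log pi_reference(y|x))
    rm = reward_model_score(prompt, completion)
    kl_term = log_prob("policy", prompt, completion) - log_prob("reference", prompt, completion)
    return rm - beta * kl_term

print(rlhf_reward("Explain RLHF.", "It fine-tunes a model from human preference rankings."))
```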

For more information on specific techniques and implementations, check out the following resources:

- OpenAI's paper on [GPT-3](https://arxiv.org/abs/2005.14165), which details their approach to training large language models
- The [Soft Actor-Critic (SAC)](https://arxiv.org/abs/1801.01290) paper from UC Berkeley, which describes an off-policy actor-critic algorithm
- OpenAI's paper on [Reward Learning](https://arxiv.org/abs/1810.06580), which explains their approach to training reward models
- OpenAI's blog post on [GPT-3's fine-tuning process](https://openai.com/blog/fine-tuning-gpt-3/)
""")

demo.launch()