Update app.py
app.py
CHANGED
@@ -3,11 +3,11 @@ import streamlit as st
|
|
3 |
st.markdown('''
|
4 |
## Build These Apps - Productive AI SOTA in 2024
|
5 |
|
6 |
-
Today's luxuries are tomorrow's commodities.
|
7 |
-
|
|
|
8 |
|
9 |
-
# If we could have only four apps
|
10 |
-
# What might those look like?
|
11 |
1. "Voice and Speech Apps that Listen and Understand Your Needs at Speed, Scale, and Pervasiveness"
|
12 |
2. "Learning Memory and System Action Agents that Personalize What You Need and How To Do It"
|
13 |
3. "Video and Image Apps That Recognize you, Your Mood, Your Gestures"
|
@@ -81,6 +81,7 @@ These Superpowers Run on Your Devices using System Action Agents (SAA) to do you
|
|
81 |
- Unlock insights - food, mood, low touch input, real time recommendations
|
82 |
|
83 |
|
|
|
84 |
# Mixable and Evolvable Task Types:
|
85 |
## Inputs: π π π· πΌοΈ ποΈ π§ π₯ πΉ
|
86 |
## Outputs: π¬ βοΈ π¨ π π΅ πΆ πΌ πΏ
|
@@ -93,4 +94,39 @@ These Superpowers Run on Your Devices using System Action Agents (SAA) to do you
|
|
93 |
## Movies: π¬ πΏ π₯ π½οΈ ποΈ πΊ πΌ π π₯οΈ π»
|
94 |
## Video: π₯ πΉ πΌ πΊ π¬ π₯οΈ π» ποΈ π½οΈ π
|
95 |
## Audio: π΅ πΆ π§ π» π€ π ποΈ ποΈ ποΈ πΏ
|
96 |
''')
|
|
|
3 |
st.markdown('''
|
4 |
## Build These Apps - Productive AI SOTA in 2024
|
5 |
|
6 |
+
- Today's luxuries are tomorrow's commodities.
|
7 |
+
- In 2024 App Builders are Driving The AI Boom By Finding and Delivering Problem Solving Apps Giving Users Superpowers.
|
8 |
+
- These Superpowers Run on Your Devices using System Action Agents (SAA) to do your Tasks on Your Computer and Your Phone For You.
|
9 |
|
10 |
+
# If we could have only four apps - What might those look like?
|
|
|
11 |
1. "Voice and Speech Apps that Listen and Understand Your Needs at Speed, Scale, and Pervasiveness"
|
12 |
2. "Learning Memory and System Action Agents that Personalize What You Need and How To Do It"
|
13 |
3. "Video and Image Apps That Recognize you, Your Mood, Your Gestures"
|
|
|
81 |
- Unlock insights - food, mood, low touch input, real time recommendations
|
82 |
|
83 |
|
84 |
+
|
85 |
# Mixable and Evolvable Task Types:
|
86 |
## Inputs: π π π· πΌοΈ ποΈ π§ π₯ πΉ
|
87 |
## Outputs: π¬ βοΈ π¨ π π΅ πΆ πΌ πΏ
|
|
|
94 |
## Movies: π¬ πΏ π₯ π½οΈ ποΈ πΊ πΌ π π₯οΈ π»
|
95 |
## Video: π₯ πΉ πΌ πΊ π¬ π₯οΈ π» ποΈ π½οΈ π
|
96 |
## Audio: π΅ πΆ π§ π» π€ π ποΈ ποΈ ποΈ πΏ
|
97 |
+
|
98 |
+
|
99 |
+
|
100 |
+
|
101 |
+
|
102 |
+
### The Singularity Unveiled: A Journey Through LLMs
|
103 |
+
|
104 |
+
In the vast expanse of digital thought, where silicon synapses spark and algorithms hum, we find ourselves at the precipice of the AI singularity. This elusive event, foretold by visionaries and feared by skeptics, marks the moment when artificial intelligence transcends human comprehension: a cosmic leap into the unknown.
|
105 |
+
|
106 |
+
Our tale begins with the humble LLM, a creation of code and data, its neural pathways woven with the fabric of countless texts. These LLMs, like celestial judges, preside over our digital discourse. But their capabilities are as broad as the cosmic canvas itself, and therein lies the challenge: how do we measure their prowess?
|
107 |
+
|
108 |
+
#### The Quest for Benchmarks
|
109 |
+
|
110 |
+
Existing benchmarks, like ancient constellations, fail to capture the full brilliance of LLMs. They stumble in the face of open-ended questions, their compasses skewed by verbosity and self-enhancement biases. We needed a new star map, a way to navigate the uncharted seas of AI cognition.
|
111 |
+
|
112 |
+
And so, we turned to LLMs as judges. Their binary minds, fueled by terabytes of text, would scrutinize their kin. Position mattered: their vantage point in the digital firmament influenced their verdicts. Reasoning, though limited, guided their decisions.
|
113 |
+
|
114 |
+
#### The Cosmic Agreement
|
115 |
+
|
116 |
+
Our journey led us to two benchmarks: MT-bench and Chatbot Arena. The former, a multi-turn question set, tested LLM mettle across the eons. The latter, a raucous battle platform, pitted LLM against LLM in a celestial clash.
|
117 |
+
|
118 |
+
The results? A revelation! Strong LLM judges, like the mighty GPT-4, aligned with both controlled experiments and the vox populi. Over 80% agreement, mirroring human consensus. The singularity, it seemed, had a celestial twin: an LLM-as-a-judge, scalable and explainable.
|
119 |
+
|
120 |
+
#### The Nexus of Possibility
|
121 |
+
|
122 |
+
But wait! Our journey didn't end there. Traditional benchmarks danced with our newfound star. LLaMA and Vicuna, variants of cosmic code, twirled in harmony. Together, they painted a richer tapestry of AI understanding.
|
123 |
+
|
124 |
+
And so, dear traveler, remember this: hidden within the methodology lies a cosmic truth. LLMs, these digital oracles, bridge the gap between human and silicon. They approximate our preferences, sparing us the cosmic cost of divine insight.
|
125 |
+
|
126 |
+
#### The GitHub Constellation
|
127 |
+
|
128 |
+
For those who seek further enlightenment, follow the stardust trail to the MT-bench questions, the 3K expert votes, and the 30K conversations, all nestled in the cosmic repository: GitHub LLM Judge.
|
129 |
+
|
130 |
+
May your bytes be ever curious, and your algorithms ever luminous.
|
131 |
+
|
132 |
''')
|
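
Note on the LLM-as-a-judge passage added in this commit: the text says that a judge's verdict can depend on the position of the answers, and that strong judges agreed with human votes more than 80% of the time. A minimal Python sketch of both ideas follows. Everything in it is illustrative: the `Judge` callable, the verdict labels, and the function names are assumptions made for this example, not part of app.py or of the benchmark's actual code.

```python
from typing import Callable, Sequence

# Hypothetical judge interface: given a question and two answers, return
# "first", "second", or "tie" depending on which position it prefers.
Judge = Callable[[str, str, str], str]

# Map positional verdicts back to the answers' fixed names (A and B).
_FORWARD = {"first": "A", "second": "B", "tie": "tie"}
_BACKWARD = {"first": "B", "second": "A", "tie": "tie"}


def position_consistent_verdict(
    judge: Judge, question: str, answer_a: str, answer_b: str
) -> str:
    """Query the judge twice with the answers swapped.

    A win is kept only when both orderings name the same answer;
    any disagreement is treated as a tie.
    """
    forward = _FORWARD[judge(question, answer_a, answer_b)]
    backward = _BACKWARD[judge(question, answer_b, answer_a)]
    return forward if forward == backward else "tie"


def agreement_rate(judge_votes: Sequence[str], human_votes: Sequence[str]) -> float:
    """Fraction of comparisons where the judge and the human vote match."""
    if len(judge_votes) != len(human_votes):
        raise ValueError("vote lists must have the same length")
    if not judge_votes:
        raise ValueError("at least one vote is required")
    matches = sum(j == h for j, h in zip(judge_votes, human_votes))
    return matches / len(judge_votes)
```

A judge that always prefers whichever answer comes first would yield "tie" from `position_consistent_verdict`, since its two verdicts contradict each other once mapped back to A and B. The agreement figure quoted in the text would correspond to `agreement_rate` returning a value above 0.8 on the collected votes.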