JieRuan committed
Commit 840ff89 · verified · 1 Parent(s): e78a002

Update src/streamlit_app.py

Files changed (1)
  1. src/streamlit_app.py +3 -3
src/streamlit_app.py CHANGED
@@ -63,7 +63,7 @@ def load_data(path):
 
 
 # one page description
-st.markdown("## Leaderboard")
+st.markdown("## 🏆 Leaderboard")
 # st.markdown("**Leaderboard:** higher scores shaded green; best models bolded.")
 
 tiers = ['F1', 'Accuracy']
@@ -142,7 +142,7 @@ pipeline_image = Image.open("src/pipeline.png")
 buffered2 = BytesIO()
 pipeline_image.save(buffered2, format="PNG")
 img_data_pipeline = base64.b64encode(buffered2.getvalue()).decode("utf-8")
-st.markdown("## Abstract")
+st.markdown("## 🧠 Abstract")
 st.write(
 """
 The paper introduces ExpertLongBench, an expert-level benchmark containing 11 tasks from 9 domains that reflect realistic expert workflows and applications.
@@ -159,7 +159,7 @@ We benchmark 11 large language models (LLMs) and analyze components in CLEAR, sh
 )
 
 
-st.markdown("## Pipeline")
+st.markdown("## 🧰 Evaluation Pipeline")
 st.markdown(
 f"""
 <div class="logo-container" style="display:flex; justify-content: center;">
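The context lines around the third hunk base64-encode the pipeline image and interpolate it into an HTML string. The diff does not show the rest of that string, so the following is only a minimal sketch of the general pattern, not the app's actual code. The function name `png_bytes_to_img_html`, the `alt` text, and the placeholder bytes are hypothetical; only the `logo-container` div and the base64 step come from the diff.

```python
import base64


def png_bytes_to_img_html(png_bytes: bytes, css_class: str = "logo-container") -> str:
    """Wrap PNG bytes in a centered <div> containing an <img> with a base64 data URI.

    Hypothetical helper: the real app builds its HTML inline with an f-string.
    """
    # Same encoding step as the diff's context lines: bytes -> base64 -> str.
    encoded = base64.b64encode(png_bytes).decode("utf-8")
    return (
        f'<div class="{css_class}" style="display:flex; justify-content: center;">'
        f'<img src="data:image/png;base64,{encoded}" alt="pipeline"/>'
        "</div>"
    )


# Placeholder bytes (the PNG file signature) stand in for the output of
# pipeline_image.save(buffered2, format="PNG"); this is not a complete image.
html = png_bytes_to_img_html(b"\x89PNG\r\n\x1a\n")
```

In a Streamlit app, a string like this would typically be passed to `st.markdown(html, unsafe_allow_html=True)`, since Streamlit escapes raw HTML by default. Whether the app does exactly that is not visible in this commit.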