Spaces: bardsai/performance-llm-board
piotr-szleg-bards-ai committed on Mar 8, 2024

Commit 446174f

1 Parent(s): 6d40f49

2024-03-08 15:33:44 Publish script update

Files changed (2)

app.py CHANGED

@@ -234,8 +234,6 @@ Note that pause and resume time cost was not included in the "Cost Per Token" co
             """
         )
         general_plots[general_plots.plot_name == "execution_costs"].apply(display_filtered_plot, axis=1)
-    with gr.Tab("Summary metrics"):
-        summary_metrics_plots.apply(display_filtered_plot, axis=1)
     with gr.Tab("Context length and parameters count"):
         general_plots[general_plots.plot_name != "execution_costs"].apply(display_filtered_plot, axis=1)
         gr.Markdown(
@@ -247,7 +245,9 @@ A lot of models had to be omitted due to their developers not disclosing their p
 Mainly OpenAI's GPT models and Google's Palm 2.
 """
         )
-    with gr.Tab("Combined plots"):
+    with gr.Tab("Summary quality metrics"):
+        summary_metrics_plots.apply(display_filtered_plot, axis=1)
+    with gr.Tab("Comprehensive models comparison"):
         with gr.Row():
             choices = combined_plots.header
             choices = choices[choices.str.contains("for model")]
@@ -275,6 +275,8 @@ Mainly OpenAI's GPT models and Google's Palm 2.
 Radial plots are used to compare the most important aspects of each model researched on this board using single images.
 All values are normalized and scaled into 0.25 to 1 range, 0 is left for unknown values.
+Some metrics were reversed in order to make the plots more readable, for example "Fast execution" is `1 - execution_time` scaled to 0-1 range and moved 0.25 units up as mentioned above.
 To compare the parameters more thoroughly use the filtering box on top of this page and inspect individual tabs.
 """)
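
Note on the radial plot scaling described in the Markdown text above: the code that computes it is not part of this diff, so the following is only a minimal sketch of one possible reading. It assumes min-max normalization per metric, an optional reversal (`1 - x`) for metrics where lower is better, and a linear mapping of the result into the 0.25 to 1 range, with 0 reserved for unknown values. The function name `radial_scale` and its parameters are hypothetical and do not come from the repository.

```python
import math
from typing import List, Optional, Sequence


def radial_scale(
    values: Sequence[Optional[float]],
    reverse: bool = False,
    floor: float = 0.25,
) -> List[float]:
    """Scale one metric for a radial plot (hypothetical helper, not from app.py).

    Known values are min-max normalized to [0, 1], optionally reversed,
    then mapped linearly into [floor, 1]. Unknown values (None or NaN) become 0.
    """
    known = [v for v in values if v is not None and not math.isnan(v)]
    if not known:
        return [0.0 for _ in values]

    lo, hi = min(known), max(known)
    span = hi - lo

    scaled = []
    for v in values:
        if v is None or math.isnan(v):
            scaled.append(0.0)  # 0 is left for unknown values
            continue
        if span == 0:
            unit = 1.0  # all known values are equal; this choice is arbitrary
        else:
            unit = (v - lo) / span
            if reverse:
                unit = 1.0 - unit  # e.g. "Fast execution" from execution_time
        scaled.append(floor + (1.0 - floor) * unit)
    return scaled


# Assumed example data: execution times in seconds, one model with no measurement.
execution_time = [2.0, 4.0, None, 10.0]
fast_execution = radial_scale(execution_time, reverse=True)
print(fast_execution)  # [1.0, 0.8125, 0.0, 0.25]
```

Under this reading the fastest model gets 1, the slowest gets 0.25, and the model without a measurement stays at 0, so a missing value cannot be mistaken for a poor but known result.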

data/summary_metrics_plots.csv CHANGED

The diff for this file is too large to render. See raw diff