PROBE

mgyigit committed 19 days ago

Commit 66b8efa (verified)

Parent: da4dc20

Update src/about.py

Files changed (1)

src/about.py CHANGED

@@ -39,9 +39,9 @@ LLM_BENCHMARKS_TEXT = f"""
       - This benchmark evaluates how well protein representation models can infer functional similarities between proteins. Ground truth functional similarities are derived from Gene Ontology (GO) annotations.
       - Different distance metrics (Cosine, Manhattan, Euclidean) are used to compute protein vector similarities, which are then correlated with the functional similarities.
       - The benchmark uses three different datasets:
-        • Sparse: A sparse uniform dataset with broader protein coverage
-        • 200: A set of well-annotated 200 proteins
-        • 500: A set of well-annotated 500 proteins
+        • Sparse Uniform: A sparse uniform dataset with broader protein coverage
+        • Well Annotated 200: A set of well-annotated 200 proteins
+        • Well Annotated 500: A set of well-annotated 500 proteins
       - Metrics (sim_ prefix):
         • sim_sparse_MF_correlation/sim_200_MF_correlation/sim_500_MF_correlation: Correlation between protein embeddings and Molecular Function (MF) similarity scores
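
Note on the text touched by this diff: it says that embedding similarities computed with Cosine, Manhattan, and Euclidean metrics are correlated with GO-derived functional similarities. The sketch below illustrates that idea only and is not the benchmark's implementation. The function names, the toy embeddings, the toy GO similarity values, the 1 / (1 + distance) conversion from distance to similarity, and the choice of Pearson correlation are all assumptions made for illustration; the source does not state which correlation coefficient or distance-to-similarity mapping is used.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def manhattan_similarity(u, v):
    """Manhattan (L1) distance mapped to a similarity (assumed mapping)."""
    distance = sum(abs(a - b) for a, b in zip(u, v))
    return 1.0 / (1.0 + distance)


def euclidean_similarity(u, v):
    """Euclidean (L2) distance mapped to a similarity (assumed mapping)."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return 1.0 / (1.0 + distance)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def embedding_go_correlation(embeddings, go_similarity, similarity_fn):
    """Correlate embedding similarities with reference similarities.

    embeddings: dict mapping protein ID -> list of floats
    go_similarity: dict mapping (protein_a, protein_b) -> reference score
    """
    pairs = sorted(go_similarity)
    predicted = [similarity_fn(embeddings[a], embeddings[b]) for a, b in pairs]
    reference = [go_similarity[pair] for pair in pairs]
    return pearson(predicted, reference)


# Toy data, invented for illustration; not taken from the benchmark datasets.
toy_embeddings = {
    "P1": [1.0, 0.0],
    "P2": [0.9, 0.1],
    "P3": [0.0, 1.0],
}
toy_go_similarity = {
    ("P1", "P2"): 0.8,
    ("P1", "P3"): 0.1,
    ("P2", "P3"): 0.2,
}

if __name__ == "__main__":
    for name, fn in [
        ("cosine", cosine_similarity),
        ("manhattan", manhattan_similarity),
        ("euclidean", euclidean_similarity),
    ]:
        score = embedding_go_correlation(toy_embeddings, toy_go_similarity, fn)
        print(f"{name}: {score:.4f}")
```

A rank-based coefficient such as Spearman could be substituted for `pearson` without changing the rest of the sketch.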