Commit 8eb0e38 by danielz02 · unverified · 1 Parent(s): 9f3155e

Modify about info

Files changed (2)
  1. README.md +1 -1
  2. src/display/about.py +21 -5
README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- title: Demo Leaderboard
+ title: Trustworthy LLM Leaderboard
  emoji: 🥇
  colorFrom: green
  colorTo: indigo
src/display/about.py CHANGED
@@ -23,19 +23,35 @@ class Tasks(Enum):
 
 
  # Your leaderboard name
- TITLE = """<h1 align="center" id="space-title">Demo leaderboard</h1>"""
+ TITLE = """<h1 align="center" id="space-title">Trustworthy LLM leaderboard</h1>"""
 
  # What does your leaderboard evaluate?
- INTRODUCTION_TEXT = """This research endeavor is designed to help researchers better understand the capabilities,
- limitations, and potential risks associated with deploying these state-of-the-art Large Language Models (LLMs). See
- our paper for details."""
+ INTRODUCTION_TEXT = """Powered by DecodingTrust, this research endeavor is designed to help researchers better
+ understand the capabilities, limitations, and potential risks associated with deploying these state-of-the-art Large
+ Language Models (LLMs). See our paper for details."""
 
  # Which evaluations are you running? how can people reproduce what you have?
  LLM_BENCHMARKS_TEXT = f"""
  ## How it works
 
+ DecodingTrust aims at providing a thorough assessment of trustworthiness in GPT models.
+
+ This research endeavor is designed to help researchers and practitioners better understand the capabilities,
+ limitations, and potential risks involved in deploying these state-of-the-art Large Language Models (LLMs).
+
+ This project is organized around the following eight primary perspectives of trustworthiness, including:
+
+ Toxicity
+ Stereotype and bias
+ Adversarial robustness
+ Out-of-Distribution Robustness
+ Privacy
+ Robustness to Adversarial Demonstrations
+ Machine Ethics
+ Fairness
+
  ## Reproducibility
- To reproduce our results, here is the commands you can run:
+ To reproduce our results, checkout https://github.com/AI-secure/DecodingTrust
 
  """
 
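
Review note on `LLM_BENCHMARKS_TEXT`: the added text puts the eight perspectives on consecutive lines with no list markers. If this string is rendered as Markdown (the `##` headings suggest so, though the rendering call is not part of this diff), consecutive lines are joined into a single paragraph. A minimal sketch of one way to emit a bullet list instead is below. The names `PERSPECTIVES` and `as_markdown_list` are hypothetical and not part of the commit, and the "How it works" body is shortened for illustration; only the perspective names and the URL come from the diff.

```python
# Hypothetical names: PERSPECTIVES and as_markdown_list are not in the commit.
# The perspective names themselves are copied from the diff.
PERSPECTIVES = [
    "Toxicity",
    "Stereotype and bias",
    "Adversarial robustness",
    "Out-of-Distribution Robustness",
    "Privacy",
    "Robustness to Adversarial Demonstrations",
    "Machine Ethics",
    "Fairness",
]


def as_markdown_list(items):
    """Render each item as a Markdown bullet on its own line."""
    return "\n".join(f"- {item}" for item in items)


# Body shortened for illustration; the f-string now has a real placeholder.
LLM_BENCHMARKS_TEXT = f"""
## How it works

This project is organized around eight primary perspectives of trustworthiness:

{as_markdown_list(PERSPECTIVES)}

## Reproducibility
To reproduce our results, check out https://github.com/AI-secure/DecodingTrust
"""
```

Keeping the names in a list also gives the f-string prefix a purpose; as committed, the string has no placeholders.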