added some docs to readme
README.md
***Module Card Instructions:*** *Fill out the following subsections. Feel free to take a look at existing metric cards if you'd like examples.*

## Metric Description

This metric evaluates how good a generated log(file) is, given a reference log.

The metric measures two different aspects:

1. It checks whether the predicted log contains the correct number of timestamps, whether the timestamps are monotonically increasing, and whether the timestamps are consistent in their format.
2. To measure similarity in content (without timestamps), this metric uses sacrebleu.
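The two timestamp checks above could be sketched roughly as follows. This is an illustrative sketch only, not the metric's actual implementation: the regular expression, the all-or-nothing scoring, and the function name `timestamp_score` are assumptions.

```python
import re

# Illustrative sketch only -- the real metric's timestamp regex and
# scoring rules may differ.
TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}(?: \d{2}:\d{2})?")

def timestamp_score(prediction: str, reference: str) -> float:
    """Score the timestamp quality of a predicted log against a reference."""
    pred_ts = TIMESTAMP_RE.findall(prediction)
    ref_ts = TIMESTAMP_RE.findall(reference)

    # 1. Correct amount of timestamps.
    count_ok = len(pred_ts) == len(ref_ts)
    # 2. Monotonically increasing (lexicographic order works for this format).
    monotonic_ok = all(a <= b for a, b in zip(pred_ts, pred_ts[1:]))
    # 3. Consistent format: every timestamp has the same shape here.
    consistent_ok = len({len(ts) for ts in pred_ts}) <= 1

    return 1.0 if (count_ok and monotonic_ok and consistent_ok) else 0.0
```

On the two examples shown below under "How to Use", this sketch returns 1.0 and 0.0 respectively.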

## How to Use
The metric can be used by simply passing the predicted log and the reference log as strings.

Example with timestamps that are correct in number, consistent, and monotonically increasing (timestamp score of 1.0):

```
>>> predictions = ["2024-01-12 11:23 hello, nice to meet you \n 2024-01-12 11:24 So we see each other again"]
>>> references = ["2024-02-14 This is a hello to you \n 2024-02-15 Another hello"]
>>> logmetric = evaluate.load("svenwey/logscoremetric")
>>> results = logmetric.compute(predictions=predictions,
...                             references=references)
>>> print(results["timestamp_score"])
1.0
```

Example with a timestamp missing from the prediction:

```
>>> predictions = ["hello, nice to meet you"]
>>> references = ["2024-02-14 This is a hello to you"]
>>> logmetric = evaluate.load("svenwey/logscoremetric")
>>> results = logmetric.compute(predictions=predictions,
...                             references=references)
>>> print(results["timestamp_score"])
0.0
```
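The content half of the metric (sacrebleu on the logs with timestamps removed, as described above) can be approximated like this. The stripping regex and the helper names are assumptions; the metric's internal preprocessing may differ.

```python
import re

TIMESTAMP_RE = re.compile(r"\d{4}-\d{2}-\d{2}(?: \d{2}:\d{2})?")

def strip_timestamps(log: str) -> str:
    """Remove timestamps so that only the log content is compared."""
    return re.sub(r"\s+", " ", TIMESTAMP_RE.sub("", log)).strip()

def content_score(predictions, references):
    # Assumes the `evaluate` and `sacrebleu` packages are installed.
    import evaluate
    sacrebleu = evaluate.load("sacrebleu")
    return sacrebleu.compute(
        predictions=[strip_timestamps(p) for p in predictions],
        # sacrebleu expects a list of references per prediction
        references=[[strip_timestamps(r)] for r in references],
    )["score"]
```

Note that `content_score` downloads the sacrebleu module on first use via `evaluate.load`.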

### Inputs
*List all input arguments in the format below*
- **predictions** *(list of strings)*: The logs, as predicted/generated by the ML model. **Important: every logfile is a single string, even if it contains multiple lines!**
- **references** *(list of strings)*: The reference logs (ground truth).
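Because each logfile goes in as a single string, a log that was built up line by line should be joined back together before calling `compute` (the log lines here are hypothetical):

```python
# A multi-line logfile must stay ONE string in the predictions list.
log_lines = [
    "2024-01-12 11:23 service started",
    "2024-01-12 11:24 request handled",
]
predictions = ["\n".join(log_lines)]  # one logfile -> one list entry

# NOT: predictions = log_lines  (that would be one "logfile" per line)
```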

### Output Values

*Explain what this metric outputs and provide an example of what the metric output looks like. Modules should return a dictionary with one or multiple key-value pairs, e.g. {"bleu" : 6.02}*