---
title: action_generation
datasets: 
- none
tags:
- evaluate
- metric
description: "TODO: add a description here"
sdk: gradio
sdk_version: 3.19.1
app_file: app.py
pinned: false
---

# Metric Card for action_generation

## Metric Description
Each predicted or reference action is a path-like string such as `/開箱/xxx`: a *class* prefix drawn from `valid_labels` followed by a free-text *phrase*. The metric compares generated actions against reference actions and reports precision, recall, and F1 at the class level, at the phrase level, and as a weighted sum of the two. It is intended for evaluating action-generation models that emit such labeled action strings.
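
As an illustration only (the actual matching logic is internal to the module), the class/phrase decomposition described above can be sketched as:

```python
# Hypothetical sketch, not the module's code: split an action into its
# class (longest matching prefix in valid_labels) and its phrase (the rest).
def split_action(action, valid_labels):
    matches = [label for label in valid_labels if label and action.startswith(label)]
    cls = max(matches, key=len) if matches else ""
    phrase = action[len(cls):].lstrip("/")
    return cls, phrase

print(split_action("/分享/外部資訊/aaa", ["/分享", "/分享/外部資訊"]))
# ('/分享/外部資訊', 'aaa')
```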

## How to Use
Load the metric with `evaluate.load("DarrenChensformer/action_generation")`, then call `compute` with your predictions, references, and the list of valid action classes.

The example below scores two sets of generated actions against their references:

```python
import evaluate

# Valid action classes (Chinese labels, e.g. /開箱 "unboxing", /教學 "tutorial",
# /表達 "expression", /分享 "share", /推薦 "recommend").
valid_labels = [
    "/開箱",
    "/教學",
    "/表達",
    "/分享/外部資訊",
    "/分享/個人資訊",
    "/推薦/產品",
    "/推薦/服務",
    "/推薦/其他",
    ""
]

# One list of generated actions per sample.
predictions = [
    ["/開箱/xxx", "/教學/yyy", "/表達/zzz"],
    ["/分享/外部資訊/aaa", "/教學/yyy", "/表達/zzz", "/分享/個人資訊/bbb"]
]

# One list of reference actions per sample.
references = [
    ["/開箱/xxx", "/教學/yyy", "/表達/zzz"],
    ["/推薦/產品/bbb", "/教學/yyy", "/表達/zzz"]
]

metric = evaluate.load("DarrenChensformer/action_generation")
result = metric.compute(
    predictions=predictions,
    references=references,
    valid_labels=valid_labels,
    detailed_scores=True,
)
print(result)
```

```
{'class': {'precision': 0.7143, 'recall': 0.8333, 'f1': 0.7692}, 'phrase': {'precision': 0.8571, 'recall': 1.0, 'f1': 0.9231}, 'weighted_sum': {'precision': 0.7429, 'recall': 0.8666, 'f1': 0.8}}
```

### Inputs
- **predictions** *(list of list of str)*: One list of generated action strings per sample.
- **references** *(list of list of str)*: One list of reference action strings per sample.
- **valid_labels** *(list of str)*: The action classes accepted as valid prefixes when scoring at the class level.
- **detailed_scores** *(bool)*: When `True`, the result reports separate `class` and `phrase` scores alongside the `weighted_sum`.

### Output Values

The metric returns a dictionary of precision, recall, and F1 scores. With `detailed_scores=True` it contains three entries, `class`, `phrase`, and `weighted_sum`, each mapping to its own `{"precision": ..., "recall": ..., "f1": ...}` dictionary, as shown in the example above.

All scores fall between 0 and 1, inclusive; higher is better.
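
In the example output, the `weighted_sum` values are consistent with a fixed 0.8/0.2 weighting of the class and phrase scores. This is an observation from the numbers above, not documented behavior:

```python
class_scores = {"precision": 0.7143, "recall": 0.8333, "f1": 0.7692}
phrase_scores = {"precision": 0.8571, "recall": 1.0, "f1": 0.9231}

# 0.8 * class + 0.2 * phrase reproduces the weighted_sum values above.
for key in class_scores:
    print(key, round(0.8 * class_scores[key] + 0.2 * phrase_scores[key], 4))
# precision 0.7429
# recall 0.8666
# f1 0.8
```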

#### Values from Popular Papers
*Give examples, preferably with links to leaderboards or publications, to papers that have reported this metric, along with the values they have reported.*

### Examples
The How to Use section above shows the typical case, where predictions only partially match the references; the sketch below adds a boundary case.
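
As a sanity check (assuming the usual precision/recall conventions, since the aggregation is internal to the module), identical predictions and references should score 1.0 at every level:

```python
import evaluate

metric = evaluate.load("DarrenChensformer/action_generation")

valid_labels = ["/開箱", "/教學", "/表達", ""]
actions = [["/開箱/xxx", "/教學/yyy"]]

# Predictions identical to references: every score should come out as 1.0.
result = metric.compute(
    predictions=actions,
    references=actions,
    valid_labels=valid_labels,
    detailed_scores=True,
)
print(result)
```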

## Limitations and Bias
*Note any known limitations or biases that the metric has, with links and references if possible.*

## Citation
*Cite the source where this metric was introduced.*

## Further References
*Add any useful further references.*