---
title: Mean Reciprocal Rank
colorFrom: blue
colorTo: red
sdk: gradio
sdk_version: 3.0.2
app_file: app.py
pinned: false
tags:
- evaluate
- metric
description: >-
  Mean Reciprocal Rank is a statistical measure for evaluating any process that produces a list of possible responses to a sample of queries, ordered by probability of correctness.
---
# Metric Card for Mean Reciprocal Rank
Mean Reciprocal Rank (MRR) is a statistical measure for evaluating any process that produces a list of possible responses to a sample of queries, ordered by probability of correctness.
## Metric Description
The reciprocal rank of a query response is the multiplicative inverse of the rank of the first correct answer: 1 for first place, 1/2 for second place, 1/3 for third place, and so on. The mean reciprocal rank is the average of the reciprocal ranks of the results for a sample of queries Q:

$$
\text{MRR} = \frac{1}{|Q|} \sum_{i=1}^{|Q|} \frac{1}{\text{rank}_i}
$$

Here, $\text{rank}_i$ is the position of the first correct answer for the $i$-th query, counted from 1.
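As a small worked example, assume three hypothetical queries whose first correct answers appear in first, third, and second place, so the ranks (counted from 1) are 1, 3, and 2. The numbers are illustrative only and do not come from any dataset:

$$
\text{MRR} = \frac{1}{3}\left(\frac{1}{1} + \frac{1}{3} + \frac{1}{2}\right) = \frac{11}{18} \approx 0.611
$$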
## How to Use
Provide a list of gold ranks, one per query, where each entry is the rank of the gold (correct) item in the ranked list returned for that query. Ranks are zero-based: a gold item in the top position has rank 0.
### Inputs
- **input_field** *(List[int])*: a list of integers, where each integer is the zero-based rank of the gold item for one query.
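The following is a minimal standalone sketch of how such a list of zero-based gold ranks could be turned into an MRR score. It does not call this module's code: the function name `mean_reciprocal_rank` is hypothetical, and the sketch assumes that a zero-based rank `r` is shifted to the one-based rank `r + 1` before the reciprocal is taken, which may differ from what the module does internally.

```python
from typing import List


def mean_reciprocal_rank(gold_ranks: List[int]) -> float:
    """Hypothetical helper: average of 1 / (rank + 1) over zero-based gold ranks."""
    if not gold_ranks:
        raise ValueError("gold_ranks must contain at least one rank")
    if any(rank < 0 for rank in gold_ranks):
        raise ValueError("ranks must be non-negative (zero-based)")
    # Assumption: shift each zero-based rank to one-based before inverting it.
    return sum(1.0 / (rank + 1) for rank in gold_ranks) / len(gold_ranks)


# Gold items found at positions 0, 2 and 1 (first, third and second place).
score = mean_reciprocal_rank([0, 2, 1])
print(round(score, 4))  # 0.6111
```

Under this assumption, a list containing only zeros yields a score of 1.0, since every gold item sits in first place.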
### Output Values
*Explain what this metric outputs and provide an example of what the metric output looks like. Modules should return a dictionary with one or multiple key-value pairs, e.g. {"bleu" : 6.02}*
*State the range of possible values that the metric's output can take, as well as what in that range is considered good. For example: "This metric can take on any value between 0 and 100, inclusive. Higher scores are better."*
#### Values from Popular Papers
*Give examples, preferably with links to leaderboards or publications, to papers that have reported this metric, along with the values they have reported.*
### Examples
*Give code examples of the metric being used. Try to include examples that clear up any potential ambiguity left from the metric description above. If possible, provide a range of examples that show both typical and atypical results, as well as examples where a variety of input parameters are passed.*
## Limitations and Bias
*Note any known limitations or biases that the metric has, with links and references if possible.*
## Citation
*Cite the source where this metric was introduced.*
## Further References
*Add any useful further references.*