zli12321 committed
Commit cef2f51 · verified · 1 Parent(s): bc0bdd5

Update README.md

Files changed (1):
  1. README.md (+19 -9)
README.md CHANGED
@@ -27,7 +27,6 @@ datasets:
 ## 🎉 Latest Updates
 
 - **Version 0.2.19 Released!**
-  - Paper accepted to EMNLP 2024 Findings! 🎓
   - Enhanced PEDANTS with multi-pipeline support and improved edge case handling
   - Added support for OpenAI GPT-series and Claude Series models (OpenAI version > 1.0)
   - Integrated support for open-source models (LLaMA-2-70B-chat, LLaVA-1.5, etc.) via [deepinfra](https://deepinfra.com/models)
@@ -286,14 +285,25 @@ Our fine-tuned models are available on Huggingface:
 ## 📄 Citation
 
 ```bibtex
-@misc{li2024pedantspreciseevaluationsdiverse,
-      title={PEDANTS: Cheap but Effective and Interpretable Answer Equivalence},
-      author={Zongxia Li and Ishani Mondal and Yijun Liang and Huy Nghiem and Jordan Lee Boyd-Graber},
-      year={2024},
-      eprint={2402.11161},
-      archivePrefix={arXiv},
-      primaryClass={cs.CL},
-      url={https://arxiv.org/abs/2402.11161},
+@inproceedings{li-etal-2024-pedants,
+    title = "{PEDANTS}: Cheap but Effective and Interpretable Answer Equivalence",
+    author = "Li, Zongxia and
+      Mondal, Ishani and
+      Nghiem, Huy and
+      Liang, Yijun and
+      Boyd-Graber, Jordan Lee",
+    editor = "Al-Onaizan, Yaser and
+      Bansal, Mohit and
+      Chen, Yun-Nung",
+    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
+    month = nov,
+    year = "2024",
+    address = "Miami, Florida, USA",
+    publisher = "Association for Computational Linguistics",
+    url = "https://aclanthology.org/2024.findings-emnlp.548/",
+    doi = "10.18653/v1/2024.findings-emnlp.548",
+    pages = "9373--9398",
+    abstract = "Question answering (QA) can only make progress if we know if an answer is correct, but current answer correctness (AC) metrics struggle with verbose, free-form answers from large language models (LLMs). There are two challenges with current short-form QA evaluations: a lack of diverse styles of evaluation data and an over-reliance on expensive and slow LLMs. LLM-based scorers correlate better with humans, but this expensive task has only been tested on limited QA datasets. We rectify these issues by providing rubrics and datasets for evaluating machine QA adopted from the Trivia community. We also propose an efficient, and interpretable QA evaluation that is more stable than an exact match and neural methods (BERTScore)."
 }
 ```
 
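For context, the changelog bullets in the first hunk refer to the PEDANTS scorer shipped in the qa-metrics package. Below is a minimal sketch of scoring one candidate answer against a reference, assuming the `qa_metrics.pedant.PEDANT` interface (`get_score`, `evaluate`) as described in the package README; the method names and example strings are assumptions for illustration, not part of this commit.

```python
# Minimal sketch, not part of this commit. Assumes qa-metrics exposes
# qa_metrics.pedant.PEDANT with get_score/evaluate as in the package
# README; verify these names against your installed version.
from qa_metrics.pedant import PEDANT

pedant = PEDANT()

question = "who wrote Pride and Prejudice"
reference = "Jane Austen"
candidate = "The novel was written by Jane Austen."

# Pair-level match score for (reference, candidate) given the question.
score = pedant.get_score(reference, candidate, question)

# Boolean answer-equivalence judgment against one or more references.
match = pedant.evaluate([reference], candidate, question)

print(score, match)
```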