PromptEval

https://github.com/felipemaiapolo/prompteval

Recent Activity

borgr authored a paper 25 days ago

SemEval 2019 Shared Task: Cross-lingual Semantic Parsing with UCCA - Call for Participation

borgr authored a paper 25 days ago

tinyBenchmarks: evaluating LLMs with fewer examples

borgr authored a paper 25 days ago

Asymmetry in Low-Rank Adapters of Foundation Models

PromptEval's activity

borgr

authored 14 papers 25 days ago

SemEval 2019 Shared Task: Cross-lingual Semantic Parsing with UCCA - Call for Participation

Paper • 1805.12386 • Published May 31, 2018

Elements of World Knowledge (EWOK): A cognition-inspired framework for evaluating basic world knowledge in language models

Paper • 2405.09605 • Published May 15, 2024

Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead

Paper • 2407.00066 • Published Jun 17, 2024

Learning from Naturally Occurring Feedback

Paper • 2407.10944 • Published Jul 15, 2024 • 4

Benchmark Agreement Testing Done Right: A Guide for LLM Benchmark Evaluation

Paper • 2407.13696 • Published Jul 18, 2024 • 5

A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning

Paper • 2408.07057 • Published Aug 13, 2024

Holmes: Benchmark the Linguistic Competence of Language Models

Paper • 2404.18923 • Published Apr 29, 2024

ZipNN: Lossless Compression for AI Models

Paper • 2411.05239 • Published Nov 7, 2024

[Call for Papers] The 2nd BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus

Paper • 2404.06214 • Published Apr 9, 2024

Call for Papers -- The BabyLM Challenge: Sample-efficient pretraining on a developmentally plausible corpus

Paper • 2301.11796 • Published Jan 27, 2023

Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation

Paper • 2412.03304 • Published 28 days ago • 17

mirianfsilva

updated a dataset 28 days ago

PromptEval/MMLU_multi_prompt

Viewer • Updated 28 days ago • 3.17M • 319

borgr

authored a paper 4 months ago

The Future of Open Human Feedback

Paper • 2408.16961 • Published Aug 15, 2024 • 21

moonfolk

authored a paper 4 months ago

The Future of Open Human Feedback

Paper • 2408.16961 • Published Aug 15, 2024 • 21

borgr

authored a paper 4 months ago

The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community

Paper • 2408.08291 • Published Aug 15, 2024 • 10

borgr

authored a paper 5 months ago

Data Contamination Report from the 2024 CONDA Shared Task

Paper • 2407.21530 • Published Jul 31, 2024 • 10

LucasWeber

authored a paper 6 months ago

Efficient multi-prompt evaluation of LLMs

Paper • 2405.17202 • Published May 27, 2024 • 2

Team members: 6
