Fangyu Lei

FangyuLei

AI & ML interests

None yet

FangyuLei's activity

authored 7 papers 1 day ago

S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language Models

Paper • 2310.15147 • Published Oct 23, 2023 • 2

Competition-Level Problems are Effective LLM Evaluators

Paper • 2312.02143 • Published Dec 4, 2023

MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models

Paper • 2402.12851 • Published Feb 20, 2024 • 2

Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent

Paper • 2402.13717 • Published Feb 21, 2024 • 2

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Paper • 2404.07972 • Published Apr 11, 2024 • 48

Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

Paper • 2407.10956 • Published Jul 15, 2024 • 7

Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

Paper • 2411.07763 • Published Nov 12, 2024

updated a dataset 4 months ago

xlangai/spider2-lite

Viewer • Updated Nov 13, 2024 • 260 • 142 • 12

liked a dataset 4 months ago

xlangai/spider2-lite

Viewer • Updated Nov 13, 2024 • 260 • 142 • 12

authored a paper 5 months ago

DA-Code: Agent Data Science Code Generation Benchmark for Large Language Models

Paper • 2410.07331 • Published Oct 9, 2024 • 5

updated a dataset 6 months ago

xlangai/spider2_localdb

Updated Sep 19, 2024

upvoted a paper 11 months ago

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Paper • 2404.07972 • Published Apr 11, 2024 • 48

updated a dataset about 1 year ago

FangyuLei/s3eval

Viewer • Updated Jan 19, 2024 • 1.17k • 20

updated a collection about 1 year ago

S3Eval

Collection

S3Eval: A Synthetic, Scalable and Systematic Evaluation Suite for Large Language Models • 0 items • Updated Jan 19, 2024

updated 2 datasets over 1 year ago

S3Eval/Easy

Updated Oct 12, 2023 • 5

S3Eval/General

Updated Oct 12, 2023 • 39

New activity in Qwen/Qwen-14B over 1 year ago

kernels

#3 opened over 1 year ago by fengcai0824

liked a model over 1 year ago

bigcode/starcoderbase-1b

Text Generation • Updated Sep 14, 2023 • 13k • 71

updated a dataset over 1 year ago

TableQAKit/HybridQA_read

Preview • Updated Aug 6, 2023 • 14