CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models Paper • 2502.16614 • Published 18 days ago • 24
Benchmarking Temporal Reasoning and Alignment Across Chinese Dynasties Paper • 2502.16922 • Published 17 days ago • 7
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking Paper • 2501.09751 • Published Jan 16 • 47
ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning Paper • 2501.06590 • Published Jan 11 • 10
Causal Prompting: Debiasing Large Language Model Prompting based on Front-Door Adjustment Paper • 2403.02738 • Published Mar 5, 2024
SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation Paper • 2412.13649 • Published Dec 18, 2024 • 20
DINER: Debiasing Aspect-based Sentiment Analysis with Multi-variable Causal Inference Paper • 2403.01166 • Published Mar 2, 2024
SEED: Accelerating Reasoning Tree Construction via Scheduled Speculative Decoding Paper • 2406.18200 • Published Jun 26, 2024 • 1
AdaCQR: Enhancing Query Reformulation for Conversational Search via Sparse and Dense Retrieval Alignment Paper • 2407.01965 • Published Jul 2, 2024
STAR: Constraint LoRA with Dynamic Active Learning for Data-Efficient Fine-Tuning of Large Language Models Paper • 2403.01165 • Published Mar 2, 2024
A Comparative Study on Reasoning Patterns of OpenAI's o1 Model Paper • 2410.13639 • Published Oct 17, 2024 • 17
PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment Paper • 2410.13785 • Published Oct 17, 2024 • 19
PositionID: LLMs can Control Lengths, Copy and Paste with Explicit Positional Awareness Paper • 2410.07035 • Published Oct 9, 2024 • 17
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models Paper • 2409.16191 • Published Sep 24, 2024 • 42