dhuynh95's Collections
cool-papers

Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset
Paper • 2403.09029 • Published • 55

LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression
Paper • 2403.12968 • Published • 25

RAFT: Adapting Language Model to Domain Specific RAG
Paper • 2403.10131 • Published • 69

Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
Paper • 2403.09629 • Published • 76

Simple and Scalable Strategies to Continually Pre-train Large Language Models
Paper • 2403.08763 • Published • 50

Language models scale reliably with over-training and on downstream tasks
Paper • 2403.08540 • Published • 15

Algorithmic progress in language models
Paper • 2403.05812 • Published • 18

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Paper • 2403.05530 • Published • 62

TnT-LLM: Text Mining at Scale with Large Language Models
Paper • 2403.12173 • Published • 20

Larimar: Large Language Models with Episodic Memory Control
Paper • 2403.11901 • Published • 33

Reverse Training to Nurse the Reversal Curse
Paper • 2403.13799 • Published • 13

When Do We Not Need Larger Vision Models?
Paper • 2403.13043 • Published • 25

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Paper • 2403.14624 • Published • 52

FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions
Paper • 2403.15246 • Published • 10

Can large language models explore in-context?
Paper • 2403.15371 • Published • 32

The Unreasonable Ineffectiveness of the Deeper Layers
Paper • 2403.17887 • Published • 79

Long-form factuality in large language models
Paper • 2403.18802 • Published • 25

BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
Paper • 2403.18421 • Published • 23

Octopus v2: On-device language model for super agent
Paper • 2404.01744 • Published • 57

Poro 34B and the Blessing of Multilinguality
Paper • 2404.01856 • Published • 13

Long-context LLMs Struggle with Long In-context Learning
Paper • 2404.02060 • Published • 36

Training LLMs over Neurally Compressed Text
Paper • 2404.03626 • Published • 22

CodeEditorBench: Evaluating Code Editing Capability of Large Language Models
Paper • 2404.03543 • Published • 16

Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models
Paper • 2404.02575 • Published • 48

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
Paper • 2404.12253 • Published • 55

Compression Represents Intelligence Linearly
Paper • 2404.09937 • Published • 27

Flamingo: a Visual Language Model for Few-Shot Learning
Paper • 2204.14198 • Published • 14

Executable Code Actions Elicit Better LLM Agents
Paper • 2402.01030 • Published • 45

Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
Paper • 2406.04271 • Published • 29

To Believe or Not to Believe Your LLM
Paper • 2406.02543 • Published • 33

Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning
Paper • 2406.06469 • Published • 25

HuatuoGPT-Vision, Towards Injecting Medical Visual Knowledge into Multimodal LLMs at Scale
Paper • 2406.19280 • Published • 62

Summary of a Haystack: A Challenge to Long-Context LLMs and RAG Systems
Paper • 2407.01370 • Published • 86

Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages
Paper • 2407.03321 • Published • 16

Training Task Experts through Retrieval Based Distillation
Paper • 2407.05463 • Published • 8

InverseCoder: Unleashing the Power of Instruction-Tuned Code LLMs with Inverse-Instruct
Paper • 2407.05700 • Published • 12

AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?
Paper • 2407.15711 • Published • 9

MindSearch: Mimicking Human Minds Elicits Deep AI Searcher
Paper • 2407.20183 • Published • 41

OmniParser for Pure Vision Based GUI Agent
Paper • 2408.00203 • Published • 24

Amuro & Char: Analyzing the Relationship between Pre-Training and Fine-Tuning of Large Language Models
Paper • 2408.06663 • Published • 16

To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper • 2408.10914 • Published • 42

Building and better understanding vision-language models: insights and future directions
Paper • 2408.12637 • Published • 124

MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?
Paper • 2408.13257 • Published • 26

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Paper • 2409.01704 • Published • 83

ContextCite: Attributing Model Generation to Context
Paper • 2409.00729 • Published • 14

Attention Heads of Large Language Models: A Survey
Paper • 2409.03752 • Published • 89

A Controlled Study on Long Context Extension and Generalization in LLMs
Paper • 2409.12181 • Published • 44

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Paper • 2409.12183 • Published • 37

Attention Prompting on Image for Large Vision-Language Models
Paper • 2409.17143 • Published • 7

Can Models Learn Skill Composition from Examples?
Paper • 2409.19808 • Published • 9

Law of the Weakest Link: Cross Capabilities of Large Language Models
Paper • 2409.19951 • Published • 54

LLaVA-Critic: Learning to Evaluate Multimodal Models
Paper • 2410.02712 • Published • 35

From Medprompt to o1: Exploration of Run-Time Strategies for Medical Challenge Problems and Beyond
Paper • 2411.03590 • Published • 10

Autoregressive Models in Vision: A Survey
Paper • 2411.05902 • Published • 17

Stronger Models are NOT Stronger Teachers for Instruction Tuning
Paper • 2411.07133 • Published • 35

ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding
Paper • 2501.05452 • Published • 15

Demystifying Domain-adaptive Post-training for Financial LLMs
Paper • 2501.04961 • Published • 11

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
Paper • 2412.19723 • Published • 81

Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives
Paper • 2501.04003 • Published • 24

MLLM-as-a-Judge for Image Safety without Human Labeling
Paper • 2501.00192 • Published • 25

Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong
Paper • 2501.09775 • Published • 26

VideoWorld: Exploring Knowledge Learning from Unlabeled Videos
Paper • 2501.09781 • Published • 21