davidsvaughn's Collections
Large Language Model Alignment: A Survey
Paper
•
2309.15025
•
Published
•
2
Aligning Large Language Models with Human: A Survey
Paper
•
2307.12966
•
Published
•
1
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Paper
•
2305.18290
•
Published
•
49
SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF
Paper
•
2310.05344
•
Published
•
1
LIMA: Less Is More for Alignment
Paper
•
2305.11206
•
Published
•
21
Aligning Large Language Models through Synthetic Feedback
Paper
•
2305.13735
•
Published
•
1
Generative Judge for Evaluating Alignment
Paper
•
2310.05470
•
Published
•
1
JudgeLM: Fine-tuned Large Language Models are Scalable Judges
Paper
•
2310.17631
•
Published
•
33
Quality-Diversity through AI Feedback
Paper
•
2310.13032
•
Published
•
1
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Paper
•
2201.11903
•
Published
•
9
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Paper
•
2203.11171
•
Published
•
3
Fine-tuning Language Models with Generative Adversarial Feedback
Paper
•
2305.06176
•
Published
•
1
UltraFeedback: Boosting Language Models with High-quality Feedback
Paper
•
2310.01377
•
Published
•
5
Verbosity Bias in Preference Labeling by Large Language Models
Paper
•
2310.10076
•
Published
•
2
RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Paper
•
2309.00267
•
Published
•
47
Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment
Paper
•
2308.05374
•
Published
•
27
Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment
Paper
•
2308.09662
•
Published
•
3
HelpSteer: Multi-attribute Helpfulness Dataset for SteerLM
Paper
•
2311.09528
•
Published
•
2
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Paper
•
2402.03300
•
Published
•
72
ReFT: Reasoning with Reinforced Fine-Tuning
Paper
•
2401.08967
•
Published
•
29
Reasons to Reject? Aligning Language Models with Judgments
Paper
•
2312.14591
•
Published
•
17
Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification
Paper
•
2308.07921
•
Published
•
22
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving
Paper
•
2309.17452
•
Published
•
3
LLM Guided Inductive Inference for Solving Compositional Problems
Paper
•
2309.11688
•
Published
•
1
A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications
Paper
•
2402.07927
•
Published
•
1
A Review of Sparse Expert Models in Deep Learning
Paper
•
2209.01667
•
Published
•
3
The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey
Paper
•
2401.07872
•
Published
•
2
Knowledge Solver: Teaching LLMs to Search for Domain Knowledge from Knowledge Graphs
Paper
•
2309.03118
•
Published
•
2
Fabricator: An Open Source Toolkit for Generating Labeled Training Data with Teacher LLMs
Paper
•
2309.09582
•
Published
•
4
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Paper
•
2310.13671
•
Published
•
18
Training Generative Question-Answering on Synthetic Data Obtained from an Instruct-tuned Model
Paper
•
2310.08072
•
Published
•
1
Generative Data Augmentation using LLMs improves Distributional Robustness in Question Answering
Paper
•
2309.06358
•
Published
•
1
Self-Alignment with Instruction Backtranslation
Paper
•
2308.06259
•
Published
•
41
A Comprehensive Analysis of Adapter Efficiency
Paper
•
2305.07491
•
Published
•
1
Empirical Analysis of the Strengths and Weaknesses of PEFT Techniques for LLMs
Paper
•
2304.14999
•
Published
•
2
Comparison between parameter-efficient techniques and full fine-tuning: A case study on multilingual news article classification
Paper
•
2308.07282
•
Published
•
1
LoRA: Low-Rank Adaptation of Large Language Models
Paper
•
2106.09685
•
Published
•
30
QLoRA: Efficient Finetuning of Quantized LLMs
Paper
•
2305.14314
•
Published
•
46
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
Paper
•
2402.13064
•
Published
•
47
MTEB: Massive Text Embedding Benchmark
Paper
•
2210.07316
•
Published
•
6
Beyond Scale: the Diversity Coefficient as a Data Quality Metric Demonstrates LLMs are Pre-trained on Formally Diverse Data
Paper
•
2306.13840
•
Published
•
11
ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning
Paper
•
2212.07919
•
Published
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper
•
2307.09288
•
Published
•
243
Training Verifiers to Solve Math Word Problems
Paper
•
2110.14168
•
Published
•
4
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
Paper
•
1803.05457
•
Published
•
2
WinoGrande: An Adversarial Winograd Schema Challenge at Scale
Paper
•
1907.10641
•
Published
•
1
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Paper
•
2206.04615
•
Published
•
5
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Paper
•
2303.12712
•
Published
•
2
Deduplicating Training Data Makes Language Models Better
Paper
•
2107.06499
•
Published
•
4
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper
•
2402.17764
•
Published
•
603
Oasis: Data Curation and Assessment System for Pretraining of Large Language Models
Paper
•
2311.12537
•
Published
•
1
LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models
Paper
•
2402.10524
•
Published
•
22
Reflexion: Language Agents with Verbal Reinforcement Learning
Paper
•
2303.11366
•
Published
•
4
GPT-4 Technical Report
Paper
•
2303.08774
•
Published
•
5
Beyond Language Models: Byte Models are Digital World Simulators
Paper
•
2402.19155
•
Published
•
49
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models
Paper
•
2404.18796
•
Published
•
68