Commit History
Delete Dockerfile
fa92474
verified
Rename Dockerfile-cloud to Dockerfile
8f9180d
verified
Upload Dockerfile-cloud
58a9a38
verified
Update README.md
9ea8e9a
verified
Update README.md
d304116
verified
drop length column for issues with eval without packing (#1711)
3f1f5e3
unverified
download model weights on preprocess step (#1693)
5783839
unverified
verbose failure message (#1694)
cbbf039
unverified
bump deepspeed for fix for grad norm compute putting tensors on different devices (#1699)
851ccb1
unverified
fix for when sample_packing and eval_sample_packing are different (#1695)
18cabc0
unverified
add back packing efficiency estimate so epochs and multi-gpu works properly (#1697)
ed8ef65
unverified
add qwen2-72b fsdp example (#1696)
00ac302
unverified
ensure explicit eval_sample_packing to avoid mismatch issues (#1692)
9c1af1a
unverified
Create phi3-ft-fsdp.yml (#1580)
a82a711
unverified
Phi-3 conversation format, example training script and perplexity metric (#1582)
cf64284
unverified
add support for rpo_alpha (#1681)
c996881
unverified
re-enable DPO for tests in modal ci (#1374)
1f151c0
unverified
Fix the broken link in README (#1678) [skip ci]
5cde065
unverified
need to add back drop_last for sampler (#1676)
05b0bd0
unverified
cleanup the deepspeed proxy model at the end of training (#1675)
d4f6c65
unverified
load explicit splits on datasets (#1652)
a944f7b
unverified
set chat_template in datasets config automatically (#1664)
9d4225a
unverified
use mixins for orpo and kto configs so they work with axolotl customizations (#1674)
f7332ac
unverified
re-enable phi for tests in modal ci (#1373)
16d46b7
unverified
revert multipack batch sampler changes (#1672)
a6b37bd
unverified
handle the system role too for chat templates (#1671)
b752080
unverified
make sure the CI fails when pytest script fails (#1669)
fe650dd
unverified
Fix README quick start example usage model dirs (#1668)
49b967b
unverified
Committed by Abe Voelker
Correct name of MixtralBlockSparseTop2MLP (L -> l) (#1667)
65db903
unverified
Fix: ensure correct handling of `val_set_size` as `float` or `int` (#1655)
6a5a725
unverified
fix lint issue that snuck through (#1665)
f5febc7
unverified
Fix Lora config error for Llama3 (#1659)
230e0ac
unverified
Generalizing the chat_template prompt strategy (#1660) [skip ci]
cc11c6b
unverified
Fix Google Colab notebook 2024-05 (#1662) [skip ci]
5f91064
unverified
Committed by Maciek
update deps (#1663) [skip ci]
ef22351
unverified
document how to use `share_strategy="no"` (#1653) [skip ci]
8a20a7b
unverified
support for custom messages field in sharegpt (#1651)
bbfed31
unverified
Update tiny-llama qlora.yml addressing eval packing error (#1638)
84bb806
unverified
Committed by Jaydeep Thik