Elie Bakouch

eliebak

AI & ML interests

Training LLMs @ 🤗

Organizations

Hugging Face, HuggingFaceBR4, Hugging Face H4, Blog-explorers, Hugging Face TB Research, huggingPartyParis, Nanotron Research, Hugging Face SMOL, MLX Community, HuggingFaceFW, LLHF, llmc, SLLHF, Argilla Warehouse, nltpt, smol-explorers, Open Science, Hugging Face Science, open/ acc

Posts 1

Wow, impressive 340B model by nvidia with a nice permissive license! 🚀 The technical report is full of insights and seems to use a different learning rate schedule than cosine, probably a variant of WSD. Hope to get more info on that! 👀

nvidia/nemotron-4-340b-666b7ebaf1b3867caf2f1911