RedPajama: an Open Dataset for Training Large Language Models (Paper • 2411.12372 • Published Nov 19, 2024)