---
license: apache-2.0
---

# NEO

🤗 Neo-Models | 🤗 Neo-Datasets | GitHub

Neo is a fully open-source large language model series: the code, all model weights, the datasets used for training, and the training details are all released.

## Model

| Model | Description | Download |
| :--- | :--- | :--- |
| neo_7b | The base model of neo_7b. | 🤗 Hugging Face |
| neo_7b_sft_v0.1 | The supervised fine-tuned version of the neo_7b model. | 🤗 Hugging Face |
| neo_7b_instruct_v0.1 | The instruction-tuned version of the neo_7b model. | 🤗 Hugging Face |
| neo_7b_intermediate | Intermediate checkpoints from standard pre-training; a total of 3.7T tokens were learned in this phase. | 🤗 Hugging Face |
| neo_7b_decay | Intermediate checkpoints from the decay phase; a total of 720B tokens were learned in this phase. | 🤗 Hugging Face |
| neo_scalinglaw_980M | Checkpoints from the scaling-law experiments. | 🤗 Hugging Face |
| neo_scalinglaw_460M | Checkpoints from the scaling-law experiments. | 🤗 Hugging Face |
| neo_scalinglaw_250M | Checkpoints from the scaling-law experiments. | 🤗 Hugging Face |
| neo_2b_general | Checkpoints of the 2B model trained on common domain knowledge. | 🤗 Hugging Face |
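Repositories that hold multiple checkpoints (the intermediate, decay, and scaling-law repos) can be loaded at a specific revision with the standard `revision` argument of `from_pretrained`. A minimal sketch follows; the repository id `m-a-p/neo_scalinglaw_250M` and the revision name are assumptions for illustration, so check the repository's branch and tag list for the checkpoint names that actually exist.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id and revision -- consult the Hugging Face repo's
# branch/tag list for the checkpoints that are actually published.
repo_id = "m-a-p/neo_scalinglaw_250M"
revision = "main"  # e.g. a branch holding one training-step checkpoint

tokenizer = AutoTokenizer.from_pretrained(
    repo_id, revision=revision, use_fast=False, trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, revision=revision, torch_dtype="auto"
)
```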

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = '<your-hf-model-path-with-tokenizer>'

# Load the slow tokenizer; trust_remote_code allows the repo's own
# tokenizer implementation to be used.
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False, trust_remote_code=True)

# Pick dtype and device placement automatically.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype="auto",
).eval()

input_text = "A long, long time ago,"

# This is a base (completion) model, so the prompt is tokenized directly;
# add_generation_prompt applies only to chat templates, not to the tokenizer call.
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=20)
response = tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(response)
```
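Greedy decoding of 20 new tokens is only a smoke test. For longer completions, the usual sampling arguments of `generate` can be passed in; the values below are illustrative defaults, not settings recommended by the authors.

```python
# Sampled generation; parameter values are illustrative, not tuned settings.
output_ids = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```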

## Citation

```bibtex
@article{zhang2024mapneo,
    title   = {MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series},
    author  = {Ge Zhang and Scott Qu and Jiaheng Liu and Chenchen Zhang and Chenghua Lin and Chou Leuang Yu and Danny Pan and Esther Cheng and Jie Liu and Qunshu Lin and Raven Yuan and Tuney Zheng and Wei Pang and Xinrun Du and Yiming Liang and Yinghao Ma and Yizhi Li and Ziyang Ma and Bill Lin and Emmanouil Benetos and Huan Yang and Junting Zhou and Kaijing Ma and Minghao Liu and Morry Niu and Noah Wang and Quehry Que and Ruibo Liu and Sine Liu and Shawn Guo and Soren Gao and Wangchunshu Zhou and Xinyue Zhang and Yizhi Zhou and Yubo Wang and Yuelin Bai and Yuhan Zhang and Yuxiang Zhang and Zenith Wang and Zhenzhu Yang and Zijian Zhao and Jiajun Zhang and Wanli Ouyang and Wenhao Huang and Wenhu Chen},
    year    = {2024},
    journal = {arXiv preprint arXiv:2405.19327}
}
```