arXiv:2405.13386

360Zhinao Technical Report

Published on May 22, 2024

Abstract

We present the 360Zhinao models, with 7B parameters and context lengths spanning 4K, 32K, and 360K, all available at https://github.com/Qihoo360/360zhinao. For rapid development in pretraining, we establish a stable and sensitive ablation environment to evaluate and compare experiment runs at minimal model size. Guided by these ablations, we refine our data-cleaning and composition strategies to pretrain 360Zhinao-7B-Base on 3.4T tokens. During alignment we likewise emphasize data, striving to balance quantity and quality through filtering and reformatting. With tailored data, 360Zhinao-7B's context window is readily extended to 32K and 360K. Reward models (RMs) and RLHF are trained after SFT and credibly applied to specific tasks. Together, these contributions yield 360Zhinao-7B's competitive performance among models of similar size.
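Since the checkpoints are released alongside the report, below is a minimal sketch of loading 360Zhinao-7B-Base for inference with Hugging Face transformers. The repository id `qihoo360/360Zhinao-7B-Base` is an assumption inferred from the GitHub organization name, not confirmed by the abstract; verify it against the linked repository before use.

```python
# Minimal inference sketch for 360Zhinao-7B-Base with Hugging Face transformers.
# NOTE: the repo id below is an assumption; check https://github.com/Qihoo360/360zhinao
# for the official checkpoint names.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "qihoo360/360Zhinao-7B-Base"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",        # place weights on available GPU(s)
    trust_remote_code=True,   # checkpoint may ship custom modeling code
)

# Simple greedy continuation from a short prompt.
inputs = tokenizer("360Zhinao is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The 32K and 360K context-length variants mentioned in the abstract would be loaded the same way under their own repository ids.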
