metadata

license: gpl-3.0
language:
  - zh
  - en
pipeline_tag: text-generation
tags:
  - translation
  - multilingual
  - large language model
  - instruction tuning

BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models

BayLing (百聆, bǎi líng) is an instruction-following LLM equipped with advanced language alignment, showing superior capability in English and Chinese generation, instruction following, and multi-turn interaction. BayLing can be deployed on a consumer-grade GPU with 16GB of memory, and assists users with tasks such as translation, writing, creative work, and suggestions.
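As a rough sanity check on the 16GB figure, the sketch below estimates the memory taken by the weights alone for a 13B-parameter model at different precisions. The helper name `weight_memory_gib` is hypothetical, and the use of 8-bit quantization is an assumption that this card does not state. Activations and the KV cache need additional memory on top of these numbers.

```python
def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory used by model weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30


N_PARAMS_13B = 13e9  # nominal parameter count of a 13B model

# fp16 stores 2 bytes per parameter; int8 stores 1 byte (assumed quantization).
fp16_gib = weight_memory_gib(N_PARAMS_13B, 2)
int8_gib = weight_memory_gib(N_PARAMS_13B, 1)

print(f"fp16 weights: {fp16_gib:.1f} GiB")
print(f"int8 weights: {int8_gib:.1f} GiB")
```

Under these assumptions, only the 8-bit weights fit within 16GB, which suggests that some form of quantization is involved in the stated deployment setting.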

This model is the weight-diff version of BayLing-13B-v1.0.

BayLing-13B-v1.1 has been released. Compared with this model, BayLing-13B-v1.1 is additionally injected with extensive Chinese knowledge.

👇 Learn more about BayLing:

💬 Demo: You are welcome to apply for a trial of BayLing's online demo (beta version).

📄 Paper: A comprehensive research paper on BayLing.

🏠 Homepage: BayLing's homepage. You can discover more information and cases of BayLing here.

✍️ BayLing-80 Test Set: A human-annotated evaluation set comprising multi-turn instructions in both English and Chinese. It can be used to evaluate the multilingual and multi-turn interaction capabilities of LLMs.

🤗 Model: The weight-diff versions of BayLing-7B and BayLing-13B. You can quickly obtain the parameters of BayLing with apply_delta.py. The HF models of BayLing are anonymized versions (BayLing's name is excluded from the model's knowledge) to make it easier for future LLMs to build upon BayLing.
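To illustrate what a weight-diff release means, here is a minimal sketch of the general idea: the released delta is added to the base model's weights, parameter by parameter, to recover the full weights. This is not the project's apply_delta.py. The function name `add_delta` and the toy dictionaries are hypothetical, plain Python lists stand in for tensors, and the real script may handle details such as vocabulary size differently.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def add_delta(base: Weights, delta: Weights) -> Weights:
    """Recover target weights as base + delta for every named parameter."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same parameter names")
    target: Weights = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise sum restores the fine-tuned parameter values.
        target[name] = [b + d for b, d in zip(base_values, delta_values)]
    return target


# Toy example with two tiny "layers" (hypothetical names and values).
base_weights = {"layer0.weight": [1.0, 2.0], "layer1.weight": [0.5]}
delta_weights = {"layer0.weight": [0.5, -1.0], "layer1.weight": [0.25]}

restored = add_delta(base_weights, delta_weights)
print(restored)
```

Distributing only the delta means that users need access to the base model's weights before they can reconstruct the full model.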

BayLing is developed by the NLP Group of the Institute of Computing Technology, Chinese Academy of Sciences (ICT/CAS).

BayLing is continuously being improved 🆙 If you have any suggestions, please contact [email protected]. Thanks for your support!

Refer to our GitHub repo for a detailed introduction to BayLing, including how to deploy and interact with BayLing, as well as BayLing's performance.

Limitations

Despite demonstrating commendable performance in certain aspects, BayLing still exhibits several limitations. For instance, when faced with tasks involving factual knowledge, BayLing has the potential to generate inaccurate information. Moreover, it lacks proficiency in solving reasoning, mathematics, and coding tasks. Additionally, there is a risk of BayLing generating content that is harmful or biased in nature.

BayLing is a large language model that, like any other language model, cannot guarantee the absolute accuracy of the generated content. Note that this project does not assume any risks or responsibilities associated with data security, public opinion risks arising from open-source models and code, or any risks and liabilities resulting from the models being misled, misused, disseminated, or otherwise improperly used.

License

Model weights (delta version) and the inference code are released under the GNU General Public License v3.0 (GPLv3). The online demo serves as a research preview and is intended exclusively for non-commercial use, subject to the Model License of LLaMA, the Terms of Use of the data generated by OpenAI, the Privacy Practices of ShareGPT, and the Data License of WMT22.

Acknowledgements

We would like to express our gratitude to all those who have contributed to BayLing. We extend special thanks to Ms. Xiaohong Wang for her valuable comments and suggestions on the use of InforSuperBahn MLOps, and for her organizational and resource support in providing computing resources and showcasing BayLing. We also acknowledge Xiaodong Liu for his pivotal role in the construction of the distributed system and overall coordination of the demo deployment. Furthermore, we appreciate the contribution of the development team from the Nanjing Institute of InforSuperBahn in maintaining the computing resources and creating the display interface for BayLing’s webpage and demo.

Authors

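| Shaolei Zhang | Qingkai Fang | Zhuocheng Zhang | Zhengrui Ma |
| Yan Zhou | Langlin Huang | Mengyu Bu | Shangtong Gui |
| Yunji Chen | Xilin Chen | Yang Feng * |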

Citation

If our work is helpful to you, please cite it as:

@article{bayling,
      title={BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models}, 
      author={Shaolei Zhang and Qingkai Fang and Zhuocheng Zhang and Zhengrui Ma and Yan Zhou and Langlin Huang and Mengyu Bu and Shangtong Gui and Yunji Chen and Xilin Chen and Yang Feng},
      journal={arXiv preprint arXiv:2306.10968},
      year={2023},
      url={https://arxiv.org/abs/2306.10968}
}