---
language:
- en
- ru
license: apache-2.0
tags:
- gpt
- NLG
---

# HuYaLM 100B

**Hu**gging Face **YaLM 100B** (by [BlackSamorez](https://github.com/BlackSamorez)) is a _transformers_-compatible implementation of the **YaLM 100B** model, originally trained by Yandex for 65 days on a cluster of 800 A100 graphics cards using 1.7 TB of online texts, books, and countless other sources in both English and Russian.
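
Since the wrapper plugs into the standard _transformers_ API, the model loads and generates like any other causal LM. A minimal sketch, assuming a hypothetical Hub id `BlackSamorez/yalm-100b-hf` (substitute this repo's actual id):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "BlackSamorez/yalm-100b-hf"  # hypothetical id; replace with this repo's actual id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    device_map="auto",        # shard the 100B parameters across all available GPUs
    torch_dtype="auto",       # keep the checkpoint's native precision
    # trust_remote_code=True, # may be needed if the modeling code ships with the repo
)

inputs = tokenizer("Once upon a time", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Keep in mind that 100B parameters occupy roughly 200 GB in fp16 (2 bytes per parameter), so `device_map="auto"` with multiple GPUs or CPU offloading is effectively required.
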
This particular implementation was motivated by the fact that the model was originally published with outdated code, incompatible with the latest advances in the field. This code, being compatible with _transformers_, should automatically support much-needed features such as [quantization](https://huggingface.co/docs/transformers/main_classes/quantization) and [adapter training](https://huggingface.co/docs/peft/index), as sketched below.
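
As an illustration of both features, here is a sketch combining 8-bit loading (via _bitsandbytes_) with LoRA adapter training (via _peft_); the Hub id and the `target_modules` name are assumptions to verify against the actual checkpoint:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

MODEL_ID = "BlackSamorez/yalm-100b-hf"  # hypothetical id, as above

# Quantize the weights to 8-bit on load, roughly halving memory versus fp16.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

# Attach small low-rank adapters instead of fine-tuning all 100B base parameters.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["query_key_value"],  # assumed projection name; inspect the model to confirm
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the adapter weights receive gradients
```
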
Training details and best practices for acceleration and stabilization are described in articles on **[Medium](https://medium.com/p/d1df53d0e9a6)** (English) and **[Habr](https://habr.com/ru/company/yandex/blog/672396/)** (Russian). The original code published by Yandex can be found on [GitHub](https://github.com/yandex/YaLM-100B).

This code, as well as the model itself, is published under the [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) license, which permits commercial use.