	<!--Copyright 2020 The HuggingFace Team. All rights reserved.

	Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
	the License. You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
	an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
	specific language governing permissions and limitations under the License.
	-->

	# CPM

	## Overview

	The CPM model was proposed in [CPM: A Large-scale Generative Chinese Pre-trained Language Model](https://arxiv.org/abs/2012.00413) by Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin,
	Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen,
	Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun.

	The abstract from the paper is the following:

	*Pre-trained Language Models (PLMs) have proven to be beneficial for various downstream NLP tasks. Recently, GPT-3,
	with 175 billion parameters and 570GB training data, drew a lot of attention due to the capacity of few-shot (even
	zero-shot) learning. However, applying GPT-3 to address Chinese NLP tasks is still challenging, as the training corpus
	of GPT-3 is primarily English, and the parameters are not publicly available. In this technical report, we release the
	Chinese Pre-trained Language Model (CPM) with generative pre-training on large-scale Chinese training data. To the best
	of our knowledge, CPM, with 2.6 billion parameters and 100GB Chinese training data, is the largest Chinese pre-trained
	language model, which could facilitate several downstream Chinese NLP tasks, such as conversation, essay generation,
	cloze test, and language understanding. Extensive experiments demonstrate that CPM achieves strong performance on many
	NLP tasks in the settings of few-shot (even zero-shot) learning.*

	This model was contributed by [canwenxu](https://huggingface.co/canwenxu). The original implementation is available at
	[TsinghuaAI/CPM-Generate](https://github.com/TsinghuaAI/CPM-Generate).

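	Note: Only the tokenizer is documented here because CPM uses the same model architecture as GPT-2. Use the GPT-2 model
	classes, such as `GPT2LMHeadModel`, together with the CPM tokenizer.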
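Since the tokenizer is the only CPM-specific component, it may help to see roughly what it adds around SentencePiece. The following is a minimal sketch, not the library code: it assumes the tokenizer segments text with Jieba, replaces spaces and newlines inside each segment with the placeholder characters U+2582 and U+2583, joins the segments with plain spaces before SentencePiece encoding, and reverses the mapping when decoding. The names `pre_tokenize` and `post_decode` and the hand-written `words` list (standing in for Jieba output) are hypothetical and used only for illustration.

```python
# Sketch of the whitespace handling around SentencePiece in a CPM-style tokenizer.
# Assumption: spaces and newlines are mapped to U+2582 and U+2583 before encoding.
WHITESPACE_TO_PLACEHOLDER = str.maketrans(" \n", "\u2582\u2583")


def pre_tokenize(words):
    """Protect whitespace inside each segment, then join segments with plain spaces."""
    return " ".join(word.translate(WHITESPACE_TO_PLACEHOLDER) for word in words)


def post_decode(text):
    """Drop the joining spaces, then restore the protected whitespace."""
    return text.replace(" ", "").replace("\u2582", " ").replace("\u2583", "\n")


# Hand-written segmentation standing in for Jieba output (hypothetical example).
words = ["你好", " ", "世界", "\n", "再见"]
prepared = pre_tokenize(words)   # string that would be passed to SentencePiece
restored = post_decode(prepared)  # what decoding would give back

print(repr(prepared))
print(repr(restored))
```

In this sketch the round trip recovers the original text, including the space and the newline, which a plain SentencePiece pass over space-joined segments would otherwise lose. The real tokenizer additionally requires the `jieba` and `sentencepiece` packages and a trained vocabulary file.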

	## CpmTokenizer

	[[autodoc]] CpmTokenizer

	## CpmTokenizerFast

	[[autodoc]] CpmTokenizerFast