---
license: bigscience-openrail-m
datasets:
- iamplus/Instruction_Tuning
---
Instruction-tuned GPT-NeoXT-20B model, trained on the instruction-tuning dataset listed below (~560k examples) using ***Colossal AI***.
**Base Model:** togethercomputer/GPT-NeoXT-Chat-Base-20B (GPT-NeoXT-Chat-Base-20B-v0.16 - fine-tuned on feedback data)
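For reference, a minimal loading-and-generation sketch using the standard Hugging Face `transformers` API. The repo id below is a placeholder for this checkpoint, and the `<human>:`/`<bot>:` prompt format is assumed from the base OpenChatKit model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder -- substitute the actual repo id of this checkpoint.
model_id = "iamplus/your-model-id"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 20B params; half precision to fit on an 80GB GPU
    device_map="auto",          # requires `accelerate`
)

# Prompt format assumed from the base GPT-NeoXT-Chat-Base-20B model.
prompt = "<human>: Give me three tips for staying productive.\n<bot>:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```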
**Training Details:** (see the configuration sketch after this list)
* Epochs: 2
* Batch Size: 16 per device x 1 gradient accumulation step x 8 GPUs = 128 effective
* Max Length: 1024
* Weight Decay: 0
* Learning Rate: 2e-5
* Learning Rate Scheduler Type: Cosine
* Number of Warmup Steps: 240
* Machine: 8xA100 80GB
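The actual run used ***Colossal AI***, so the following is only an equivalent restatement of the listed hyperparameters as a Hugging Face `TrainingArguments` object for readers reproducing the setup with the `Trainer`; the output path and fp16 flag are assumptions, and the max length of 1024 is applied at tokenization time rather than here:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt-neoxt-20b-instruction-tuned",  # hypothetical output path
    num_train_epochs=2,
    per_device_train_batch_size=16,  # x 1 grad-accum step x 8 GPUs = 128 effective
    gradient_accumulation_steps=1,
    weight_decay=0.0,
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_steps=240,
    fp16=True,  # assumption: mixed precision on the A100s
)
```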
**Dataset Details:**

Dataset: iamplus/Instruction_Tuning

Files:
* stanford_alpaca_it_v2.csv
* ColossalChat.csv
* unified_chip2.csv
* iamai_summarization_v1.csv
* iamai_v1.csv
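A minimal sketch for loading these files with the `datasets` library, assuming the CSVs sit at the top level of the iamplus/Instruction_Tuning repo and share a compatible schema:

```python
from datasets import load_dataset

# File names taken from the list above; resolved relative to the dataset repo.
data_files = [
    "stanford_alpaca_it_v2.csv",
    "ColossalChat.csv",
    "unified_chip2.csv",
    "iamai_summarization_v1.csv",
    "iamai_v1.csv",
]
dataset = load_dataset("iamplus/Instruction_Tuning", data_files=data_files, split="train")
print(dataset)  # ~560k rows across the five files
```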