---
license: gpl-3.0
language:
- en
- zh
- ja
- de
datasets:
- JosephusCheung/GuanacoDataset
- meta-math/MetaMathQA
- jondurbin/airoboros-3.1
- WizardLM/WizardLM_evol_instruct_V2_196k
- RyokoAI/ShareGPT52K
- RyokoAI/Fandom23K
- milashkaarshif/MoeGirlPedia_wikitext_raw_archive
- wikipedia
- wiki_lingua
- garage-bAInd/Open-Platypus
- LDJnr/Puffin
- BAAI/COIG
- TigerResearch/tigerbot-zhihu-zh-10k
- liwu/MNBVC
- teknium/openhermes
- CausalLM/Refined-Anime-Text
- microsoft/orca-math-word-problems-200k
- m-a-p/CodeFeedback-Filtered-Instruction
---
|
The tokenizer is different from Cohere's, and the chat template is ChatML. Fully fine-tuned at 128K+ context on a synthetic dataset of roughly 30M entries (web-crawl inputs with GPT-4-32k / GPT-3.5-16k outputs), for 1 epoch.
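
Since the chat template is ChatML, prompts can be built with the standard `transformers` chat-template API. The following is only a minimal usage sketch: the repo id `CausalLM/35b-beta` is used as a stand-in and should be replaced with this model's actual id.

```python
# Minimal ChatML inference sketch with transformers (assumes accelerate is installed
# for device_map="auto"). The repo id below is a placeholder for this model's id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CausalLM/35b-beta"  # placeholder; substitute this repository's id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain what a chat template does."},
]

# apply_chat_template renders the ChatML <|im_start|>...<|im_end|> format
# defined in the tokenizer config and appends the generation prompt.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```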
|
|
|
For another candidate version trained for 1 epoch, see https://huggingface.co/CausalLM/35b-beta, which somehow seems to overfit less.
|
|
|
No LoRAs, no quants, no tricks.
|
|
|
This one is not tuned for "very 128k" contexts; use https://huggingface.co/CausalLM/35b-beta-long for long-context work. It is, however, better at general tasks, knowledge, coding, and so on.
|
|
|
And feel free to merge them if you want!
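
If you do want to combine this checkpoint's general-task strengths with the long-context tuning of 35b-beta-long, one option is a plain linear weight merge. The sketch below is only an illustration, not a recommended recipe: it assumes both checkpoints share the same architecture and tokenizer, uses an arbitrary 0.5 blend factor, and needs enough CPU RAM to hold both models; dedicated tools such as mergekit offer more sophisticated strategies.

```python
# Rough linear-interpolation merge of the two sibling checkpoints linked above.
# Assumes identical architectures/parameter keys; alpha = 0.5 is an arbitrary example.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "CausalLM/35b-beta"        # placeholder for this model's id
long_id = "CausalLM/35b-beta-long"   # long-context sibling
alpha = 0.5                          # fraction taken from the long-context weights

base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
long_ctx = AutoModelForCausalLM.from_pretrained(long_id, torch_dtype=torch.bfloat16)

with torch.no_grad():
    merged = base.state_dict()
    long_state = long_ctx.state_dict()
    for name, tensor in merged.items():
        # Interpolate every tensor; keys and shapes must match across checkpoints.
        merged[name] = (1.0 - alpha) * tensor + alpha * long_state[name]

base.load_state_dict(merged)
base.save_pretrained("35b-merged")
AutoTokenizer.from_pretrained(base_id).save_pretrained("35b-merged")
```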