|
---
license: mit
language:
- zh
- en
---
|
|
|
We fine-tuned ChemGPT2-QA-72B from the Qwen2-72B-Instruct model. Our training data, ChemGPT-2.0-Data, is open-sourced and available at https://huggingface.co/datasets/ALmonster/ChemGPT-2.0-Data.
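If you want to inspect the training data, the snippet below is a minimal sketch of loading it with the Hugging Face `datasets` library; the split name and column layout are assumptions, so check the dataset card for the actual schema.

```python
from datasets import load_dataset

# Load ChemGPT-2.0-Data from the Hugging Face Hub.
# The "train" split and the printed fields are assumptions;
# see the dataset card for the actual splits and columns.
dataset = load_dataset("ALmonster/ChemGPT-2.0-Data", split="train")
print(dataset)     # overview of size and columns
print(dataset[0])  # first training example
```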
|
We evaluated our model on the three chemistry tasks of C-Eval and compared it with GPT-3.5 and GPT-4. The results are as follows: |
|
|
|
|
|
## C-Eval |
|
|
|
| Models | college_chemistry | high_school_chemistry | middle_school_chemistry | AVG |
|--------|-------------------|-----------------------|-------------------------|-----|
| GPT-3.5 | 0.397 | 0.529 | 0.714 | 0.547 |
| GPT-4 | 0.594 | 0.558 | 0.811 | 0.654 |
| ChemGPT2-QA-72B | 0.710 | 0.936 | 0.995 | 0.880 |
|
|
|
|
|
## Quickstart |
|
|
|
The following code snippet shows how to load the tokenizer and model and how to generate content with `apply_chat_template`.
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

# Load the model and tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained(
    "ALmonster/ChemGPT2-QA-72B",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("ALmonster/ChemGPT2-QA-72B")

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

# Build the chat prompt and tokenize it
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

# Generate, then strip the prompt tokens from the output
generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
|
|
|
## vLLM
|
|
|
We recommend deploying the model on 4 A100 GPUs. You can start the vLLM OpenAI-compatible server with the following command in a terminal:
|
|
|
```bash
python -m vllm.entrypoints.openai.api_server \
    --served-model-name chemgpt \
    --model path/to/chemgpt \
    --gpu-memory-utilization 0.98 \
    --tensor-parallel-size 4 \
    --port 6000
```
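Once the server is up, you can optionally verify that it is serving the model by querying the OpenAI-compatible `/v1/models` endpoint. This is a minimal sketch, assuming the server from the command above is running locally on port 6000.

```python
import requests

# Quick sanity check: list the models served by the vLLM server.
# Assumes the server started above is running on localhost:6000.
resp = requests.get("http://localhost:6000/v1/models")
resp.raise_for_status()
print(resp.json())  # should list "chemgpt" among the served models
```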
|
|
|
Then, you can use the following client code to stream responses from the server:
|
|
|
```python
import requests
import json


def general_chemgpt_stream(inputs, history):
    url = 'http://localhost:6000/v1/chat/completions'
    headers = {"User-Agent": "vLLM Client"}

    # Append the new user turn to the running conversation history
    history += [{"role": "user", "content": inputs}]

    pload = {
        "model": "chemgpt",
        "stream": True,
        "messages": history
    }
    response = requests.post(url,
                             headers=headers,
                             json=pload,
                             stream=True)

    assistant_reply = ''
    for chunk in response.iter_lines(chunk_size=1,
                                     decode_unicode=False,
                                     delimiter=b"\n"):
        if chunk:
            string_data = chunk.decode("utf-8")
            try:
                # Each SSE line looks like "data: {...}"; strip the "data: " prefix
                json_data = json.loads(string_data[6:])
                delta_content = json_data["choices"][0]["delta"]["content"]
                assistant_reply += delta_content
                yield delta_content
            except KeyError:
                # The first chunk carries only the role, no content
                delta_content = json_data["choices"][0]["delta"]["role"]
            except json.JSONDecodeError:
                # The final chunk is "data: [DONE]"; store the full reply in history
                history += [{
                    "role": "assistant",
                    "content": assistant_reply,
                    "tool_calls": []
                }]
                assert '[DONE]' == string_data[6:]


inputs = '介绍一下NaOH'  # "Give me an introduction to NaOH"
history_chem = []
for response_text in general_chemgpt_stream(inputs, history_chem):
    print(response_text, end='')
```
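Since the vLLM server exposes an OpenAI-compatible API, you can also query it with the `openai` Python client instead of parsing the SSE stream by hand. This is a minimal sketch, not part of the original card; it assumes the server from the command above is running on localhost:6000 and that the `openai` package (v1+) is installed.

```python
from openai import OpenAI

# Point the OpenAI client at the local vLLM server; the API key is unused but required.
client = OpenAI(base_url="http://localhost:6000/v1", api_key="EMPTY")

# Stream a chat completion from the served "chemgpt" model
stream = client.chat.completions.create(
    model="chemgpt",
    messages=[{"role": "user", "content": "Give me an introduction to NaOH"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")
```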