|
---
license: apache-2.0
---
|
### BioinspiredMixtral: Large Language Model for the Mechanics of Biological and Bio-Inspired Materials using Mixture-of-Experts |
|
|
|
To accelerate discovery and guide insights, we report an open-source autoregressive transformer large language model (LLM) trained on expert knowledge in the field of biological materials, with a focus on mechanics and structural properties.
|
|
|
The model is fine-tuned on a corpus of over a thousand peer-reviewed articles on structural biological and bio-inspired materials, and can be prompted to recall information, assist with research tasks, and serve as an engine for creativity.
|
|
|
The model is based on mistralai/Mixtral-8x7B-Instruct-v0.1. |
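The `mistral-instruct` chat format used in the examples below wraps each user turn in `[INST] ... [/INST]` tags. For reference, a minimal sketch of the raw single-turn template (llama-cpp-python assembles this string automatically from the message list, so you normally never write it by hand):

```
# Raw Mixtral instruct template for a single user turn (reference only;
# the "mistral-instruct" chat format builds this automatically)
prompt = "<s>[INST] What is spider silk? [/INST]"
```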
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/623ce1c6b66fedf374859fe7/K0GifLVENb8G0nERQAzeQ.png) |
|
|
|
This model is based on work reported in https://doi.org/10.1002/advs.202306724, but uses a mixture-of-experts strategy. |
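As a rough intuition for the mixture-of-experts strategy: each token is routed to a small subset of expert feed-forward networks (two of eight in Mixtral), and their outputs are combined with softmax gate weights, so compute scales with the number of active experts rather than the total. A minimal sketch with toy dimensions (all names and shapes here are illustrative assumptions, not the actual model code):

```
import numpy as np

# Toy top-2 mixture-of-experts routing, in the spirit of Mixtral
d_model, n_experts, top_k = 8, 8, 2
rng = np.random.default_rng(0)

W_gate = rng.standard_normal((d_model, n_experts))                 # router weights
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]

def moe_layer(x):
    logits = x @ W_gate                              # one routing score per expert
    top = np.argsort(logits)[-top_k:]                # indices of the top-2 experts
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over top-2
    # Only the selected experts run, so compute scales with top_k, not n_experts
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
print(moe_layer(token).shape)   # (8,)
```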
|
|
|
```
from llama_cpp import Llama

# Path to a local copy of the quantized GGUF weights
model_path = 'lamm-mit/BioinspiredMixtral/ggml-model-q5_K_M.gguf'
chat_format = "mistral-instruct"

llm = Llama(
    model_path=model_path,
    n_gpu_layers=-1,   # offload all layers to the GPU
    verbose=True,
    n_ctx=10000,       # context window size
    chat_format=chat_format,
    # main_gpu=0,                              # optionally pin to a specific GPU
    # split_mode=llama_cpp.LLAMA_SPLIT_LAYER,  # optionally split layers across GPUs
)
```
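Note that `model_path` above points to a local copy of the weights. If you have not downloaded them yet, one option is `hf_hub_download` from the `huggingface_hub` package; a minimal sketch (the `local_dir` value is an assumption, adjust as needed):

```
from huggingface_hub import hf_hub_download

# Fetch the quantized GGUF weights from the Hub into a local directory
# (local_dir is an illustrative choice)
model_path = hf_hub_download(
    repo_id='lamm-mit/BioinspiredMixtral',
    filename='ggml-model-q5_K_M.gguf',
    local_dir='lamm-mit/BioinspiredMixtral',
)
```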
|
|
|
Or, let `Llama.from_pretrained` download the weights directly from the Hugging Face Hub:
|
|
|
```
from llama_cpp import Llama

repo_id = 'lamm-mit/BioinspiredMixtral'   # Hugging Face repository ID, not a local path
chat_format = "mistral-instruct"

llm = Llama.from_pretrained(
    repo_id=repo_id,
    filename="*q5_K_M.gguf",   # glob pattern matching the quantized weights file
    verbose=True,
    n_gpu_layers=-1,
    n_ctx=10000,
    chat_format=chat_format,
    # main_gpu=0,              # optionally pin to a specific GPU
)
```
|
For inference: |
|
```
import time

def generate_BioMixtral(system_prompt='You are an expert in biological materials, mechanics and related topics.',
                        prompt="What is spider silk?",
                        temperature=0.0,
                        max_tokens=10000,
                        ):
    # Assemble the chat in OpenAI-style message format
    if system_prompt is None:
        messages = [
            {"role": "user", "content": prompt},
        ]
    else:
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt},
        ]

    result = llm.create_chat_completion(
        messages=messages,
        temperature=temperature,
        max_tokens=max_tokens,
    )
    return result

start_time = time.time()
result = generate_BioMixtral(system_prompt='You respond accurately.',
                             prompt="What is graphene? Answer with detail.",
                             max_tokens=512, temperature=0.7)
print(result["choices"][0]["message"]["content"])

deltat = time.time() - start_time
print("--- %s seconds ---" % deltat)

# Token count of the completion, as reported by llama.cpp
n_generated = result["usage"]["completion_tokens"]
print("Tokens per second (generation): ", n_generated / deltat)
```
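For long generations you may prefer to stream tokens as they arrive; a minimal sketch using the same `llm` object (llama-cpp-python yields OpenAI-style delta chunks when `stream=True`):

```
# Stream the completion token by token
for chunk in llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is spider silk?"}],
    max_tokens=256,
    temperature=0.7,
    stream=True,
):
    delta = chunk["choices"][0]["delta"]
    if "content" in delta:
        print(delta["content"], end="", flush=True)
print()
```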
|
|
|
arXiv: https://arxiv.org/abs/2309.08788 |