LoRA Fine Tuning for PTM Site Prediction

#6
by jbenbudd - opened

Hi there,
My team and I are working on training a model for PTM site prediction, and we are considering using ProLLaMA as the base model.

Would you recommend this model for such a task? Additionally, is it feasible to fine-tune ProLLaMA using LoRA for this specific problem?

We are currently getting familiar with LLaMA Factory and building an instruction dataset for fine-tuning. Any insights or recommendations would be greatly appreciated!

Hello!
Thanks for considering ProLLaMA. In general, I think using ProLLaMA with LoRA for PTM site prediction is feasible. Here are my suggestions:

  1. I recommend fine-tuning ProLLaMA_Stage_1 rather than ProLLaMA, since the latter has already been fine-tuned on downstream tasks.
  2. Please follow the formatting requirements: the protein sequence must be wrapped in the prefix "Seq=<" and the suffix ">".
  3. Using LLaMA Factory is a good choice. In fact, both ProLLaMA_Stage_1 and ProLLaMA should work with most LLM training libraries. My GitHub repo also provides a fine-tuning script.
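To illustrate the formatting requirement in point 2, here is a minimal sketch of how one instruction-dataset record might be built. Only the "Seq=<" prefix and ">" suffix come from the requirement above. The instruction text, the comma-separated 1-based output format, and the helper names `wrap_sequence` and `make_record` are assumptions for illustration, not part of ProLLaMA. The `instruction`/`input`/`output` keys follow the Alpaca-style layout that LLaMA Factory accepts.

```python
import json

SEQ_PREFIX = "Seq=<"   # prefix required by the model's input format
SEQ_SUFFIX = ">"       # suffix required by the model's input format

# Hypothetical instruction text; pick whatever wording suits your task.
DEFAULT_INSTRUCTION = "[Predict PTM sites]"


def wrap_sequence(seq: str) -> str:
    """Strip whitespace, uppercase, and wrap a sequence as Seq=<...>."""
    cleaned = "".join(seq.split()).upper()
    if not cleaned.isalpha():
        raise ValueError(f"not a plain amino-acid sequence: {seq!r}")
    return f"{SEQ_PREFIX}{cleaned}{SEQ_SUFFIX}"


def make_record(seq: str, sites, instruction: str = DEFAULT_INSTRUCTION) -> dict:
    """Build one Alpaca-style record.

    `sites` holds 1-based residue positions; encoding them as a
    comma-separated string is an assumed target format.
    """
    return {
        "instruction": instruction,
        "input": wrap_sequence(seq),
        "output": ",".join(str(p) for p in sorted(sites)),
    }


if __name__ == "__main__":
    # Toy sequence and positions, made up purely to show the record shape.
    record = make_record("MKTSY", [5, 4])
    print(json.dumps(record, indent=2))
```

A list of such records can be written out with `json.dump` and registered as a custom dataset in LLaMA Factory. How the target sites are best encoded (positions, per-residue labels, or something else) is a design choice worth experimenting with.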

Feel free to keep in touch about any issues you run into.
Best regards!
