genbio-ai/AIDO.RNA-650M · Could you please make the tokenizer_config.json/vocab.txt file public?

27 days ago

Hello, I am a student working on a project about parameter-efficient fine-tuning of a base model, which involves modifying the model structure. Would it be possible to open-source the model further, in particular the tokenizer_config.json and vocab.txt files?

probablybots (GenBio AI org), 26 days ago, edited 26 days ago

Hi @DualK, the model architectures and tokenizers are fully open-sourced on GitHub: https://github.com/genbio-ai/ModelGenerator

Here is the vocabulary you're looking for: https://github.com/genbio-ai/ModelGenerator/blob/main/modelgenerator/huggingface_models/rnabert/vocab.txt
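If it helps, here is a minimal sketch of how a BERT-style vocab.txt is usually read, assuming one token per line with the line index used as the token id. The sample vocabulary, the character-level `encode` helper, and the `[UNK]` fallback are hypothetical illustrations, not the actual contents of the linked file or the ModelGenerator tokenizer.

```python
from io import StringIO

# Hypothetical sample vocabulary for illustration only; the real vocab.txt
# in the ModelGenerator repo may contain different tokens in a different order.
SAMPLE_VOCAB = "[PAD]\n[MASK]\n[CLS]\n[SEP]\n[UNK]\nA\nG\nC\nU\n"


def load_vocab(lines):
    """Map each non-empty line to its zero-based line index."""
    vocab = {}
    for index, line in enumerate(lines):
        token = line.rstrip("\n")
        if token:
            vocab[token] = index
    return vocab


def encode(sequence, vocab, unk_token="[UNK]"):
    """Character-level lookup (an assumption), unknown characters map to unk_token."""
    unk_id = vocab[unk_token]
    return [vocab.get(char, unk_id) for char in sequence]


vocab = load_vocab(StringIO(SAMPLE_VOCAB))
ids = encode("GAUC", vocab)
print(ids)
```

To try this on the real file, pass an open file handle for a downloaded vocab.txt to `load_vocab` instead of the `StringIO` sample.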

We also have HF Transformers LoRA PEFT enabled in ModelGenerator, with some nice low-memory checkpointing behavior. See this example command: https://genbio-ai.github.io/ModelGenerator/quick_start/#use-lora-for-parameter-efficient-finetuning. Feel free to open PRs against the ModelGenerator repo if you design PEFT techniques you'd like to share with the community.
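For anyone new to LoRA, the idea can be sketched in plain Python: the pretrained weight W stays frozen, and a low-rank update B·A, scaled by alpha / rank, is added to its output. This is a generic illustration of the technique, not ModelGenerator's or the peft library's implementation; the matrices, the 1024 layer size, and the helper names below are made up for the example.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha):
    """Compute W x + (alpha / rank) * B (A x), where only A and B would be trained."""
    rank = len(A)  # A has shape (rank, d_in), B has shape (d_out, rank)
    scale = alpha / rank
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


def trainable_fraction(d_out, d_in, rank):
    """Ratio of LoRA parameters, rank * (d_out + d_in), to the full d_out * d_in weight."""
    return rank * (d_out + d_in) / (d_out * d_in)


# Toy values, hypothetical and unrelated to AIDO.RNA-650M.
W = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]]  # frozen 2x3 weight
A = [[0.5, -0.5, 1.0]]                  # rank-1 down-projection
B = [[0.0], [0.0]]                      # up-projection, zero-initialized
x = [1.0, 1.0, 1.0]

# With B at zero, the adapted layer reproduces the frozen layer exactly.
print(lora_forward(W, A, B, x, alpha=2.0))
# Fraction of weights trained for a hypothetical 1024x1024 layer at rank 8.
print(trainable_fraction(1024, 1024, 8))
```

Zero-initializing B is the usual choice because fine-tuning then starts from the unmodified base model.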

probablybots changed discussion status to closed 26 days ago

Could you please make the tokenizer_config.json/vocab.txt file public?