ligeti committed on
Commit
cda10c4
·
verified ·
1 Parent(s): 612b816

Update README.md

Files changed (1)
  1. README.md +29 -1
README.md CHANGED
@@ -7,4 +7,32 @@ sdk: static
  pinned: false
  ---
 
- Edit this `README.md` markdown file to author your organization card 🔥
+ # Neural Bioinformatics Research Group - ProkBERT Models
+
+ Welcome to the official Hugging Face organization for the Neural Bioinformatics Research Group. Our main goal is to provide genomic language models for microbiome applications.
+
+ ## Models
+
+ We provide access to a collection of pretrained and fine-tuned models from the ProkBERT family. These models are built on Local Context Aware (LCA) tokenization, which is tailored to DNA sequences and balances context size against performance.
+
+ ProkBERT models are designed for microbiome-related tasks, such as prokaryote promoter identification or phage detection. Despite their compact size, they are powerful and efficient.
+
+ ## Model Overview
+
+ | Model       | Parameters | Tokenizer      | Layers | Attention Heads | Max. Context Size | Training Data  |
+ |-------------|------------|----------------|--------|-----------------|-------------------|----------------|
+ | `mini`      | 20.6M      | 6-mer, shift=1 | 6      | 6               | 1027 nt           | 206.65 billion |
+ | `mini-c`    | 24.9M      | 1-mer          | 6      | 6               | 1022 nt           | 206.65 billion |
+ | `mini-long` | 26.6M      | 6-mer, shift=2 | 6      | 6               | 4096 nt           | 206.65 billion |
+
+ _A comprehensive overview of model parameters across varied configurations._
+
+ ## Resources
+
+ - [Read our paper](https://www.frontiersin.org/articles/10.3389/fmicb.2023.1331233/full)
+ - [Learn more about the model](https://github.com/nbrg-ppcu/prokbert)
+ - [Get started with code on GitHub](https://github.com/nbrg-ppcu/prokbert)
+
+ ---
+
+ For more information or questions, please visit our [GitHub repository](https://github.com/nbrg-ppcu/prokbert) or contact us at [email]([email protected]).
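
The `6-mer, shift=1` and `6-mer, shift=2` entries in the Tokenizer column can be read as overlapping k-mer windows that advance a fixed number of nucleotides per token. The following is a minimal sketch of that idea, assuming a plain sliding window over the sequence. The function name `kmer_tokenize` is hypothetical and is not part of the ProkBERT package; the actual LCA tokenizer may also handle special tokens, vocabulary lookup, and padding, none of which is modelled here.

```python
def kmer_tokenize(sequence: str, k: int = 6, shift: int = 1) -> list[str]:
    """Split a DNA sequence into overlapping k-mers.

    Hypothetical helper for illustration only (not the ProkBERT API).
    Each window is `k` nucleotides long and starts `shift` nucleotides
    after the previous one.
    """
    if k <= 0 or shift <= 0:
        raise ValueError("k and shift must be positive")
    sequence = sequence.upper()
    # The last full window starts at len(sequence) - k; a shorter tail is dropped.
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, shift)]


if __name__ == "__main__":
    seq = "ATGCGTACGTTAG"
    print(kmer_tokenize(seq, k=6, shift=1))  # one token per nucleotide step
    print(kmer_tokenize(seq, k=6, shift=2))  # roughly half as many tokens
```

Under this sketch, a sequence of length `L` yields `(L - k) // shift + 1` tokens when `L >= k`, so a larger shift covers more nucleotides with the same number of tokens at the cost of less overlap between neighbouring tokens.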