ligeti committed (verified)
Commit e0e68ba · 1 Parent(s): 7e1c1a4

Update README.md

Files changed (1)
  1. README.md +33 -7
README.md CHANGED
@@ -108,15 +108,41 @@ except ImportError:
  - **Masked Language Modeling (MLM):** The MLM objective was adapted to genomic sequences by masking overlapping k-mers (see the sketch below).
  - **Training Phases:** The model was first trained with complete sequence restoration and selective masking, followed by a subsequent phase on variable-length datasets for increased complexity.

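As an illustration of the overlapping k-mer masking mentioned above (a minimal sketch, not the authors' implementation): a sequence is tokenized into overlapping k-mers, and every token that covers a chosen nucleotide is masked so the hidden base cannot leak back in through neighbouring tokens. The `k`, `shift` and `[MASK]` values below are assumptions made for the example.

```python
import random

def kmer_tokenize(seq: str, k: int = 6, shift: int = 1) -> list:
    """Split a DNA sequence into overlapping k-mers taken every `shift` bases."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, shift)]

def mask_overlapping_kmers(tokens, k, shift, nt_pos, mask_token="[MASK]"):
    """Mask every k-mer token that covers the nucleotide at position `nt_pos`.

    Because consecutive tokens overlap, hiding one base means masking a whole
    run of tokens; otherwise neighbouring tokens would reveal the masked base.
    """
    masked = list(tokens)
    for idx in range(len(tokens)):
        start = idx * shift                     # first base covered by this token
        if start <= nt_pos < start + k:         # token overlaps the masked base
            masked[idx] = mask_token
    return masked

if __name__ == "__main__":
    random.seed(0)
    seq = "ATGCGTACGTTAGCCTAGGA"
    k, shift = 6, 1                             # assumed values, illustration only
    tokens = kmer_tokenize(seq, k, shift)
    pos = random.randrange(len(seq))            # nucleotide chosen for masking
    print(f"masking base at position {pos}")
    print(mask_overlapping_kmers(tokens, k, shift, pos))
```

With `k=6` and `shift=1`, hiding a single base masks up to six consecutive tokens, which is why the masking procedure has to treat overlapping k-mers as a unit.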
- ### Evaluation Results
-
- | Metric | Result | Notes |
- |-------------------------|--------------|-------|
- | Metric 1 (e.g., Accuracy) | To be filled | |
- | Metric 2 (e.g., Precision) | To be filled | |
- | Metric 3 (e.g., Recall) | To be filled | |
-
- *Additional details and metrics can be included as they become available.*
 
+ ### Evaluation Results for ProkBERT-mini
+
+ | Model | L (sequence length) | Avg. Ref. Rank | Avg. Top1 | Avg. Top3 | Avg. AUC |
+ |-------------------|---------------------|----------------|-----------|-----------|----------|
+ | ProkBERT-mini | 128 | 0.9315 | 0.4497 | 0.8960 | 0.9998 |
+ | ProkBERT-mini | 256 | 0.8433 | 0.4848 | 0.9130 | 0.9998 |
+ | ProkBERT-mini | 512 | 0.8098 | 0.5056 | 0.9179 | 0.9998 |
+ | ProkBERT-mini | 1024 | 0.7825 | 0.5169 | 0.9227 | 0.9998 |
+
+ *Masking performance of ProkBERT-mini at different sequence lengths (L).*
+
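To make the ranking columns concrete, here is a hedged sketch (not the authors' evaluation code) of how they could be computed from a masked-language model's output probabilities, assuming `Avg. Ref. Rank` is the mean zero-based rank of the reference token among the sorted predictions and `Top1`/`Top3` are the fractions of masked positions where the reference token appears within the first one or three candidates. The AUC column is omitted, and the array shapes and toy data are illustrative only.

```python
import numpy as np

def masking_metrics(probs: np.ndarray, targets: np.ndarray, top_k: int = 3) -> dict:
    """Ranking metrics for masked-token prediction.

    probs:   (n_masked, vocab_size) predicted probabilities for each masked position
    targets: (n_masked,) vocabulary index of the reference (true) token
    """
    order = np.argsort(-probs, axis=1)                     # candidates sorted best-first
    ranks = np.argmax(order == targets[:, None], axis=1)   # zero-based rank of the reference token
    return {
        "avg_ref_rank": float(ranks.mean()),
        "top1": float((ranks == 0).mean()),
        f"top{top_k}": float((ranks < top_k).mean()),
    }

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(1000, 8))                    # toy vocabulary of 8 tokens
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    targets = rng.integers(0, 8, size=1000)
    print(masking_metrics(probs, targets))
```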
+ ### Evaluation of Promoter Prediction Tools on the E. coli Sigma70 Dataset
+
+ | Tool | Accuracy | MCC | Sensitivity | Specificity |
+ |-----------------------|----------|----------|-------------|-------------|
+ | ProkBERT-mini | **0.87** | **0.74** | 0.90 | 0.85 |
+ | ProkBERT-mini-c | **0.87** | 0.73 | 0.88 | 0.85 |
+ | ProkBERT-mini-long | **0.87** | **0.74** | 0.89 | 0.85 |
+ | CNNProm | 0.72 | 0.50 | 0.95 | 0.51 |
+ | iPro70-FMWin | 0.76 | 0.53 | 0.84 | 0.69 |
+ | 70ProPred | 0.74 | 0.51 | 0.90 | 0.60 |
+ | iPromoter-2L | 0.64 | 0.37 | 0.94 | 0.37 |
+ | Multiply | 0.50 | 0.05 | 0.81 | 0.23 |
+ | bTSSfinder | 0.46 | -0.07 | 0.48 | 0.45 |
+ | BPROM | 0.56 | 0.10 | 0.20 | 0.87 |
+ | IBPP | 0.50 | -0.03 | 0.26 | 0.71 |
+ | Promotech | 0.71 | 0.43 | 0.49 | **0.90** |
+ | Sigma70Pred | 0.66 | 0.42 | 0.95 | 0.41 |
+ | iPromoter-BnCNN | 0.55 | 0.27 | **0.99** | 0.18 |
+ | MULTiPly | 0.54 | 0.19 | 0.92 | 0.22 |
+
+ *The ProkBERT models perform remarkably consistently across the metrics assessed: all three achieve an accuracy of 0.87, placing them among the top performers in promoter prediction. This suggests that, regardless of the specific variant, the underlying methodology of the mini series is robust and effective.*
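For reference, a small sketch (not taken from the ProkBERT codebase) showing how the four columns above relate to a binary confusion matrix, using scikit-learn for accuracy and MCC; the toy labels and predictions are made up.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, matthews_corrcoef

def promoter_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """Accuracy, MCC, sensitivity and specificity for binary promoter labels
    (1 = promoter, 0 = non-promoter)."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "mcc": matthews_corrcoef(y_true, y_pred),
        "sensitivity": tp / (tp + fn),      # true-positive rate on promoters
        "specificity": tn / (tn + fp),      # true-negative rate on non-promoters
    }

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    y_true = rng.integers(0, 2, size=500)            # toy labels
    flip = rng.random(500) < 0.13                    # ~13% of labels predicted wrongly
    y_pred = np.where(flip, 1 - y_true, y_true)
    print(promoter_metrics(y_true, y_pred))
```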

  ### Ethical Considerations and Limitations