sebastiansarasti committed on
Commit
097b4b3
·
verified ·
1 Parent(s): f123fd9

updating readme file

Files changed (1)
  1. README.md +21 -1
README.md CHANGED
@@ -9,4 +9,24 @@ base_model:
  pipeline_tag: text-classification
  tags:
  - pytorch
- ---
+ ---
+
+ # Fake Job Predictor
+
+ ## Data
+ 1. The training data comes from this Kaggle dataset: https://www.kaggle.com/datasets/shivamb/real-or-fake-fake-jobposting-prediction
+ 2. The original dataset has around 18k samples. To mitigate class imbalance, the majority class (real jobs) was undersampled.
+ 3. The final dataset used for training has 4k samples.
+
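The undersampling described in step 2 can be sketched roughly as follows. This is a minimal illustration, not the author's code: the function name `undersample_majority`, the `ratio` parameter, and the toy rows are hypothetical, and the label column name `fraudulent` is an assumption about the Kaggle CSV. The README does not state which real-to-fake ratio was kept, so it is left as a parameter here.

```python
import random
from collections import Counter


def undersample_majority(rows, label_key="fraudulent", ratio=1.0, seed=0):
    """Keep every minority-class row and a random subset of the majority class.

    `ratio` is the number of majority rows kept per minority row.
    Assumes a binary label; raises ValueError otherwise.
    """
    groups = {}
    for row in rows:
        groups.setdefault(row[label_key], []).append(row)
    # Unpacking fails unless there are exactly two classes.
    minority, majority = sorted(groups.values(), key=len)

    keep = min(len(majority), int(ratio * len(minority)))
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    balanced = minority + rng.sample(majority, keep)
    rng.shuffle(balanced)
    return balanced


# Toy data (made up): 90 real postings (label 0) and 10 fake ones (label 1).
rows = [{"id": i, "fraudulent": int(i >= 90)} for i in range(100)]
balanced = undersample_majority(rows, ratio=2.0)
counts = Counter(row["fraudulent"] for row in balanced)
```

With `ratio=2.0` the toy set keeps all 10 fake rows and 20 randomly chosen real rows. Random undersampling discards data, so the retained real postings may not cover the full variety of the original set.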
+ ## Model
+ 1. Multi-head neural network: one head per text feature (the job's description, requirements, and benefits).
+ 2. Best metrics achieved:
+ - Precision: 0.83
+ - Recall: 0.65
+ - F1-score: 0.71
+
+ ### Components:
+ Text encoder: distilbert-base-uncased encodes each text input into a dense vector.
+
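A minimal PyTorch sketch of how such a multi-head layout could look. It assumes one shared encoder that maps a batch of token ids to one dense vector per example, plus a small linear head per field whose outputs are concatenated into a single logit. The class name, layer sizes, fusion by concatenation, and the `nn.EmbeddingBag` stand-in encoder are all assumptions for illustration; the README does not describe the actual architecture or training code.

```python
import torch
from torch import nn


class MultiHeadJobClassifier(nn.Module):
    """Hypothetical sketch: one head per text field, fused into one logit."""

    FIELDS = ("description", "requirements", "benefits")

    def __init__(self, encoder: nn.Module, hidden_size: int, head_size: int = 64):
        super().__init__()
        # Shared text encoder, assumed to return a [batch, hidden_size] tensor.
        self.encoder = encoder
        # One small projection head per input field.
        self.heads = nn.ModuleDict({
            name: nn.Sequential(nn.Linear(hidden_size, head_size), nn.ReLU())
            for name in self.FIELDS
        })
        # Concatenated head outputs -> single fake/real logit.
        self.classifier = nn.Linear(head_size * len(self.FIELDS), 1)

    def forward(self, inputs: dict) -> torch.Tensor:
        feats = [self.heads[name](self.encoder(inputs[name])) for name in self.FIELDS]
        return self.classifier(torch.cat(feats, dim=-1)).squeeze(-1)


torch.manual_seed(0)
# Stand-in encoder (NOT DistilBERT): averages token embeddings per example.
encoder = nn.EmbeddingBag(num_embeddings=1000, embedding_dim=32)
model = MultiHeadJobClassifier(encoder, hidden_size=32)

# Made-up batch: 4 postings, 12 token ids per field.
batch = {name: torch.randint(0, 1000, (4, 12)) for name in MultiHeadJobClassifier.FIELDS}
logits = model(batch)          # one logit per posting
probs = torch.sigmoid(logits)  # probability that each posting is fake
```

To use the encoder named in the README, the stand-in could be swapped for a wrapper around `distilbert-base-uncased`, for example one loaded with `transformers.AutoModel.from_pretrained` that returns the first-token vector of `last_hidden_state`. In that case `hidden_size` would be 768.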
+ ## Future work:
+ Train on larger datasets and with more compute resources.