adnaan05 committed
Commit 60cec19 · verified · 1 Parent(s): df2ed6d

Update README.md

  1. README.md +102 -14
README.md CHANGED
@@ -9,22 +9,110 @@ app_file: app.py
9
  pinned: false
10
  license: mit
11
  ---
 
12
 
13
- # TrueCheck - Fake News Detection System
14
 
15
- This application uses a hybrid deep learning model (BERT + BiLSTM + Attention) to detect fake news articles.
16
 
17
- ## Features
18
- - Real-time fake news detection
19
- - Confidence scoring
20
- - Attention visualization
21
- - Interactive web interface
 
 
 
 
22
 
23
- ## Model Architecture
24
- The model combines:
25
- - BERT for contextual embeddings
26
- - BiLSTM for sequence modeling
27
- - Attention mechanism for interpretability
 
28
 
29
- ## Usage
30
- Simply paste a news article into the text area and click "Analyze" to get predictions with confidence scores and attention visualizations.
 
9
  pinned: false
10
  license: mit
11
  ---
12
+ # TruthCheck: Fake News Detection with Fine-Tuned BERT
13
 
14
+ TruthCheck is an advanced fake news detection system leveraging a hybrid deep learning architecture. It combines a pre-trained BERT-base-uncased model with a BiLSTM and attention mechanism, fully fine-tuned on a curated dataset of real and fake news. The project includes robust preprocessing, feature extraction, model training, evaluation, and a Streamlit web app for interactive predictions.
15
 
16
+ ---
17
+
18
+ ## 🚀 Features
19
+ - **Hybrid Model:** BERT-base-uncased + BiLSTM + Attention
20
+ - **Full Fine-Tuning:** All layers of BERT and additional layers are trainable and optimized on the fake news dataset
21
+ - **Comprehensive Preprocessing:** Cleaning, tokenization, lemmatization, and more
22
+ - **Training & Evaluation:** Scripts for training, validation, and test evaluation
23
+ - **Interactive App:** Streamlit web app for real-time news classification
24
+ - **Ready for Deployment:** Easily extendable for research or production
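
The preprocessing bullet above names cleaning and tokenization without showing them. The following is a minimal standard-library sketch of what such a step might look like, not the repository's actual pipeline. The function names, the regular expressions, and the sample headline are all assumptions made for illustration. Lemmatization is left out because it needs an external library such as NLTK or spaCy.

```python
import re

# Hypothetical patterns; the project's real cleaning rules are not shown in the README.
TAG_RE = re.compile(r"<[^>]+>")                # HTML tags
URL_RE = re.compile(r"https?://\S+|www\.\S+")  # bare URLs
NON_ALPHA_RE = re.compile(r"[^a-z\s]")         # anything except lowercase letters and whitespace


def clean_text(text):
    """Lowercase the text, then drop tags, URLs, and non-letter characters."""
    text = text.lower()
    text = TAG_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    text = NON_ALPHA_RE.sub(" ", text)
    return " ".join(text.split())  # collapse runs of whitespace


def tokenize(text):
    """Whitespace tokenization of the cleaned text."""
    return clean_text(text).split()


sample = "<p>BREAKING: Scientists SHOCKED!!! Read more at https://example.com/story</p>"
tokens = tokenize(sample)
print(tokens)
```

Note that a BERT tokenizer applies its own subword splitting, so a cleaning step like this would typically run before that tokenizer rather than replace it.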
25
+
26
+ ---
27
+
28
+ ## 🧠 Model Details
29
+ - **Base Model:** [BERT-base-uncased](https://huggingface.co/bert-base-uncased)
30
+ - **Architecture:**
31
+ - BERT encoder (pre-trained, all layers fine-tuned)
32
+ - BiLSTM layer for sequential context
33
+ - Attention mechanism for interpretability
34
+ - Fully connected classification head
35
+ - **Fine-Tuning Technique:**
36
+ - All BERT layers are unfrozen and updated during training (full fine-tuning)
37
+ - Additional layers (BiLSTM, attention, classifier) are trained from scratch
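
To make the attention step in the architecture above more concrete, here is a small pure-Python sketch of attention pooling: each token's hidden vector gets a score, the scores are normalized with softmax, and the vectors are averaged using those weights. This is only an illustration of the general technique. The repository's PyTorch implementation is not shown in this README and may score tokens differently; `attention_pool`, the `query` vector, and the toy inputs are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse per-token vectors into a single vector using attention weights.

    hidden_states: list of per-token vectors (stand-ins for BiLSTM outputs).
    query: scoring vector (hypothetical; a real model would learn it).
    """
    # Dot-product score for each token.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Weighted sum of the token vectors, one dimension at a time.
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return pooled, weights


# Toy example: three tokens with 2-dimensional hidden states.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled, weights = attention_pool(states, query=[1.0, 1.0])
print(weights)
```

The returned `weights` are what an attention visualization would display: tokens with larger weights contributed more to the pooled vector that feeds the classification head.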
38
+
39
+ ---
40
+
41
+ ## 📥 Download Data and Model
42
+
43
+ **Raw and Processed Datasets:**
44
+ [Google Drive Link](https://drive.google.com/drive/folders/1tAhWhhhDes5uCdcnMLmJdFBSGWFFl55M?usp=sharing)
45
+
46
+ **Trained Model(s):**
47
+ [Google Drive Link](https://drive.google.com/drive/folders/1VEFa0y_vW6AzT5x0fRwmX8shoBhUGd7K?usp=sharing)
48
+
49
+ ### **Instructions:**
50
+ 1. Download the datasets and place them in the `data/` directory:
51
+ - `data/raw/` for raw files
52
+ - `data/processed/` for processed files
53
+ 2. Download the trained model (e.g., `final_model.pt` or `best_model.pt`) and place it in `models/saved/`.
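
After following the download instructions above, a quick check like the one below can confirm that the folders are where the scripts expect them. It is a hedged helper sketch, not part of the repository: `missing_dirs` and `has_checkpoint` are hypothetical names, and only the directory paths and the `final_model.pt` file name come from the README.

```python
from pathlib import Path

# Directories named in the download instructions.
EXPECTED_DIRS = ("data/raw", "data/processed", "models/saved")


def missing_dirs(project_root="."):
    """Return the expected directories that do not exist under project_root."""
    root = Path(project_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


def has_checkpoint(project_root=".", name="final_model.pt"):
    """Report whether the named checkpoint file is present in models/saved/."""
    return (Path(project_root) / "models" / "saved" / name).is_file()


if __name__ == "__main__":
    # Run from the repository root to check the current checkout.
    print("Missing directories:", missing_dirs() or "none")
    print("Checkpoint present:", has_checkpoint())
```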
54
+
55
+ ---
56
+
57
+ ## ⚙️ Setup
58
+
59
+ 1. **Clone the repository:**
60
+ ```bash
61
+ git clone https://github.com/adnaan-tariq/fake-news-detection.git
62
+ cd fake-news-detection
63
+ ```
64
+ 2. **Create and activate a virtual environment:**
65
+ ```bash
66
+ python -m venv venv
67
+ .\venv\Scripts\activate
68
+ ```
69
+ 3. **Install dependencies:**
70
+ ```bash
71
+ pip install --upgrade pip
72
+ pip install -r requirements.txt
73
+ ```
74
+
75
+ ---
76
+
77
+ ## 🏃‍♂️ Usage
78
+
79
+ ### **Train the Model**
80
+ If you want to train from scratch (after placing the data as described above):
81
+ ```bash
82
+ python -m src.train
83
+ ```
84
+
85
+ ### **Run the Streamlit App**
86
+ ```bash
87
+ streamlit run app.py
88
+ ```
89
+ - Open [http://localhost:8501](http://localhost:8501) in your browser.
90
+
91
+ ### **Test the Model**
92
+ - The app and scripts will use the model in `models/saved/final_model.pt` by default.
93
+ - For custom inference, see the example in `src/app.py`, or request a sample script.
94
+
95
+ ---
96
+
97
+ ## 📊 Results
98
+ - **Validation Accuracy:** ~93%
99
+ - **Validation F1 Score:** ~0.93
100
+ - (See training logs and visualizations for more details.)
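
For readers who want to reproduce numbers like those above on their own predictions, this is a minimal sketch of how accuracy and F1 are computed for a binary task. The README does not say whether its F1 is binary, macro, or weighted, nor which label marks fake news, so the binary F1 and the `positive=1` convention here are assumptions, and the labels are toy values rather than project results.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1


# Toy labels for illustration only (1 = fake, 0 = real is an assumed convention).
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} f1={f1:.3f}")
```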
101
 
102
+ ---
103
+
104
+ ## 📦 Data & Model Policy
105
+ - **Data and model files are NOT included in this repository.**
106
+ - Please download them from the provided Google Drive links above.
107
+
108
+
109
+ ## 🤝 Contributing
110
+ Pull requests and suggestions are welcome! For major changes, please open an issue first to discuss what you would like to change.
111
 
112
+ ---
113
+
114
+ ## 📄 License
115
+ This project is licensed under the MIT License.
116
+
117
+ ---
118