sohail-shaikh-s07 committed
Update README.md
updated readme file
README.md CHANGED
@@ -4,79 +4,100 @@ emoji: 📰
Removed lines (old README.md, where recoverable):

-sdk_version:
-# News Article Summarizer
-## Features
-To run this app locally:
-1. Install the requirements:
-pip install -r requirements.txt
-This app is ready to be deployed on Hugging Face Spaces.
-- Handles long articles by splitting them into chunks
-2. Click submit
-3. Get your summarized article instantly
-This

colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
license: mit
---

# 📰 News Article Summarizer

A news article summarization tool powered by the BART-Large-CNN model. Given a URL, the application extracts the article text and summarizes it, so you can quickly grasp the key points of any news article.

## Features

- **Smart Article Extraction**: Automatically extracts article content from news URLs
- **Advanced Summarization**: Uses BART-Large-CNN model for high-quality summaries
- **Chunk Processing**: Handles long articles by processing them in chunks
- **Clean Output**: Removes unwanted elements like ads and navigation for better results
- **User-Friendly Interface**: Simple Gradio interface for easy interaction

## 🛠️ Technology Stack

- **Python**: Core programming language
- **BART-Large-CNN**: Pretrained summarization model (BART fine-tuned on CNN/DailyMail news articles)
- **Gradio**: Web interface framework
- **BeautifulSoup4**: HTML parsing and content extraction
- **PyTorch**: Deep learning framework
- **Transformers**: Hugging Face transformers library

## Requirements

```
gradio==5.9.1
transformers
torch
beautifulsoup4
requests
nltk
```
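
To confirm that the packages above are installed before launching, a small check like the following can help. It is a sketch that uses only the standard library. The mapping from requirement names to import names is an assumption based on each package's usual import name (for example, `beautifulsoup4` is imported as `bs4`), and `missing_packages` is a hypothetical helper, not part of this project.

```python
from importlib.util import find_spec

# Requirement name -> module name used in `import` statements (assumed).
# beautifulsoup4 is the one case where the two differ.
IMPORT_NAMES = {
    "gradio": "gradio",
    "transformers": "transformers",
    "torch": "torch",
    "beautifulsoup4": "bs4",
    "requests": "requests",
    "nltk": "nltk",
}


def missing_packages(import_names=IMPORT_NAMES):
    """Return the requirement names whose module cannot be found."""
    return [pkg for pkg, module in import_names.items() if find_spec(module) is None]
```

Calling `missing_packages()` returns an empty list when every requirement is importable, and otherwise the names to install.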

## Getting Started

1. **Install Dependencies**:
   ```bash
   pip install -r requirements.txt
   ```

2. **Run the Application**:
   ```bash
   python app.py
   ```

3. **Use the App**:
   - Open the URL printed in the terminal in your browser
   - Paste a news article URL
   - Wait for the summary (processing time depends on article length)
   - Get your concise summary!
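
For orientation, `app.py` presumably wires a summarization function into a Gradio interface. The sketch below shows one way that could look. It is not the project's actual code: `summarize_url` is a hypothetical placeholder handler, and the labels and title are assumptions.

```python
def summarize_url(url: str) -> str:
    """Placeholder handler: validate the input, then summarize (stubbed here)."""
    url = url.strip()
    if not url.startswith(("http://", "https://")):
        return "Please enter a valid http(s) URL."
    # A real handler would fetch the article and run the model here.
    return f"(summary of {url} would appear here)"


def build_demo():
    """Build the Gradio UI around the handler above."""
    # Imported lazily so the handler can be exercised without gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=summarize_url,
        inputs=gr.Textbox(label="News article URL"),
        outputs=gr.Textbox(label="Summary"),
        title="News Article Summarizer",
    )
```

With Gradio installed, `build_demo().launch()` starts the local server and prints the URL to open.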

## 💡 How It Works

1. **Article Extraction**:
   - Fetches article content from the provided URL
   - Removes unwanted elements (ads, navigation, etc.)
   - Extracts main article text

2. **Text Processing**:
   - Splits long articles into manageable chunks (1024 tokens each)
   - Cleans and prepares text for summarization

3. **Summarization**:
   - Uses BART-Large-CNN model for each chunk
   - Combines summaries for a coherent final output
   - Provides clean, readable summaries
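
The three steps above can be sketched end to end as follows. This is an illustration under stated assumptions, not the project's code: it uses the standard library's `HTMLParser` in place of BeautifulSoup4 to stay dependency-free, approximates tokens by whitespace-separated words rather than the BART tokenizer, and takes the summarizer as a plain callable. All function and class names, and the set of tags to skip, are hypothetical.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as non-article clutter (an assumed list).
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class ParagraphExtractor(HTMLParser):
    """Collect text inside <p> tags, ignoring anything nested in SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.in_paragraph = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag == "p" and self.skip_depth == 0:
            self.in_paragraph = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag == "p":
            self.in_paragraph = False

    def handle_data(self, data):
        if self.in_paragraph and self.skip_depth == 0:
            self.paragraphs[-1] += data


def extract_article_text(html: str) -> str:
    """Step 1: keep paragraph text, drop navigation, scripts, and similar."""
    parser = ParagraphExtractor()
    parser.feed(html)
    cleaned = (" ".join(p.split()) for p in parser.paragraphs)
    return " ".join(p for p in cleaned if p)


def split_into_chunks(text: str, max_tokens: int = 1024) -> list[str]:
    """Step 2: split into chunks of at most max_tokens words."""
    words = text.split()
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]


def summarize_article(text: str, summarize_chunk, max_tokens: int = 1024) -> str:
    """Step 3: summarize each chunk and join the partial summaries."""
    chunks = split_into_chunks(text, max_tokens)
    return " ".join(summarize_chunk(chunk) for chunk in chunks)
```

In practice `summarize_chunk` could wrap a Hugging Face summarization pipeline; here any function from string to string works, which keeps the sketch easy to test. Subword tokenizers usually produce more tokens than words, so a real implementation would count tokens with the model's tokenizer or use a lower word limit.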

## ⚠️ Notes

- Processing time varies with article length
- A "Running..." indicator is shown while the article is being processed
- Wait for processing to finish before reading the summary
- The model can be swapped for T5 or GPT-2 to get different results
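
Swapping the model could be as small as changing one checkpoint name. The sketch below assumes the app loads its model through the Transformers `pipeline` API with the `summarization` task, which this README does not confirm, and that the installed Transformers version provides that task. The Hub IDs `facebook/bart-large-cnn` and `t5-small` are assumed choices, and `load_summarizer` is a hypothetical helper. GPT-2 is a decoder-only model, so it would likely need a text-generation setup rather than this pipeline task.

```python
# Assumed Hugging Face Hub checkpoints; swap in whichever models you prefer.
CHECKPOINTS = {
    "bart": "facebook/bart-large-cnn",
    "t5": "t5-small",
}


def load_summarizer(name: str = "bart"):
    """Return a summarization pipeline for the chosen checkpoint."""
    if name not in CHECKPOINTS:
        raise ValueError(f"Unknown model {name!r}; choose from {sorted(CHECKPOINTS)}")
    # Imported lazily: loading transformers (and the model weights) is slow.
    from transformers import pipeline

    return pipeline("summarization", model=CHECKPOINTS[name])
```

A pipeline returned this way is called on a string and returns a list of dicts with a `summary_text` key, for example `load_summarizer("t5")(text, max_length=130, min_length=30)[0]["summary_text"]`.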

## Example Usage

```python
# Example URLs to paste into the app:
example_urls = [
    "https://www.bbc.com/sport/football/articles/cvgxmzy86e4o",
    "https://globalsouthworld.com/article/biden-approves-571-million-in-defense-support-for-taiwan",
]
```

## 🤝 Contributing

Feel free to:
- Open issues
- Suggest improvements
- Submit pull requests

## License

This project is open source and available under the MIT License.