---
title: OSINT Tool
emoji: 🏢
colorFrom: gray
colorTo: indigo
sdk: streamlit
sdk_version: 1.42.2
app_file: app.py
pinned: false
license: mit
---

# OSINT Tool & GitHub Repository Analysis

## Overview
This project performs Open Source Intelligence (OSINT) analysis on GitHub repositories and fetches page titles from URLs. It also lets you upload CSV datasets to fine-tune machine learning models. It currently supports fine-tuning models such as `distilbert-base-uncased` for sequence classification tasks.

## Features
- **GitHub Repository Analysis**: Analyze GitHub repositories by entering the repository owner and name.
- **URL Title Fetcher**: Fetch titles from given URLs.
- **Dataset Upload & Model Fine-Tuning**: Upload a CSV file and fine-tune a model for sequence classification.

## Prerequisites
Before running the project, make sure you have the following dependencies installed:

- Python 3.9 or higher (the pinned Streamlit 1.42.2 does not support older versions)
- PyTorch (for model fine-tuning)
- Hugging Face Transformers
- Other dependencies listed in `requirements.txt`

## Badges
![Build Status](https://img.shields.io/github/workflow/status/canstralian/osint-tool/CI)
![Code Coverage](https://img.shields.io/codecov/c/github/canstralian/osint-tool)
![License](https://img.shields.io/github/license/canstralian/osint-tool)

## Installation

### 1. Clone the repository

```bash
git clone https://github.com/<username>/<repository>.git
cd <repository>
```

### 2. Create and activate a virtual environment

For Linux/MacOS:

```bash
python3 -m venv venv
source venv/bin/activate
```

For Windows:

```bash
python -m venv venv
.\venv\Scripts\activate
```

### 3. Install dependencies

```bash
pip install -r requirements.txt
```

### 4. Install PyTorch (if not already installed)

```bash
pip install torch
```

## Usage

### Running the Application

The app is built with Streamlit (`sdk: streamlit` in the metadata above), so start it with the Streamlit CLI:

```bash
streamlit run app.py
```

### Using the Features

#### GitHub Repository Analysis

1. Enter the GitHub repository owner and name.
2. The application will fetch details and analyze the repository.
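
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: it assumes the public GitHub REST endpoint `GET /repos/{owner}/{repo}`, and the helper names `repo_url`, `summarize_repo`, and `fetch_repo`, as well as the choice of fields, are hypothetical.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def repo_url(owner: str, name: str) -> str:
    """Build the REST endpoint for a single repository."""
    return f"{API_ROOT}/repos/{owner}/{name}"


def summarize_repo(payload: dict) -> dict:
    """Pick a few fields from the repository JSON (hypothetical selection)."""
    return {
        "full_name": payload.get("full_name"),
        "description": payload.get("description"),
        "language": payload.get("language"),
        "stars": payload.get("stargazers_count", 0),
        "forks": payload.get("forks_count", 0),
        "open_issues": payload.get("open_issues_count", 0),
    }


def fetch_repo(owner: str, name: str, timeout: float = 10.0) -> dict:
    """Fetch and summarize a public repository (requires network access)."""
    request = urllib.request.Request(
        repo_url(owner, name),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "osint-tool-sketch",  # GitHub rejects requests without a User-Agent
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return summarize_repo(json.load(response))
```

Unauthenticated requests to the GitHub API are rate limited, so a real deployment would probably send a token in an `Authorization` header.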

#### URL Title Fetcher

1. Enter a URL, and the application will fetch the title of the page.
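
One way to implement a title fetcher like this, using only the standard library, is sketched below. It is an assumption about the approach, not the project's actual code; `TitleParser`, `extract_title`, and `fetch_title` are hypothetical names.

```python
from html.parser import HTMLParser
import urllib.request


class TitleParser(HTMLParser):
    """Collects the text of the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True  # ignore any later <title> elements

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        # Collapse runs of whitespace; return None when no title text was found.
        text = " ".join("".join(self._parts).split())
        return text or None


def extract_title(html: str):
    """Return the page title from an HTML string, or None."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title


def fetch_title(url: str, timeout: float = 10.0):
    """Download a page and return its title (requires network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return extract_title(response.read().decode(charset, errors="replace"))
```

Keeping the parsing in `extract_title` separate from the download makes the logic easy to check against a literal HTML string without any network access.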

#### Dataset Upload & Model Fine-Tuning

1. Upload a CSV file (up to 200 MB).
2. Select the model for fine-tuning (e.g., `distilbert-base-uncased`).
3. Fine-tune the model for sequence classification tasks.

### Example CSV Format for Fine-Tuning

```csv
text,label
"This is an example sentence.",1
"This is another example.",0
```
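
Before fine-tuning, it can help to check that an uploaded file really has this shape. The sketch below validates the `text` and `label` columns with the standard `csv` module. The function name `load_examples` is hypothetical, and the requirement that labels be integers is an assumption drawn from the sample rows above.

```python
import csv
import io


def load_examples(csv_text: str):
    """Parse CSV text into (text, label) pairs, raising ValueError on bad rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {"text", "label"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing column(s): {sorted(missing)}")

    examples = []
    for row in reader:
        text = (row["text"] or "").strip()
        if not text:
            raise ValueError(f"line {reader.line_num}: empty text")
        try:
            label = int(row["label"])
        except (TypeError, ValueError):
            raise ValueError(f"line {reader.line_num}: label must be an integer") from None
        examples.append((text, label))
    return examples


sample = 'text,label\n"This is an example sentence.",1\n"This is another example.",0\n'
print(load_examples(sample))
```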

### Running the Model Fine-Tuning

```bash
python fine_tune.py --model distilbert-base-uncased --data dataset.csv
```
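
The contents of `fine_tune.py` are not shown in this README. As a rough sketch of how a script could accept the two flags used above, the following uses `argparse`; the default model and the help strings are assumptions, and only `--model` and `--data` come from the command itself.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command-line interface mirroring the flags in the command above."""
    parser = argparse.ArgumentParser(
        description="Fine-tune a model for sequence classification."
    )
    parser.add_argument(
        "--model",
        default="distilbert-base-uncased",  # assumed default, taken from the README example
        help="Name of the pre-trained model to fine-tune.",
    )
    parser.add_argument(
        "--data",
        required=True,
        help="Path to a CSV file with 'text' and 'label' columns.",
    )
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse argv (defaults to sys.argv[1:] when None)."""
    return build_parser().parse_args(argv)


args = parse_args(["--model", "distilbert-base-uncased", "--data", "dataset.csv"])
print(args.model, args.data)
```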

## Contributing

Contributions are welcome! Please open an issue or submit a pull request with your suggestions or fixes.

### Steps to Contribute

1. Fork the repository.
2. Clone your fork: `git clone https://github.com/<your-username>/<repository>.git`
3. Create a branch: `git checkout -b feature/your-feature`
4. Commit your changes: `git commit -m "Add feature"`
5. Push to the branch: `git push origin feature/your-feature`
6. Open a pull request.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Acknowledgments

- [Hugging Face](https://huggingface.co/) for providing the pre-trained models.
- [PyTorch](https://pytorch.org/) for the deep learning framework.
- [Shields.io](https://shields.io/) for providing awesome badges.