---
title: Dish Decode 2
emoji: 🍃
colorFrom: purple
colorTo: pink
sdk: docker
pinned: false
license: mit
short_description: Structure recipe information from videos

---
# 🍽️ Recipe Extraction API

This project is a Flask-based API that extracts structured recipe information from cooking tutorial videos! It uses the **Deepgram API** for audio transcription, **Tesseract OCR** for text extraction from video frames, and the **Gemini API** to generate a well-structured recipe document. πŸš€

---

## 📦 Project Setup

Follow these steps to set up and run the project on your local machine.

### 1️⃣ Clone the Repository

```bash
git clone <your-repo-url>
cd <your-repo-folder>
```

### 2️⃣ Install Dependencies

Make sure you have Python installed (Python 3.8 or above is recommended). Install the required libraries using pip:

```bash
pip install -r requirements.txt
```

### 3️⃣ Install Tesseract OCR

Ensure **Tesseract OCR** is installed on your system. You can download it here: [Tesseract GitHub](https://github.com/tesseract-ocr/tesseract)

Add Tesseract to your system path and make sure to note its installation location.

#### On Windows:

Add the path to `tesseract.exe` to your environment variables, e.g.:

```plaintext
C:\Program Files\Tesseract-OCR
```

#### On macOS (using Homebrew):

```bash
brew install tesseract
```

#### On Ubuntu:

```bash
sudo apt-get install tesseract-ocr
```
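
Once Tesseract is installed, a quick way to confirm that it is reachable from Python is to look it up on the system path. The snippet below is a minimal sketch using only the standard library; the helper name `find_missing_tools` is hypothetical and not part of this project.

```python
import shutil
from typing import Iterable, List


def find_missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the names of tools that cannot be found on the system PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # "tesseract" is the executable name installed by the commands above.
    missing = find_missing_tools(["tesseract"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("tesseract is available on PATH")
```

The same helper can be called with `["tesseract", "ffmpeg"]` to check both external tools at once.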

### 4️⃣ Setup Environment Variables

Create a `.env` file in the root directory and add your API keys:

```plaintext
FIRST_API_KEY=<Your Gemini API Key>
SECOND_API_KEY=<Your Deepgram API Key>
```
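
As an illustration of how these keys might be consumed, the sketch below reads both variables and fails early if either is missing. It assumes the values from `.env` have already been loaded into the process environment (for example by a loader such as `python-dotenv`; whether the project does this is an assumption), and the helper name `read_api_keys` is hypothetical.

```python
import os
from typing import Dict, Mapping

# Variable names taken from the .env example above.
REQUIRED_KEYS = ("FIRST_API_KEY", "SECOND_API_KEY")


def read_api_keys(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required API keys, raising if any is missing or empty."""
    missing = [name for name in REQUIRED_KEYS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_KEYS}
```

Checking the keys at startup tends to give a clearer error than a failed API call later in the request.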

### 5️⃣ Install FFmpeg

This project uses **FFmpeg** for converting MP4 videos to WAV audio. Install it via the following:

#### On macOS (using Homebrew):

```bash
brew install ffmpeg
```

#### On Ubuntu:

```bash
sudo apt-get install ffmpeg
```

#### On Windows:

Download FFmpeg from [FFmpeg.org](https://ffmpeg.org/download.html) and add it to your system path.
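
With FFmpeg installed, the MP4-to-WAV step can be sketched as a call to the `ffmpeg` command-line tool from Python. This is not necessarily how the project invokes it: the function names are hypothetical, and the 16 kHz mono output is an assumption chosen because it is a common input format for speech transcription.

```python
import subprocess
from typing import List


def build_ffmpeg_command(mp4_path: str, wav_path: str, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that extracts the audio track as a WAV file."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", mp4_path,           # input video
        "-vn",                    # ignore the video stream
        "-acodec", "pcm_s16le",   # 16-bit PCM audio
        "-ar", str(sample_rate),  # sample rate in Hz (assumed value)
        "-ac", "1",               # mono (assumed)
        wav_path,
    ]


def convert_mp4_to_wav(mp4_path: str, wav_path: str) -> None:
    """Run ffmpeg and raise CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(mp4_path, wav_path), check=True)
```

Passing the command as a list avoids shell quoting problems with file names that contain spaces.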

---

## 🚀 Running the Project

Start the Flask server with the following command:

```bash
python app.py
```

If everything is set up correctly, you should see:

```plaintext
 * Running on http://127.0.0.1:5000/
```

---

## 📡 API Endpoints

### ✅ Health Check

**Endpoint:** `GET /`

Check if the API is running.

```bash
curl http://127.0.0.1:5000/
```

**Response:**

```json
{
    "status": "success",
    "message": "API is running successfully!"
}
```
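
The same check can be made from Python. The sketch below uses only the standard library and treats a `"status"` of `"success"` as healthy, based on the response shown above; the function names and the 5-second timeout are assumptions.

```python
import json
import urllib.request


def is_healthy(payload: dict) -> bool:
    """Return True if a health-check payload reports success."""
    return payload.get("status") == "success"


def check_health(base_url: str = "http://127.0.0.1:5000", timeout: float = 5.0) -> bool:
    """Call GET / on a running server and interpret the JSON response."""
    with urllib.request.urlopen(base_url + "/", timeout=timeout) as response:
        return is_healthy(json.loads(response.read().decode("utf-8")))
```

`check_health()` requires the Flask server to be running; `is_healthy` can be used on its own to inspect an already parsed response.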

### 🍲 Recipe Extraction

**Endpoint:** `POST /process-video`

#### Request Body:

Send a JSON payload with a video URL:

```json
{
    "videoUrl": "<URL-of-the-cooking-video>"
}
```

#### Example Using `curl`:

```bash
curl -X POST http://127.0.0.1:5000/process-video \
-H "Content-Type: application/json" \
-d '{"videoUrl": "https://example.com/video.mp4"}'
```
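
#### Example Using Python:

An equivalent request can be sent with Python's standard library. This is a minimal sketch: the function names are hypothetical, and the generous timeout is an assumption, since downloading and analysing a video may take a while.

```python
import json
import urllib.request


def build_process_video_request(
    video_url: str, base_url: str = "http://127.0.0.1:5000"
) -> urllib.request.Request:
    """Build the POST /process-video request with a JSON body."""
    body = json.dumps({"videoUrl": video_url}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/process-video",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def process_video(video_url: str, timeout: float = 600.0) -> dict:
    """Send the request to a running server and return the parsed JSON reply."""
    request = build_process_video_request(video_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```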

#### Sample Response:

```json
{
    "**1. Recipe Name:**": "Beef Wellington",
    "**2. Ingredients List:**": "* Fillet of beef\n* Olive oil\n* Salt\n* Pepper",
    "**3. Steps for Preparation:**": "1. Sear the beef fillet\n2. Brush with mustard",
    "**4. Cooking Techniques Used:**": "* Searing\n* Wrapping",
    "**5. Equipment Needed:**": "* Hot pan\n* Blender",
    "**6. Nutritional Information:**": "High in protein and fat",
    "**7. Serving size:**": "2-4 people",
    "**8. Special Notes or Variations:**": "Use horseradish instead of mustard",
    "**9. Festive or Thematic Relevance:**": "Christmas alternative to roast turkey"
}
```
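
The keys in this response carry Markdown markers and numbering, and the values are newline-separated lists. A client may want to normalize them before use. The following is one possible sketch, not part of the API itself; the helper names and the choice of snake_case keys are assumptions.

```python
import re
from typing import Dict, List


def clean_key(raw_key: str) -> str:
    """Turn '**1. Recipe Name:**' into 'recipe_name'."""
    text = raw_key.strip().strip("*").strip()
    text = re.sub(r"^\d+\.\s*", "", text)   # drop the leading "1. "
    text = text.rstrip(":").strip()         # drop the trailing colon
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def split_items(value: str) -> List[str]:
    """Split a value into items, removing '* ' and '1. ' list markers."""
    items = []
    for line in value.splitlines():
        item = re.sub(r"^\s*(?:\*|\d+\.)\s*", "", line).strip()
        if item:
            items.append(item)
    return items


def normalize_recipe(raw: Dict[str, str]) -> Dict[str, List[str]]:
    """Map every field to a clean key and a list of items."""
    return {clean_key(key): split_items(value) for key, value in raw.items()}
```

With the sample above, `normalize_recipe(response)["ingredients_list"]` would start with `"Fillet of beef"`, and single-line fields such as the recipe name become one-element lists.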

---

## 🛠️ Key Features

- **Deepgram API** for accurate audio transcription.
- **Tesseract OCR** for extracting text from video frames.
- **Gemini API** for generating structured recipe information.
- **FFmpeg** for seamless MP4-to-WAV conversion.
- Supports both audio and video analysis for enhanced accuracy. 🎯
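
To illustrate how the audio and video sources listed above could be combined, the sketch below merges a transcript with OCR text into a single prompt for a language model. It is an assumption about the approach, not the project's actual code: the function names and prompt wording are hypothetical, and the section names are taken from the sample response.

```python
from typing import Iterable, List

# Section names as they appear in the sample response.
SECTIONS = [
    "Recipe Name",
    "Ingredients List",
    "Steps for Preparation",
    "Cooking Techniques Used",
    "Equipment Needed",
    "Nutritional Information",
    "Serving size",
    "Special Notes or Variations",
    "Festive or Thematic Relevance",
]


def dedupe_lines(lines: Iterable[str]) -> List[str]:
    """Drop blank and repeated lines (case-insensitive), keeping first-seen order."""
    seen = set()
    unique = []
    for line in lines:
        text = line.strip()
        key = text.lower()
        if text and key not in seen:
            seen.add(key)
            unique.append(text)
    return unique


def build_prompt(transcript: str, ocr_lines: Iterable[str]) -> str:
    """Combine the transcript and on-screen text into one prompt string."""
    numbered = "\n".join(f"{i}. {name}" for i, name in enumerate(SECTIONS, start=1))
    on_screen = "\n".join(dedupe_lines(ocr_lines))
    return (
        "Extract a structured recipe with these sections:\n"
        f"{numbered}\n\n"
        f"Audio transcript:\n{transcript.strip()}\n\n"
        f"On-screen text:\n{on_screen}"
    )
```

Deduplicating the OCR lines matters because the same caption usually appears in many consecutive frames.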

---

## 🧪 Testing

Use tools like **Postman** or **curl** to test the API endpoints.

---

## 🤝 Contributions

Contributions are welcome! Feel free to submit a pull request or open an issue for any enhancements or bug fixes.

---

## 📄 License

This project is licensed under the MIT License.

---

### 🌟 Happy Coding and Bon Appétit! 👨‍🍳👩‍🍳


Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference