title: Dish Decode 2
emoji: π
colorFrom: purple
colorTo: pink
sdk: docker
pinned: false
license: mit
short_description: Structure recipe information from videos
Recipe Extraction API
This project is a Flask-based API that extracts structured recipe information from cooking tutorial videos. It uses the Deepgram API for audio transcription, Tesseract OCR for text extraction from video frames, and the Gemini API to generate a structured recipe document.
Project Setup
Follow these steps to set up and run the project on your local machine.
1️⃣ Clone the Repository
git clone <your-repo-url>
cd <your-repo-folder>
2️⃣ Install Dependencies
Make sure you have Python installed (Python 3.8 or above is recommended). Install the required libraries using pip:
pip install -r requirements.txt
3️⃣ Install Tesseract OCR
Ensure Tesseract OCR is installed on your system. You can download it from the Tesseract GitHub repository.
Add Tesseract to your system path and note its installation location.
On Windows:
Add the path to tesseract.exe to your environment variables, e.g.:
C:\Program Files\Tesseract-OCR
On macOS (using Homebrew):
brew install tesseract
On Ubuntu:
sudo apt-get install tesseract-ocr
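To confirm that the install step worked, a quick check like the following may help. It is a minimal sketch that only looks for the `tesseract` executable on the system path; the helper name `find_executable` is hypothetical and not part of this project.

```python
import shutil
from typing import Optional


def find_executable(name: str) -> Optional[str]:
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable("tesseract")
    if path is None:
        print("tesseract not found on PATH; revisit the install step above")
    else:
        # This is the installation location the setup step asks you to note.
        print(f"tesseract found at {path}")
```

The same helper can be pointed at any other command-line tool the project depends on.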
4️⃣ Set Up Environment Variables
Create a .env file in the root directory and add your API keys:
FIRST_API_KEY=<Your Gemini API Key>
SECOND_API_KEY=<Your Deepgram API Key>
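Before starting the server, it can be useful to verify that both keys are actually set. The sketch below assumes the values from .env have already been loaded into the process environment (how app.py loads them is not shown here, so that part is an assumption); `missing_keys` is a hypothetical helper.

```python
import os
from typing import List, Mapping

# Variable names taken from the .env example above.
REQUIRED_KEYS = ("FIRST_API_KEY", "SECOND_API_KEY")


def missing_keys(env: Mapping[str, str]) -> List[str]:
    """Return the required keys that are absent or blank in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys(os.environ)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Both API keys are set.")
```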
5️⃣ Install FFmpeg
This project uses FFmpeg to convert MP4 videos to WAV audio. Install it as follows:
On macOS (using Homebrew):
brew install ffmpeg
On Ubuntu:
sudo apt-get install ffmpeg
On Windows:
Download FFmpeg from FFmpeg.org and add it to your system path.
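For orientation, an MP4-to-WAV conversion can be driven from Python roughly as follows. This is a sketch under assumptions, not the project's actual code: the function names, the 16 kHz mono output, and the 16-bit PCM codec are illustrative choices, while the FFmpeg flags themselves are standard.

```python
import subprocess
from typing import List


def build_ffmpeg_command(src: str, dst: str, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg argument list that extracts audio from `src` into a WAV file."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", src,                # input video
        "-vn",                    # drop the video stream
        "-acodec", "pcm_s16le",   # 16-bit PCM, the usual WAV encoding
        "-ar", str(sample_rate),  # output sample rate in Hz (assumed value)
        "-ac", "1",               # mono (assumed; the project may keep stereo)
        dst,
    ]


def convert_to_wav(src: str, dst: str) -> None:
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
```

Calling `convert_to_wav("video.mp4", "audio.wav")` requires FFmpeg to be on the system path, as described above.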
Running the Project
Start the Flask server with the following command:
python app.py
If everything is set up correctly, you should see:
* Running on http://127.0.0.1:5000/
API Endpoints
Health Check
Endpoint: GET /
Check if the API is running.
curl http://127.0.0.1:5000/
Response:
{
"status": "success",
"message": "API is running successfully!"
}
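The same health check can be scripted, for example in a startup or monitoring script. The sketch below uses only the standard library; the helper names are hypothetical, and it treats a response as healthy only when its "status" field equals "success", matching the response shown above.

```python
import json
import urllib.request


def is_healthy(body: str) -> bool:
    """Return True if `body` is a JSON object whose status is "success"."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "success"


def check_health(base_url: str = "http://127.0.0.1:5000/", timeout: float = 5.0) -> bool:
    """Call GET / on a running server and interpret the response."""
    with urllib.request.urlopen(base_url, timeout=timeout) as resp:
        return is_healthy(resp.read().decode("utf-8"))
```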
Recipe Extraction
Endpoint: POST /process-video
Request Body:
Send a JSON payload with a video URL:
{
"videoUrl": "<URL-of-the-cooking-video>"
}
Example using curl:
curl -X POST http://127.0.0.1:5000/process-video \
-H "Content-Type: application/json" \
-d '{"videoUrl": "https://example.com/video.mp4"}'
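A Python equivalent of the curl call might look like this. It is a minimal standard-library sketch; `build_request` and `process_video` are hypothetical names, and the generous timeout is an assumption, since downloading and transcribing a video may take a while.

```python
import json
import urllib.request


def build_request(
    video_url: str,
    endpoint: str = "http://127.0.0.1:5000/process-video",
) -> urllib.request.Request:
    """Build the POST request with the JSON body the endpoint expects."""
    body = json.dumps({"videoUrl": video_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def process_video(video_url: str, timeout: float = 600.0) -> dict:
    """Send the request to a running server and return the parsed JSON response."""
    with urllib.request.urlopen(build_request(video_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```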
Sample Response:
{
"**1. Recipe Name:**": "Beef Wellington",
"**2. Ingredients List:**": "* Fillet of beef\n* Olive oil\n* Salt\n* Pepper",
"**3. Steps for Preparation:**": "1. Sear the beef fillet\n2. Brush with mustard",
"**4. Cooking Techniques Used:**": "* Searing\n* Wrapping",
"**5. Equipment Needed:**": "* Hot pan\n* Blender",
"**6. Nutritional Information:**": "High in protein and fat",
"**7. Serving size:**": "2-4 people",
"**8. Special Notes or Variations:**": "Use horseradish instead of mustard",
"**9. Festive or Thematic Relevance:**": "Christmas alternative to roast turkey"
}
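The keys in the sample response carry Markdown decoration ("**1. ... :**"), and list values are newline-separated bullet strings. A client may want to normalize both. The following is a sketch of one way to do that, assuming responses follow the shape shown above; all function names are hypothetical.

```python
import re
from typing import Dict, List

# Matches a leading "**<number>. " or a trailing ":**" on a key.
_KEY_DECORATION = re.compile(r"^\*\*\d+\.\s*|:\*\*$")
# Matches a leading "* " or "<number>. " on a list line.
_BULLET = re.compile(r"^(?:\*|\d+\.)\s+")


def clean_key(key: str) -> str:
    """Turn '**1. Recipe Name:**' into 'Recipe Name'."""
    return _KEY_DECORATION.sub("", key.strip())


def split_items(value: str) -> List[str]:
    """Split a bulleted or numbered string into a list of plain items."""
    return [_BULLET.sub("", line.strip()) for line in value.split("\n") if line.strip()]


def normalize(response: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the response with decoration stripped from every key."""
    return {clean_key(key): value for key, value in response.items()}
```

For example, `split_items(normalize(data)["Ingredients List"])` would yield the ingredients as a plain Python list.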
Key Features
- Deepgram API for accurate audio transcription.
- Tesseract OCR for extracting text from video frames.
- Gemini API for generating structured recipe information.
- FFmpeg for seamless MP4-to-WAV conversion.
- Analyzes both the audio track and the video frames rather than relying on a single source.
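To illustrate how the audio transcript and the OCR text could be combined before being sent to the Gemini API, here is a rough sketch. The prompt wording, the deduplication step, and the function names are all assumptions made for illustration; only the nine section names come from the sample response above.

```python
from typing import Iterable, List

# Section names as they appear in the sample response.
RECIPE_FIELDS = (
    "Recipe Name",
    "Ingredients List",
    "Steps for Preparation",
    "Cooking Techniques Used",
    "Equipment Needed",
    "Nutritional Information",
    "Serving size",
    "Special Notes or Variations",
    "Festive or Thematic Relevance",
)


def dedupe_lines(frame_texts: Iterable[str]) -> List[str]:
    """Flatten per-frame OCR text, dropping blank and repeated lines (case-insensitive)."""
    seen = set()
    unique: List[str] = []
    for text in frame_texts:
        for line in text.splitlines():
            line = line.strip()
            if line and line.lower() not in seen:
                seen.add(line.lower())
                unique.append(line)
    return unique


def build_prompt(transcript: str, frame_texts: Iterable[str]) -> str:
    """Assemble a single prompt from both sources (hypothetical wording)."""
    fields = "\n".join(f"{i}. {name}" for i, name in enumerate(RECIPE_FIELDS, start=1))
    on_screen = "\n".join(dedupe_lines(frame_texts)) or "(none)"
    return (
        "Extract a recipe with these sections:\n"
        f"{fields}\n\n"
        f"Audio transcript:\n{transcript.strip()}\n\n"
        f"On-screen text:\n{on_screen}\n"
    )
```

Deduplicating OCR lines matters because consecutive frames usually show the same caption many times.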
Testing
Use tools like Postman or curl to test the API endpoints.
Contributions
Contributions are welcome! Feel free to submit a pull request or open an issue for any enhancements or bug fixes.
License
This project is licensed under the MIT License.
Happy Coding and Bon Appétit! 👨‍🍳👩‍🍳
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference