---
title: Chat With Pdf
emoji: 💬
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: 'AI-powered system to interact with Arabic-language PDF doc '
---

# **Arabic PDF Chat Assistant 🤖📄**

**Arabic PDF Chat Assistant** is an AI-powered system that integrates **Retrieval-Augmented Generation (RAG)** to interact with Arabic-language PDF documents. Using **Gemma**, an advanced model from Google, the system allows users to upload PDF files and ask questions in Arabic. The assistant generates precise answers based on relevant document content, making it ideal for professionals, researchers, and students dealing with Arabic texts.

## Table of Contents 📑
- [Overview](#overview)
- [Features](#features) ✨
- [Model](#model) 🧠
- [Installation](#installation) 🔧
- [Usage](#usage) 🚀
- [Gradio Interface](#gradio-interface) 💻
- [License](#license) 📜

## Overview
The **Arabic PDF Chat Assistant** leverages **RAG (Retrieval-Augmented Generation)** technology to provide accurate answers by combining information retrieval and language generation capabilities. It is designed to analyze Arabic-language PDF documents and extract relevant content for natural language queries in Arabic.

The system is ideal for extracting key information from large, complex Arabic PDF documents and presenting it in a concise, professional manner.

## Features ✨
- **Arabic Language Support** 📝: Fully supports queries and responses in Modern Standard Arabic (الفصحى).
- **Retrieval-Augmented Generation (RAG)** 📚: Enhances the assistant's ability to fetch information from the document and generate answers based on the retrieved content.
- **PDF Querying** 📄: Upload any Arabic PDF document and ask questions in Arabic. The assistant will retrieve relevant information from the document and generate a clear response.
- **Gemma-powered Responses** 🧠: Uses **Gemma**, an open language model from Google, to understand Arabic questions and generate answers.
- **Audio Responses** 🔊: The assistant can generate audio responses in Arabic using Text-to-Speech (TTS) technology.
- **Interactive Gradio Interface** 💻: A simple and intuitive web interface for chatting, uploading PDFs, and listening to audio responses.

## Model 🧠
The **Arabic PDF Chat Assistant** uses a RAG architecture powered by **Gemma**, a language model from Google, applied here to Arabic question answering. The system pairs information retrieval with language generation, so its responses are grounded in the contents of the uploaded PDF documents.

### Model Details
- **Base Model**: Gemma (Google’s open language model family), used here for Arabic understanding.
- **Architecture**: **Retrieval-Augmented Generation (RAG)**, which combines a retriever to find relevant parts of the document and a generator to create well-formed responses.
- **Training Dataset**: Fine-tuned using Arabic-language datasets for document comprehension and question-answering.
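
To make the retriever/generator split concrete, here is a minimal sketch of the retrieval half only. It is not the project's actual implementation: it assumes the PDF text has already been extracted, and it ranks chunks by simple word overlap rather than the embedding-based search a RAG system would typically use. The names `split_into_chunks`, `score`, and `retrieve` are hypothetical.

```python
from collections import Counter
from typing import List


def split_into_chunks(text: str, chunk_size: int = 40) -> List[str]:
    """Split text into chunks of at most `chunk_size` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def score(question: str, chunk: str) -> int:
    """Count how many question words (with multiplicity) also occur in the chunk."""
    q_counts = Counter(question.split())
    c_counts = Counter(chunk.split())
    return sum(min(n, c_counts[w]) for w, n in q_counts.items())


def retrieve(question: str, chunks: List[str], top_k: int = 2) -> List[str]:
    """Return the `top_k` chunks with the highest overlap score, best first."""
    ranked = sorted(chunks, key=lambda c: score(question, c), reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    # Toy stand-in for text extracted from an Arabic PDF (assumption for illustration).
    document = "القاهرة عاصمة مصر . الرياض عاصمة السعودية . بغداد عاصمة العراق ."
    chunks = split_into_chunks(document, chunk_size=4)
    print(retrieve("ما هي عاصمة مصر", chunks, top_k=1))
```

Word overlap is used here only because it needs no external dependencies; the chunks it returns would then be passed to the generator as context.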

## Installation 🔧

### Clone the repository
```bash
git clone https://github.com/your-repository/Arabic-PDF-Chat.git
cd Arabic-PDF-Chat
```

### Install dependencies
Make sure Python is installed, then run:
```bash
pip install -r requirements.txt
```

### API Keys (Optional)
If you use any external services (e.g., the Hugging Face Hub to download gated Gemma weights, or a hosted Gemma API), make sure you have the necessary API keys or access tokens.
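
One common way to supply such a key is through an environment variable rather than hard-coding it in `app.py`. The sketch below is an assumption about how this could be wired, not the repository's actual code: `HF_TOKEN` is the variable name that Hugging Face tooling conventionally reads, and `load_token` is a hypothetical helper.

```python
import os
from typing import Optional


def load_token(var_name: str = "HF_TOKEN") -> Optional[str]:
    """Return the token stored in the environment, or None if it is unset or blank."""
    token = os.environ.get(var_name, "").strip()
    return token or None


if __name__ == "__main__":
    if load_token() is None:
        print("No token found; gated models or hosted APIs may reject requests.")
    else:
        print("Token loaded from the environment.")
```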

## Usage 🚀

1. **Launch the application**: 
   Run the following command to launch the app in your browser:
   ```bash
   python app.py
   ```

2. **Upload PDF**: 
   Upload your Arabic-language PDF document through the Gradio interface.

3. **Ask Questions**: 
   Enter your question in Arabic. The RAG system will retrieve relevant information from the PDF and generate a response based on the content.

4. **Listen to Audio Responses (Optional)**: 
   If enabled, you can listen to the assistant’s response via audio generated using Text-to-Speech (TTS).
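
Step 3 can be pictured as gathering the retrieved passages, placing them in a prompt together with the question, and handing that prompt to the generator. The sketch below covers only the prompt-assembly part. The function name `build_prompt`, the instruction wording, and the passage numbering are illustrative assumptions, not taken from the app.

```python
from typing import List


def build_prompt(question: str, passages: List[str]) -> str:
    """Combine retrieved passages and the user's question into one generator prompt."""
    # Number the passages so they stay distinguishable inside the prompt.
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question in Arabic using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    # Example passages stand in for chunks retrieved from an uploaded PDF.
    passages = ["القاهرة عاصمة مصر .", "الرياض عاصمة السعودية ."]
    print(build_prompt("ما هي عاصمة مصر؟", passages))
```

Restricting the model to the supplied context is what ties the answer to the PDF rather than to the model's general knowledge.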

## Gradio Interface 💻

The user interface is built using **Gradio**, making it accessible and easy to use. Here's an example of the setup:

```python
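import gradio as gr

def chat(user_input, history):
    # Placeholder: replace with the RAG pipeline (retrieve from the PDF, then generate).
    answer = "..."
    # Append the new exchange to the conversation shown in the Chatbot component.
    return (history or []) + [[user_input, answer]]

with gr.Blocks() as demo:
    with gr.Row():
        gr.Image("logo.png")  # assumes logo.png sits next to app.py
    with gr.Row():
        pdf_input = gr.File(label="اختر ملف PDF")  # "Choose a PDF file"
    with gr.Row():
        chatbot = gr.Chatbot()  # conversation display used as the click output
    with gr.Row():
        chat_input = gr.Textbox(placeholder="أدخل سؤالك هنا ..")  # "Enter your question here .."
    with gr.Row():
        submit_button_chat = gr.Button("إرسال")  # "Send"

    # Pass the question and the current history in; write the updated history back.
    submit_button_chat.click(chat, inputs=[chat_input, chatbot], outputs=chatbot)

demo.launch()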
```

## License 📜
This project is licensed under the Apache License 2.0. See the `LICENSE` file for more details.