Commit e9e2b45 by lukmanaj · verified · 1 Parent(s): bffa0c1

Update README.md

Files changed (1): README.md +136 -1
README.md CHANGED
@@ -9,4 +9,139 @@ app_file: app.py
  pinned: false
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
+ # 📚 Mistral RAG Chat - Document Question Answering
+
+ **Chat with your documents using Mistral AI's powerful language models!**
+
+ Upload a `.txt` document and ask questions about its content. This app uses Retrieval-Augmented Generation (RAG) to provide accurate, context-aware answers based on your uploaded document.
+
+ ## 🚀 Features
+
+ - **📄 Document Upload**: Support for `.txt` files
+ - **🔍 Smart Retrieval**: Uses FAISS vector search to find relevant content
+ - **🤖 Mistral AI**: Powered by Mistral's large language model
+ - **💬 Chat Interface**: Intuitive conversation-style interaction
+ - **⚡ Fast Processing**: Efficient document chunking and embedding
+
+ ## 🛠️ How It Works
+
+ 1. **Upload** your text document (.txt format)
+ 2. **Process** the document (creates searchable embeddings)
+ 3. **Ask** questions about the content
+ 4. **Get** accurate answers based on the document context
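Step 4 depends on handing the retrieved passages to the model together with the question. The following is a minimal sketch of that prompt assembly. The `build_prompt` helper and the template wording are assumptions made for illustration; this README does not show the prompt the app actually sends.

```python
def build_prompt(question: str, retrieved_chunks: list[str]) -> str:
    """Combine retrieved document chunks and the user's question into one prompt.

    Hypothetical helper: the template text below is illustrative only.
    """
    # Separate chunks with a visible delimiter so passage boundaries stay clear.
    context = "\n---\n".join(chunk.strip() for chunk in retrieved_chunks)
    return (
        "Context information is below.\n"
        f"{context}\n"
        "Given the context information and not prior knowledge, "
        "answer the query.\n"
        f"Query: {question}\n"
        "Answer:"
    )
```

Restricting the model to the supplied context is what keeps answers tied to the uploaded document rather than to the model's general knowledge.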
+
+ ## 💡 Use Cases
+
+ - **📖 Research Papers**: Ask questions about academic papers
+ - **📋 Company Documents**: Query policy manuals, reports, handbooks
+ - **📚 Educational Content**: Study materials, textbooks, lecture notes
+ - **📰 News Articles**: Analyze and understand news content
+ - **📄 Legal Documents**: Extract key information from contracts, agreements
+
+ ## 🎯 Example Queries
+
+ After uploading a document, try asking:
+ - "What is the main topic of this document?"
+ - "Summarize the key points"
+ - "What does the author say about [specific topic]?"
+ - "Are there any statistics or numbers mentioned?"
+ - "What conclusions does the document reach?"
+
+ ## 🔧 Technical Details
+
+ - **Embedding Model**: `mistral-embed` for document vectorization
+ - **LLM**: `mistral-large-latest` for answer generation
+ - **Vector Database**: FAISS for similarity search
+ - **Chunk Size**: 2048 characters per chunk
+ - **Retrieval**: Top-2 most relevant chunks per query
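The chunking and retrieval settings above can be sketched in plain Python. This is an illustration under stated assumptions, not the app's code: it assumes fixed-size, non-overlapping chunks, uses Euclidean distance as the similarity measure, and replaces FAISS with a brute-force search. The helper names and the toy 2-dimensional vectors are hypothetical; real vectors would come from `mistral-embed`.

```python
import math

CHUNK_SIZE = 2048  # characters per chunk, as listed above
TOP_K = 2          # chunks retrieved per query, as listed above


def chunk_text(text: str, size: int = CHUNK_SIZE) -> list[str]:
    """Split text into consecutive, non-overlapping chunks of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def top_k_chunks(query_vec, chunk_vecs, k=TOP_K):
    """Return indices of the k chunk vectors nearest to query_vec.

    Brute-force Euclidean search, standing in for a FAISS index lookup.
    """
    distances = [math.dist(query_vec, vec) for vec in chunk_vecs]
    ranked = sorted(range(len(chunk_vecs)), key=lambda i: distances[i])
    return ranked[:k]


# A 5000-character document yields two full chunks and one partial chunk.
print([len(c) for c in chunk_text("a" * 5000)])  # [2048, 2048, 904]

# Toy vectors: the query is closest to index 1, then index 0.
print(top_k_chunks([0.9, 1.0], [[0, 0], [1, 1], [5, 5]]))  # [1, 0]
```

With only two chunks retrieved per query, at most about 4096 characters of the document reach the model for any single question.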
+
+ ## Supported Formats
+
+ Currently supports:
+ - `.txt` files (UTF-8 encoded)
+
+ *More formats coming soon!*
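Because only UTF-8 `.txt` files are supported, a file can be checked before it is processed. The sketch below shows one way to do that; `load_txt` is a hypothetical helper and its error messages are illustrative, not the app's own.

```python
from pathlib import Path


def load_txt(path: str) -> str:
    """Read a .txt file as UTF-8, raising ValueError with a readable message on failure."""
    file_path = Path(path)
    if file_path.suffix.lower() != ".txt":
        raise ValueError(f"Unsupported format {file_path.suffix!r}: only .txt is supported")
    try:
        # Decode explicitly so a wrong encoding fails loudly instead of producing garbage.
        text = file_path.read_bytes().decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError(f"{file_path.name} is not valid UTF-8") from exc
    if not text.strip():
        raise ValueError(f"{file_path.name} contains no readable text")
    return text
```

A file saved in another encoding (for example UTF-16 or Latin-1) would need to be re-saved as UTF-8 before uploading.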
+
+ ## 🚦 Getting Started
+
+ 1. Click **"Upload Text File"** and select your document
+ 2. Click **"Process Document"** and wait for confirmation
+ 3. Start asking questions in the chat interface
+ 4. Get instant, context-aware answers!
+
+ ## ⚠️ Important Notes
+
+ - **File Size**: Keep documents under 10MB for best performance
+ - **Language**: Works best with English text
+ - **Context**: The AI only knows what's in your uploaded document
+ - **Privacy**: Documents are processed temporarily and not stored permanently
+
+ ## 🔐 Privacy & Security
+
+ - Your documents are processed in real-time
+ - No permanent storage of uploaded files
+ - Conversations are not logged or saved
+ - API calls are made securely to Mistral AI
+
+ ## 🆘 Troubleshooting
+
+ **Document won't process?**
+ - Ensure your file is in `.txt` format
+ - Check that the file contains readable text
+ - Try a smaller file if you're having issues
+
+ **Getting irrelevant answers?**
+ - Make sure your question relates to the document content
+ - Try rephrasing your question more specifically
+ - Check that the document was processed successfully
+
+ **Error messages?**
+ - Refresh the page and try again
+ - Ensure your document is properly formatted
+ - Contact support if issues persist
+
+ ## 🚀 Built With
+
+ - **[Gradio](https://gradio.app/)**: Web interface framework
+ - **[Mistral AI](https://mistral.ai/)**: Language model and embeddings
+ - **[FAISS](https://faiss.ai/)**: Vector similarity search
+ - **[NumPy](https://numpy.org/)**: Numerical computing
+ - **[Hugging Face Spaces](https://huggingface.co/spaces)**: Hosting platform
+
+ ## 📊 Model Information
+
+ - **Base Model**: Mistral Large (latest version)
+ - **Embedding Dimension**: 1024
+ - **Context Window**: 32k tokens (as listed when this README was written; `mistral-large-latest` is an alias, so the actual limit depends on the model it currently points to)
+ - **Languages**: Optimized for English, supports multiple languages
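The embedding dimension gives a rough idea of how much memory the vector index needs. The estimate below is a sketch that assumes vectors are stored as 32-bit floats and that chunks are 2048 characters each, as listed under Technical Details; index overhead is ignored and `index_size_bytes` is a hypothetical helper.

```python
import math

EMBEDDING_DIM = 1024   # from the list above
CHUNK_SIZE = 2048      # characters per chunk, from Technical Details
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32


def index_size_bytes(num_chars: int) -> int:
    """Estimate raw vector storage for a document of `num_chars` characters."""
    num_chunks = math.ceil(num_chars / CHUNK_SIZE)
    return num_chunks * EMBEDDING_DIM * BYTES_PER_FLOAT32


# A document that splits into 15 chunks needs 60 KiB of vectors.
print(index_size_bytes(15 * 2048))         # 61440

# About 10 MB of single-byte text splits into 5120 chunks, or 20 MiB of vectors.
print(index_size_bytes(10 * 1024 * 1024))  # 20971520
```

Under these assumptions each chunk costs 4 KiB of vector storage, twice the size of the text it represents.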
+
+ ## 🎨 Interface Preview
+
+ ```
+ 📚 RAG Chat Interface
+ Upload a text file and chat with its content!
+
+ [Upload Text File] [Process Document]
+
+ Processing Status: Document processed successfully! Split into 15 chunks.
+
+ 💬 Chat:
+ User: What is this document about?
+ Assistant: Based on the uploaded document, this appears to be about...
+
+ [Your Message: Ask questions about the uploaded document...]
+ [Send] [Clear Chat]
+ ```
+
+
+ ## 📄 License
+
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+
+ ---
+
+ **Made with ❤️ using Mistral AI and Gradio**
+
+ *Try it now - upload a document and start chatting!*