Trisandhya committed on
Commit
ce2c75c
·
1 Parent(s): 6fd5cfa

update readme

Files changed (2)
  1. README.md +69 -0
  2. app.py +9 -0
README.md CHANGED
@@ -0,0 +1,69 @@
+ # OCR Web Application
+
+ ## Project Overview
+ This is a **web-based Optical Character Recognition (OCR) application** built with Streamlit. The app supports English and Hindi, letting users upload an image and extract its text with an OCR model.
+
+ ## How the Application Works
+ 1. Choose Language: Select either English or Hindi, following the instructions in the sidebar.
+ 2. Upload Image: Use the file uploader to provide an image in JPG, JPEG, or PNG format.
+ 3. Text Extraction: For English, the app uses the GOT OCR 2.0 model; for Hindi, it uses EasyOCR.
+ 4. Keyword Search: After extraction, search for specific keywords in the extracted text. Matching keywords are highlighted, and any keywords not found are listed in a warning message.
+ 5. Reset: If needed, reset the session and upload a new image to start over.
+
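The keyword search step described in the README could be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the function name `search_keywords`, the case-insensitive matching, and the use of `<mark>` tags for highlighting are all assumptions made for the example.

```python
import re


def search_keywords(text, keywords):
    """Return (highlighted_text, missing_keywords) for a case-insensitive search.

    Hypothetical helper: the real app may match and highlight differently.
    """
    # Ignore empty entries and surrounding whitespace.
    cleaned = [kw.strip() for kw in keywords if kw.strip()]
    lowered = text.lower()
    found = [kw for kw in cleaned if kw.lower() in lowered]
    missing = [kw for kw in cleaned if kw.lower() not in lowered]

    highlighted = text
    if found:
        # Try longer keywords first so a phrase wins over a word it contains.
        ordered = sorted(found, key=len, reverse=True)
        pattern = "|".join(re.escape(kw) for kw in ordered)
        highlighted = re.sub(
            pattern,
            lambda match: f"<mark>{match.group(0)}</mark>",
            text,
            flags=re.IGNORECASE,
        )
    return highlighted, missing
```

In a Streamlit app, the highlighted string could be rendered with `st.markdown(highlighted, unsafe_allow_html=True)` and the missing keywords reported with `st.warning(...)`.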
+ ## Installation and Setup
+
+ ### Prerequisites:
+ - **Python 3.8 or higher**
+ - Required libraries listed in `requirements.txt`
+
+ ### Installation Steps:
+ 1. **Clone the repository:**
+    ```bash
+    git clone https://github.com/Trisandhyadevi/OCR.git
+    ```
+ 2. **Navigate to the project directory:**
+    ```bash
+    cd OCR
+    ```
+ 3. **Install the required dependencies:**
+    ```bash
+    pip install -r requirements.txt
+    ```
+ 4. **Run the application:**
+    ```bash
+    streamlit run app.py
+    ```
+
+ ## Description
+
+ This web application converts images to text using the GOT OCR 2.0 model for English and EasyOCR for Hindi. The key features of the GOT OCR 2.0 model are summarized below.
+
+ ## GOT OCR 2.0 Model Overview
+
+ The GOT OCR 2.0 model is a state-of-the-art OCR system designed for accurate text extraction from images. Key features include:
+
+ - **Multi-task Learning**: The model supports various tasks beyond OCR, including layout analysis and object detection, making it versatile for diverse text recognition needs.
+ - **End-to-End Pipeline**: It processes entire images, identifying and extracting text without additional preprocessing steps.
+
+ Note: The model does not currently support all languages. Fine-tuning is required for languages not covered by the pre-trained model. For more information on fine-tuning, see the [GOT OCR 2.0 Fine-tuning Guide](https://github.com/Ucas-HaoranWei/GOT-OCR2.0/?tab=readme-ov-file#fine-tune).
+
+ For more technical details about the model architecture and usage, see the [GOT OCR 2.0 Model Documentation](https://github.com/Ucas-HaoranWei/GOT-OCR2.0/?tab=readme-ov-file#general-ocr-theory-towards-ocr-20-via-a-unified-end-to-end-model).
+
+
+ ## Deployment
+ The application can be deployed to a cloud platform (Hugging Face).
+
+ ## Folder Structure
+ ```bash
+ .
+ ├── app.py            # Main application file
+ ├── requirements.txt  # Python dependencies
+ └── README.md         # Project documentation
+ ```
+
+ ## Dependencies
+ 1. Streamlit: Web framework used to build the interactive interface.
+ 2. Transformers: Loads the GOT OCR 2.0 model.
+ 3. EasyOCR: Extracts Hindi text.
+ 4. Torchvision: Handles image transformations.
+ 5. Pillow: Image processing library.
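One way to confirm that the dependencies listed above are installed before launching the app is sketched below. The helper `missing_modules` is hypothetical and not part of the repository, and the import names are assumptions based on each package's usual top-level module (Pillow, for example, is imported as `PIL`).

```python
from importlib.util import find_spec

# Assumed mapping from package name to top-level import name.
REQUIRED = {
    "streamlit": "streamlit",
    "transformers": "transformers",
    "easyocr": "easyocr",
    "torchvision": "torchvision",
    "Pillow": "PIL",
}


def missing_modules(required=REQUIRED):
    """Return the package names whose import module cannot be located."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

This only checks that each module can be found; it does not verify versions, so `requirements.txt` remains the source of truth.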
app.py CHANGED
@@ -75,6 +75,15 @@ if 'reset' not in st.session_state:
 
  if 'language' not in st.session_state:
      st.session_state.language = False
+
+ with st.sidebar:
+     st.header("Instructions")
+     st.write("1. Choose a language (English or Hindi).")
+     st.write("2. Upload an image in JPG, PNG, or JPEG format.")
+     st.write("3. The app will extract text from the image using OCR.")
+     st.write("4. Enter keywords to search within the extracted text.")
+     st.write("5. If needed, click 'Reset' to upload a new image.")
+
 
  st.header("Optical Character Recognition ")
  col1, col2 = st.columns(2)