metadata
title: Automatic Reimbursement Tool Demo
emoji: ⚡
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: 3.39.0
app_file: app.py
pinned: false
python_version: 3.9.17
Automatic Reimbursement Tool
The aim of this tool is to automate the information extraction process involved in reimbursement filing. It leverages language models like ChatGPT for categorization and extraction of relevant details from uploaded receipts.
Features
- Text Extraction - The tool extracts text from uploaded receipts.
- Categorization - It categorizes the extracted text to identify relevant details.
- Chatbot Interaction - Interacts with a language model to extract specific information based on the category.
- Excel Output - Generates an Excel file containing extracted details.
Usage
1. Input: Upload receipt images or PDFs.
2. Processing: The tool automatically extracts text and categorizes it.
3. Output: View categorized information and interact with a chatbot for detailed extraction.
4. Download: Obtain an Excel file with extracted details.
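The steps above could be wired together roughly as follows. This is a minimal sketch, not the tool's actual code: `process_receipt` and `process_batch` are hypothetical names, and the three callables stand in for the text extraction, categorization, and chatbot query stages, whose real signatures are not documented here.

```python
from typing import Callable, Dict, List


def process_receipt(
    path: str,
    extract_text: Callable[[str], str],
    categorize_text: Callable[[str], str],
    query_details: Callable[[str, str], Dict[str, object]],
) -> Dict[str, object]:
    """Run one receipt through extraction, categorization, and detail lookup."""
    text = extract_text(path)                # step 2a: raw text from the upload
    category = categorize_text(text)         # step 2b: type of bill
    details = query_details(category, text)  # step 3: category-specific fields
    # One flat record per receipt, ready to become a spreadsheet row (step 4).
    return {"file": path, "category": category, **details}


def process_batch(
    paths: List[str],
    extract_text: Callable[[str], str],
    categorize_text: Callable[[str], str],
    query_details: Callable[[str, str], Dict[str, object]],
) -> List[Dict[str, object]]:
    """Apply process_receipt to every uploaded file, preserving order."""
    return [
        process_receipt(p, extract_text, categorize_text, query_details)
        for p in paths
    ]


if __name__ == "__main__":
    # Stand-in stages with made-up data, used only to show the data flow.
    rows = process_batch(
        ["cab.png"],
        extract_text=lambda p: "Cab fare 250",
        categorize_text=lambda t: "travel_cab" if "cab" in t.lower() else "random",
        query_details=lambda c, t: {"raw_text": t},
    )
    print(rows)
```

Passing the stages in as callables keeps the sketch independent of any particular OCR library or language model client.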
Code
app.py
Libraries and Modules
- base64, os, re, io, pathlib, json: Standard Python libraries for various functionalities.
- gradio, pandas, PIL, openpyxl: External libraries used for the UI, data handling, image/PDF handling, and Excel output.
- categories, main: Custom modules handling categorization and the main processing logic.
Global Variables
HF_TOKEN
Functions
- display_file: Handles displaying uploaded files.
- show_intermediate_outputs: Controls visibility of intermediate outputs.
- show_share_contact: Manages visibility of sharing results and contact information.
- clear_inputs, clear_outputs: Functions for clearing inputs and outputs.
- extract_text, categorize_text: Functions for text extraction and categorization.
- query, parse: Functions for interacting with chatbots and parsing responses.
- activate_flags, deactivate_flags: Functions for flag activation/deactivation.
- flag_if_shared: Handles flagging if sharing is enabled.
- save_df_to_excel_with_autowidth, process_and_output_files: Functions for processing and outputting files.
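To illustrate what an auto-width Excel export like save_df_to_excel_with_autowidth might involve, here is a hedged sketch. It is not the function from app.py: `compute_column_widths` and `save_rows_to_excel` are hypothetical names, the sketch takes plain rows instead of a pandas DataFrame, and the padding and maximum width are arbitrary assumptions.

```python
from typing import List, Sequence


def compute_column_widths(
    headers: Sequence[object],
    rows: Sequence[Sequence[object]],
    padding: int = 2,
    max_width: int = 60,
) -> List[int]:
    """Return one width per column: longest cell text plus padding, capped.

    Assumes every row has the same number of cells as `headers`.
    """
    widths = [len(str(h)) for h in headers]
    for row in rows:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], len(str(cell)))
    return [min(w + padding, max_width) for w in widths]


def save_rows_to_excel(
    path: str,
    headers: Sequence[object],
    rows: Sequence[Sequence[object]],
) -> None:
    """Write headers and rows to an .xlsx file and size each column."""
    # Imported here so the width helper works without openpyxl installed.
    from openpyxl import Workbook
    from openpyxl.utils import get_column_letter

    workbook = Workbook()
    sheet = workbook.active
    sheet.append(list(headers))
    for row in rows:
        sheet.append(list(row))
    widths = compute_column_widths(headers, rows)
    for index, width in enumerate(widths, start=1):
        sheet.column_dimensions[get_column_letter(index)].width = width
    workbook.save(path)
```

Calling `save_rows_to_excel("out.xlsx", ["file", "amount"], [["receipt.pdf", 12.5]])` would write a two-column sheet. The width is measured in characters, so it is only an approximation for proportional fonts.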
UI Building Blocks
Gradio's Blocks, Markdown, HTML, File, Buttons, Textbox, Dropdown, Accordion, Chatbot, JSON, and other components are used to create the UI for the tool.
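A small layout built from a few of these components might look like the sketch below. It is an assumption-laden illustration, not the tool's UI: `describe_upload` and `build_demo` are hypothetical names, and the labels and layout are invented for the example.

```python
import os
from typing import Optional, Sequence


def describe_upload(files: Optional[Sequence[object]]) -> str:
    """List the base names of the uploaded files, one per line."""
    if not files:
        return "No files uploaded."
    # Depending on the Gradio version, each item is either a path string
    # or a temp-file object exposing the path as `.name`.
    names = [os.path.basename(getattr(f, "name", f)) for f in files]
    return "\n".join(names)


def build_demo():
    """Assemble a tiny Blocks app: upload files, click, see their names."""
    import gradio as gr  # imported here so the handler is testable alone

    with gr.Blocks() as demo:
        gr.Markdown("## Reimbursement tool (layout sketch)")
        files = gr.File(label="Receipts", file_count="multiple")
        run = gr.Button("Process")
        with gr.Accordion("Intermediate outputs", open=False):
            summary = gr.Textbox(label="Uploaded files")
        # Wire the button to the handler: File value in, Textbox value out.
        run.click(fn=describe_upload, inputs=files, outputs=summary)
    return demo


if __name__ == "__main__":
    build_demo().launch()
```

Keeping the handler free of Gradio calls makes it easy to check without starting the UI.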
Categories
The extracted text is assigned to one of five categories, depending on the type of bill: accomodation, travel_flight, travel_cab, vendor, or random. The attributes relevant to that category are then returned in JSON format.
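The idea of returning category-specific attributes as JSON can be sketched as follows. The category names come from the description above, but the field lists in `CATEGORY_FIELDS` are invented placeholders (the real ones are defined in the categories module), and `parse_response` is a hypothetical helper, not the tool's own parse function.

```python
import json
import re
from typing import Dict, List, Optional

# Placeholder attribute lists, assumed for illustration only.
CATEGORY_FIELDS: Dict[str, List[str]] = {
    "accomodation": ["hotel_name", "check_in", "check_out", "total_amount"],
    "travel_flight": ["airline", "travel_date", "total_amount"],
    "travel_cab": ["provider", "travel_date", "total_amount"],
    "vendor": ["vendor_name", "invoice_date", "total_amount"],
    "random": [],
}


def parse_response(reply: str, category: str) -> Dict[str, Optional[object]]:
    """Pull the JSON object out of a model reply and keep only known fields.

    Missing fields are filled with None so every record of a category
    has the same keys.
    """
    fields = CATEGORY_FIELDS.get(category, [])
    data: Dict[str, object] = {}
    # Models often wrap JSON in extra prose, so grab the outermost braces.
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is not None:
        try:
            data = json.loads(match.group(0))
        except json.JSONDecodeError:
            data = {}
    return {field: data.get(field) for field in fields}
```

For example, a reply such as `Sure! {"hotel_name": "Sea View", "total_amount": 120.5}` parsed for the accomodation category yields a record with `check_in` and `check_out` set to `None`.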
Global Variables
OPENAI_API_KEY
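Secrets such as OPENAI_API_KEY and HF_TOKEN are typically supplied through environment variables rather than hard-coded. The sketch below assumes that approach; `require_env` is a hypothetical helper and may not match how the tool actually loads its keys.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Set the {name} environment variable before starting the app."
        )
    return value
```

At startup the app could call `require_env("OPENAI_API_KEY")` once and pass the value to whichever client it uses, so a missing key is reported immediately instead of on the first request.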
Dependencies
Ensure you have the following dependencies installed:
- gradio
- pandas
- Pillow (imported as PIL)
- openpyxl
- Other dependencies specified in 'requirements.txt'