metadata
title: Automatic Reimbursement Tool Demo
emoji: 
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: 3.39.0
app_file: app.py
pinned: false
python_version: 3.9.17

Automatic Reimbursement Tool

The aim of this tool is to automate the information extraction process involved in reimbursement filing. It leverages language models like ChatGPT for categorization and extraction of relevant details from uploaded receipts.

Features

  • Text Extraction - The tool extracts text from uploaded receipts.
  • Categorization - It categorizes the extracted text to identify relevant details.
  • Chatbot Interaction - Interacts with a language model to extract specific information based on the category.
  • Excel Output - Generates an Excel file containing extracted details.

Usage

1. Input: Upload receipt images or PDFs.
2. Processing: The tool automatically extracts text and categorizes it.
3. Output: View categorized information and interact with a chatbot for detailed extraction.
4. Download: Obtain an Excel file with extracted details.

Code

app.py

  • Libraries and Modules

      base64, os, re, io, pathlib, json: Python standard-library modules for encoding, file paths, regular expressions, in-memory streams, and JSON handling.
      gradio, pandas, PIL, openpyxl: Third-party libraries used for the UI, data handling, image/PDF processing, and Excel output.
      categories, main: Custom modules handling categorization and main processing logic.
    
  • Global Variables

      HF_TOKEN
    
  • Functions

      display_file: Handles displaying uploaded files.
      show_intermediate_outputs: Controls visibility of intermediate outputs.
      show_share_contact: Manages visibility of sharing results and contact information.
      clear_inputs, clear_outputs: Functions for clearing inputs and outputs.
      extract_text, categorize_text: Functions for text extraction and categorization.
      query, parse: Functions for interacting with chatbots and parsing responses.
      activate_flags, deactivate_flags: Functions for flag activation/deactivation.
      flag_if_shared: Handles flagging if sharing is enabled.
      save_df_to_excel_with_autowidth, process_and_output_files: Functions for processing and outputting files.
    
  • UI Building Blocks

      Gradio's Blocks, Markdown, HTML, File, Buttons, Textbox, Dropdown, Accordion, Chatbot, JSON, and other components are used to create the UI for the tool.
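
As an illustration of the Excel output step, here is a minimal sketch of how a function like save_df_to_excel_with_autowidth might size columns to their content. It is not taken from app.py: the names compute_column_widths and save_with_autowidth, the padding, and the width cap are all assumptions made for this example.

```python
from typing import List, Sequence


def compute_column_widths(
    headers: Sequence[object],
    rows: Sequence[Sequence[object]],
    padding: int = 2,
    max_width: int = 60,
) -> List[int]:
    """Return one width (in characters) per column, in column order."""
    widths: List[int] = []
    for index, header in enumerate(headers):
        # Longest cell in this column, counting the header itself.
        longest = len(str(header))
        for row in rows:
            longest = max(longest, len(str(row[index])))
        # Add padding, but cap the width so a single long value
        # does not stretch the whole column.
        widths.append(min(longest + padding, max_width))
    return widths


def save_with_autowidth(df, path: str, sheet_name: str = "Sheet1") -> None:
    """Write a pandas DataFrame to .xlsx and size each column to its content."""
    # Imported here so compute_column_widths stays usable without these packages.
    import pandas as pd
    from openpyxl.utils import get_column_letter

    widths = compute_column_widths(list(df.columns), df.values.tolist())
    with pd.ExcelWriter(path, engine="openpyxl") as writer:
        df.to_excel(writer, index=False, sheet_name=sheet_name)
        worksheet = writer.sheets[sheet_name]
        for position, width in enumerate(widths, start=1):
            # openpyxl addresses columns by letter ("A", "B", ...).
            worksheet.column_dimensions[get_column_letter(position)].width = width
```

Keeping the width calculation separate from the file writing makes it easy to check on its own, for example with compute_column_widths(["Vendor", "Amount"], [["Acme Travel Ltd", 120.5]]).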
    

Categories

The extracted text is categorized as accomodation, travel_flight, travel_cab, vendor, or random, depending on the type of bill. The attributes relevant to that category are then returned in JSON format.
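
The categorize-then-parse flow can be sketched roughly as follows. This is a hedged example and not the module's actual code: the helper names normalize_category and parse_json_reply are hypothetical, and the fallback to random for unknown labels is an assumption. Only the five category labels come from this README.

```python
import json
from typing import Any, Dict

# Category labels as listed in this README.
CATEGORIES = ("accomodation", "travel_flight", "travel_cab", "vendor", "random")


def normalize_category(label: str) -> str:
    """Map a raw model label onto a known category, defaulting to 'random'."""
    cleaned = label.strip().lower()
    return cleaned if cleaned in CATEGORIES else "random"


def parse_json_reply(reply: str) -> Dict[str, Any]:
    """Extract the first JSON object embedded in a model reply.

    Returns an empty dict when no valid JSON object is found.
    """
    decoder = json.JSONDecoder()
    start = reply.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores any trailing text after it.
            value, _ = decoder.raw_decode(reply, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        # Not valid JSON at this brace; try the next one.
        start = reply.find("{", start + 1)
    return {}
```

Language models often wrap JSON in extra prose, so scanning for the first decodable object tends to be more forgiving than calling json.loads on the whole reply.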

  • Global Variables

      OPENAI_API_KEY
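
This README does not say how OPENAI_API_KEY and HF_TOKEN are supplied. A common pattern, assumed here, is to read them from environment variables, which is also how Hugging Face Spaces exposes configured secrets. The helper name require_env is hypothetical.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

With this helper, a module could set OPENAI_API_KEY = require_env("OPENAI_API_KEY") at import time, so a missing key fails early instead of at the first API call.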
    

Dependencies

Ensure you have the following dependencies installed:

  • gradio
  • pandas
  • Pillow (imported as PIL)
  • openpyxl
  • Other dependencies specified in 'requirements.txt'