# Automatic Reimbursement Tool

This tool automates the information extraction involved in filing reimbursements. It uses language models like ChatGPT to categorize uploaded receipts and extract the relevant details from them.

## Features

- Text Extraction - Extracts text from uploaded receipts.
- Categorization - Categorizes the extracted text to identify relevant details.
- Chatbot Interaction - Queries a language model to extract specific information based on the category.
- Excel Output - Generates an Excel file containing the extracted details.

## Usage

1. Input: Upload receipt images or PDFs.
2. Processing: The tool automatically extracts text and categorizes it.
3. Output: View the categorized information and interact with a chatbot for detailed extraction.
4. Download: Obtain an Excel file with the extracted details.

## Code

### App.py

- Libraries and Modules
  - base64, os, re, io, pathlib, json: Standard Python libraries for encoding, file-system access, regular expressions, in-memory streams, and JSON handling.
  - gradio, pandas, PIL, openpyxl: External libraries used for the UI, data handling, and image/PDF processing.
  - categories, main: Custom modules handling categorization and the main processing logic.
- Global Variables
  - HF_TOKEN
- Functions
  - display_file: Displays uploaded files.
  - show_intermediate_outputs: Controls the visibility of intermediate outputs.
  - show_share_contact: Manages the visibility of the result-sharing and contact information sections.
  - clear_inputs, clear_outputs: Clear inputs and outputs.
  - extract_text, categorize_text: Extract text from a receipt and categorize it.
  - query, parse: Send queries to the chatbot and parse its responses.
  - activate_flags, deactivate_flags: Activate and deactivate flags.
  - flag_if_shared: Flags the result if sharing is enabled.
  - save_df_to_excel_with_autowidth, process_and_output_files: Process files and write the output files.
- UI Building Blocks
  - Gradio's Blocks, Markdown, HTML, File, Buttons, Textbox, Dropdown, Accordion, Chatbot, JSON, and other components are used to build the tool's UI.

### Categories

The extracted text is classified as accomodation, travel_flight, travel_cab, vendor, or random, depending on the type of bill. The attributes relevant to that category of bill are then returned in JSON format.

* Global Variables
  * OPENAI_API_KEY

## Dependencies

Ensure you have the following dependencies installed:

* gradio
* pandas
* PIL (installed as the Pillow package)
* openpyxl
* Other dependencies specified in 'requirements.txt'
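
## Example: Parsing a Model Response

The Categories section says that the attributes for a bill are returned in JSON format, and App.py lists a parse function for handling chatbot responses. The sketch below illustrates one way such a step could work, since a language model often wraps its JSON in extra conversational text. It is not the tool's actual implementation: the function name parse_model_reply and the field names vendor and amount are hypothetical and chosen only for illustration. It uses only the Python standard library.

```python
import json
import re


def parse_model_reply(reply: str) -> dict:
    """Return the first JSON object embedded in a model reply, or {} if none parses.

    Hypothetical helper for illustration; not the tool's own parse function.
    """
    decoder = json.JSONDecoder()
    # Treat every "{" as a possible start of a JSON object and try to decode from there.
    for match in re.finditer(r"\{", reply):
        try:
            obj, _end = decoder.raw_decode(reply, match.start())
        except json.JSONDecodeError:
            continue  # Not valid JSON at this position; try the next brace.
        if isinstance(obj, dict):
            return obj
    return {}


# Example reply with assumed field names (not taken from the tool).
reply = (
    'Sure! Here are the details: {"vendor": "Acme Cabs", "amount": 412.5} '
    "Let me know if you need anything else."
)
details = parse_model_reply(reply)
print(details)
```

Returning an empty dictionary when nothing parses lets the caller decide whether to retry the query or flag the receipt for manual review, rather than failing on a malformed reply.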