Nechba committed on
Commit 2012a23 · verified · 1 Parent(s): 72eb62e

Update utils.py

Files changed (1)
  1. utils.py +16 -13
utils.py CHANGED
@@ -43,27 +43,30 @@ def process_local_pdf(pdf_bytes: bytes):
  api_key: Your Google AI Studio API key
  """
  # Configure Gemini
- prompt ="""Please analyze the provided images of the real estate document set and perform the following actions:
+ prompt = """Please analyze the provided images of the real estate document set and perform the following actions:
 
- 1. Identify Parties: Determine and list all present parties involved in the transaction. Include Seller 1, Seller 2 (only if mentioned), Buyer 1, and Buyer 2 (only if mentioned). Omit any party that is not clearly identified in the documents.
+ 1. **Identify Parties**: Determine and list all present parties involved in the transaction. Always identify and include **Seller 1** and **Buyer 1** if they are present in the documents. Additionally, include **Seller 2** and **Buyer 2** only if they are explicitly mentioned.
 
- 2. Identify Missing Items: Locate and list all instances of missing signatures and missing initials for each identified party across all documents.
+ 2. **Identify Missing Items**: For each identified party, including at minimum **Seller 1** and **Buyer 1**, check all pages for any missing signatures or initials. Only check for **Seller 2** or **Buyer 2** if they were identified in step 1.
 
- 3. Identify Checked Boxes: Locate and list all checkboxes that have been marked or checked.
+ 3. **Identify Checked Boxes**: Locate and list all checkboxes that have been marked or checked.
 
- 4. Generate Secondary Questions: For checkboxes that indicate significant waivers (e.g., home warranty, inspection rights, lead paint assessment), specific conditions (e.g., cash sale, contingency status), potential conflicts, or reference other documents, formulate a relevant 'Secondary Question' designed to prompt confirmation or clarification from the user/parties involved.
+ 4. **Generate Secondary Questions**: For checkboxes that indicate significant waivers (e.g., home warranty, inspection rights, lead paint assessment), specific conditions (e.g., cash sale, contingency status), potential conflicts, or reference other documents, formulate a relevant 'Secondary Question' designed to prompt confirmation or clarification from the user/parties involved.
 
- 5. Check for Required Paperwork: Based only on the checkboxes identified in step 3 that explicitly state or strongly imply a specific addendum or disclosure document should be attached (e.g., "Lead Based Paint Disclosure Addendum attached", "See Counter Offer Addendum", "Seller's Disclosure...Addendum attached", "Retainer Addendum attached", etc.), check if a document matching that description appears to be present within the provided image set. Note whether this implied paperwork is 'Found', 'Missing', or 'Potentially Missing/Ambiguous'.
+ 5. **Check for Required Paperwork**: Based only on the checkboxes identified in step 3 that explicitly state or strongly imply a specific addendum or disclosure document should be attached (e.g., "Lead Based Paint Disclosure Addendum attached", "See Counter Offer Addendum", "Seller's Disclosure...Addendum attached", "Retainer Addendum attached", etc.), check if a document matching that description appears to be present within the provided image set. Note whether this implied paperwork is 'Found', 'Missing', or 'Potentially Missing/Ambiguous'.
 
- 6. Identify Conflicts: Specifically look for and note any directly contradictory information or conflicting checked boxes (like the conflicting inspection clauses found previously).
+ 6. **Identify Conflicts**: Specifically look for and note any directly contradictory information or conflicting checked boxes (like the conflicting inspection clauses found previously).
 
- 7. Provide Location: For every identified item (missing signature/initial, checked box, required paperwork status, party identification, conflict), specify the approximate line number(s) or clear location on the page (e.g., Bottom Right Initials, Seller Signature Block).
+ 7. **Provide Location**: For every identified item (missing signature/initial, checked box, required paperwork status, party identification, conflict), specify the approximate line number(s) or clear location on the page (e.g., Bottom Right Initials, Seller Signature Block).
 
- 8. Format Output: Present all findings comprehensively in CSV format. The CSV columns should be:
- * Category (e.g., Parties, Missing Item, Checked Box, Required Paperwork, Conflict)
- * Image number (just make this number {})
- * Item Type (e.g., Seller Initials, Home Warranty Waiver, Lead Paint Addendum Check, Lead Paint Addendum Document)
- * Status (e.g., Identified, Missing, Checked, Found, Potentially Missing, Conflict)
+ 8. **Format Output**: Present all findings in CSV format with the following columns:
+ - **Category**: (e.g., Parties, Missing Item, Checked Box, Required Paperwork, Conflict)
+ - **Location**: (e.g., Sale Contract (Image 8 Pg 1))
+ - **Line Item(s)**: (e.g., 4)
+ - **Item Type**: (e.g., Seller 1, Buyer 1, Seller Signature, Seller Initials)
+ - **Status**: (e.g., Identified, Missing, Checked, Found, Potentially Missing, Conflict)
+ - **Details**: (e.g., "Seller signature line (top line) is empty.", "Two initial boxes for Seller (approx line 106-107 area) are empty.")
+ - **Secondary Question** (if applicable): (e.g., "Is the Buyer aware they are waiving the home warranty?", "Has the Buyer received and reviewed the Seller's Disclosure?")
  """
 
  # Convert to images
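
Step 8 of the updated prompt asks for CSV with seven named columns, so whatever consumes the reply will likely want to validate that header before using the rows. The following is a minimal sketch of such a check, not code from utils.py: the helper name `parse_findings`, the strict header comparison, and the handling of a Markdown fence around the reply are all assumptions made for illustration. A real reply may label the last column differently (for example with "(if applicable)" appended), in which case the comparison would need loosening.

```python
import csv
import io

# Column names copied from step 8 of the updated prompt.
EXPECTED_COLUMNS = [
    "Category",
    "Location",
    "Line Item(s)",
    "Item Type",
    "Status",
    "Details",
    "Secondary Question",
]

# Assumption: the model may wrap its CSV in a Markdown code fence.
FENCE = "`" * 3


def parse_findings(csv_text: str) -> list[dict[str, str]]:
    """Parse a CSV reply into one dict per finding (hypothetical helper)."""
    # Drop fence lines, then let the csv module handle quoting and commas.
    lines = [ln for ln in csv_text.strip().splitlines() if not ln.startswith(FENCE)]
    reader = csv.reader(io.StringIO("\n".join(lines)))
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    if not rows:
        return []

    header = [cell.strip() for cell in rows[0]]
    if header != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected header: {header}")

    findings = []
    for row in rows[1:]:
        # "Secondary Question" is optional in the prompt, so pad short rows.
        padded = row + [""] * (len(EXPECTED_COLUMNS) - len(row))
        cells = (cell.strip() for cell in padded)
        findings.append(dict(zip(EXPECTED_COLUMNS, cells)))
    return findings


if __name__ == "__main__":
    # Illustrative reply only; the values echo the examples in the prompt.
    sample = (
        "Category,Location,Line Item(s),Item Type,Status,Details,Secondary Question\n"
        'Missing Item,Sale Contract (Image 8 Pg 1),4,Seller Signature,Missing,'
        '"Seller signature line (top line) is empty.",\n'
    )
    for finding in parse_findings(sample):
        print(finding["Item Type"], "->", finding["Status"])
```

Raising on a header mismatch is a deliberate choice in this sketch: the column set changed in this commit (the old `Image number` column, with its `{}` placeholder, is gone), so a silent fallback could hide replies that still follow the old layout.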