Sompote committed · verified · Commit 2c200f8 · 1 Parent(s): 8a11eb7

Upload 17 files
Dockerfile CHANGED
@@ -1,21 +1,31 @@
- FROM python:3.9-slim
-
- WORKDIR /app

  RUN apt-get update && apt-get install -y \
-     build-essential \
-     curl \
-     software-properties-common \
-     git \
      && rm -rf /var/lib/apt/lists/*

- COPY requirements.txt ./
- COPY src/ ./src/
-
- RUN pip3 install -r requirements.txt
-
- EXPOSE 8501
-
- HEALTHCHECK CMD curl --fail http://localhost:8501/_stcore/health
-
- ENTRYPOINT ["streamlit", "run", "src/streamlit_app.py", "--server.port=8501", "--server.address=0.0.0.0"]

+ FROM python:3.11-slim
+
+ # Install system dependencies (curl is kept for the HEALTHCHECK below)
  RUN apt-get update && apt-get install -y \
+     poppler-utils \
+     tesseract-ocr \
+     libgl1-mesa-glx \
+     libglib2.0-0 \
+     curl \
      && rm -rf /var/lib/apt/lists/*
+
+ # Set working directory
+ WORKDIR /app
+
+ # Copy requirements first for better caching
+ COPY requirements_hf.txt .
+ RUN pip install --no-cache-dir -r requirements_hf.txt
+
+ # Copy application files
+ COPY . .
+
+ # Create environment template
+ RUN if [ ! -f .env ]; then cp .env_template .env; fi
+
+ # Expose Streamlit port
+ EXPOSE 7860
+
+ # Health check
+ HEALTHCHECK CMD curl --fail http://localhost:7860/_stcore/health
+
+ # Run the application
+ CMD ["streamlit", "run", "app_hf.py", "--server.port=7860", "--server.address=0.0.0.0"]
README.md CHANGED
@@ -1,19 +1,98 @@
- ---
- title: Soil Profile
- emoji: 🚀
- colorFrom: red
- colorTo: red
- sdk: docker
- app_port: 8501
- tags:
-   - streamlit
- pinned: false
- short_description: soil profile analysis
- ---
-
- # Welcome to Streamlit!
-
- Edit `/src/streamlit_app.py` to customize this app to your heart's desire. :heart:
-
- If you have any questions, check out our [documentation](https://docs.streamlit.io) and [community
- forums](https://discuss.streamlit.io).

+ # 🏗️ Soil Boring Log Analyzer
+
+ An AI-powered application for analyzing soil boring logs using multiple LLM providers. Upload PDF or image files of soil boring logs to automatically extract and analyze soil layers with professional geotechnical insights.
+
+ ## ✨ Features
+
+ - **Multi-LLM Support**: Choose from OpenRouter, Anthropic Claude, or Google Gemini
+ - **Document Processing**: Upload PDF or image files of soil boring logs
+ - **AI Analysis**: Three analysis methods (CrewAI, LangGraph, Unified Workflow)
+ - **Soil Classification**: Automatic soil type classification with strength parameters
+ - **Interactive Visualizations**: Professional soil profile charts with units
+ - **Layer Processing**: Smart layer merging and splitting capabilities
+
+ ## 🚀 Quick Start
+
+ 1. **Configure LLM Provider**: Add your API key for at least one provider (a sample `.env` is shown after this list):
+    - **OpenRouter**: Get key from [openrouter.ai/keys](https://openrouter.ai/keys)
+    - **Anthropic**: Get key from [console.anthropic.com](https://console.anthropic.com/)
+    - **Google AI Studio**: Get key from [aistudio.google.com/app/apikey](https://aistudio.google.com/app/apikey)
+
+ 2. **Upload Document**: Choose a soil boring log file (PDF, PNG, JPG)
+
+ 3. **Select Analysis Method**:
+    - **CrewAI**: Two-agent system with quality control
+    - **LangGraph**: Single agent workflow
+    - **Unified Workflow**: Streamlined processing
+
+ 4. **Configure Options**: Set layer merging and splitting preferences
+
+ 5. **Analyze**: Get detailed soil analysis with interactive visualizations
+
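+ A minimal `.env` (the app copies `.env_template` to `.env` on first run; the values below are placeholders, set whichever providers you use):
+
+ ```
+ OPENROUTER_API_KEY=your-openrouter-key
+ ANTHROPIC_API_KEY=your-anthropic-key
+ GOOGLE_API_KEY=your-google-key
+ ```
+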
+ ## 🔧 Supported File Formats
+
+ - **PDF**: Soil boring log reports
+ - **Images**: PNG, JPG, JPEG of soil boring logs
+
+ ## 🤖 LLM Providers
+
+ ### OpenRouter
+ - Access to multiple models through one API
+ - Recommended: Claude-4.0 Sonnet, GPT-4 Turbo
+
+ ### Anthropic Direct
+ - Direct access to Claude models
+ - Excellent for technical analysis
+
+ ### Google AI Studio
+ - Direct access to Gemini models
+ - Advanced multimodal capabilities
+
+ ## 📊 Analysis Methods
+
+ ### CrewAI (Recommended)
+ - **Soil Expert Agent**: Specializes in soil classification
+ - **Geotechnical Agent**: Focuses on engineering parameters
+ - **Quality Control**: Two-agent validation system
+
+ ### LangGraph
+ - Single agent with structured workflow
+ - Good for straightforward analyses
+
+ ### Unified Workflow
+ - Streamlined processing pipeline
+ - Fast analysis with comprehensive validation
+
+ ## 🔬 Technical Features
+
+ - **Su Detection**: Comprehensive undrained shear strength value extraction
+ - **Layer Optimization**: Smart merging of similar layers
+ - **Thick Layer Splitting**: Automatic subdivision of thick layers
+ - **Unit Conversions**: Proper handling of different measurement units (worked example below)
+ - **Professional Charts**: Publication-ready soil profile visualizations
+
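+ For example, an Su reading of 3.0 t/m² is stored as 3.0 × 9.81 = 29.43 kPa; the conversion factor is 9.81, not 10.
+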
+ ## 🎯 Use Cases
+
+ - **Geotechnical Engineering**: Foundation design analysis
+ - **Site Investigation**: Soil characterization studies
+ - **Construction Planning**: Subsurface condition assessment
+ - **Academic Research**: Soil mechanics studies
+ - **Consulting**: Client report generation
+
+ ## 🔒 Privacy & Security
+
+ - **No Data Storage**: Documents are processed in memory only
+ - **API Key Security**: Keys are handled securely in environment variables
+ - **Local Processing**: All analysis happens in your session
+
+ ## 🛠️ Technical Stack
+
+ - **Frontend**: Streamlit
+ - **AI Framework**: LangGraph, CrewAI
+ - **LLM Integration**: OpenAI API, Anthropic API, Google AI
+ - **Visualization**: Plotly, Matplotlib
+ - **Document Processing**: PyPDF2, PIL
+
+ ## 📝 License
+
+ This project is developed for geotechnical engineering applications with AI-powered analysis capabilities.
app.py ADDED
@@ -0,0 +1,62 @@
+ #!/usr/bin/env python3
+ """
+ Soil Boring Log Analyzer - Hugging Face Spaces Version
+ Optimized for deployment on Hugging Face Spaces with Streamlit
+ """
+
+ import streamlit as st
+ import os
+ import shutil
+ from pathlib import Path
+
+ # Hugging Face Spaces Setup
+ def setup_hf_environment():
+     """Set up environment for Hugging Face Spaces"""
+     # Create .env file from template if it doesn't exist
+     if not os.path.exists('.env') and os.path.exists('.env_template'):
+         shutil.copy('.env_template', '.env')
+         st.info("🔧 Environment template created. Please configure your API keys in the sidebar.")
+
+ # Initialize HF environment
+ setup_hf_environment()
+
+ # Import main app after environment setup
+ from app import main
+
+ # Hugging Face Spaces Configuration
+ st.set_page_config(
+     page_title="🏗️ Soil Boring Log Analyzer",
+     page_icon="🏗️",
+     layout="wide",
+     initial_sidebar_state="expanded",
+     menu_items={
+         'Get Help': 'https://huggingface.co/spaces/your-username/soil-boring-analyzer',
+         'Report a bug': 'https://huggingface.co/spaces/your-username/soil-boring-analyzer/discussions',
+         'About': """
+         # 🏗️ Soil Boring Log Analyzer
+
+         An AI-powered application for analyzing soil boring logs using multiple LLM providers.
+
+         **Features:**
+         - Multi-LLM Support (OpenRouter, Anthropic, Google)
+         - PDF/Image document processing
+         - Professional soil analysis
+         - Interactive visualizations
+
+         **Powered by:** Streamlit, LangGraph, CrewAI
+         """
+     }
+ )
+
+ # Add Hugging Face Spaces header
+ if __name__ == "__main__":
+     with st.container():
+         st.markdown("""
+         <div style='text-align: center; padding: 1rem; background: linear-gradient(90deg, #ff6b6b, #4ecdc4); color: white; border-radius: 10px; margin-bottom: 1rem;'>
+             <h2>🏗️ Soil Boring Log Analyzer</h2>
+             <p>AI-Powered Geotechnical Analysis | Powered by Multiple LLM Providers</p>
+         </div>
+         """, unsafe_allow_html=True)
+
+     # Run main application
+     main()
config.py ADDED
@@ -0,0 +1,216 @@
+ import os
+ from dotenv import load_dotenv
+
+ load_dotenv()
+
+ # LLM Provider Configuration
+ LLM_PROVIDERS = {
+     "openrouter": {
+         "name": "OpenRouter",
+         "base_url": "https://openrouter.ai/api/v1",
+         "api_key_env": "OPENROUTER_API_KEY",
+         "description": "Access to multiple models through OpenRouter",
+         "supports_all_models": True
+     },
+     "anthropic": {
+         "name": "Anthropic Direct",
+         "base_url": "https://api.anthropic.com",
+         "api_key_env": "ANTHROPIC_API_KEY",
+         "description": "Direct access to Claude models",
+         "supports_all_models": False,
+         "supported_models": ["anthropic/claude-sonnet-4", "anthropic/claude-3.5-sonnet-20241022", "anthropic/claude-3-sonnet-20240229", "anthropic/claude-3-haiku-20240307", "anthropic/claude-3-opus-20240229"]
+     },
+     "google": {
+         "name": "Google AI Studio",
+         "base_url": "https://generativelanguage.googleapis.com",
+         "api_key_env": "GOOGLE_API_KEY",
+         "description": "Direct access to Gemini models",
+         "supports_all_models": False,
+         "supported_models": ["google/gemini-2.5-pro-preview-05-06", "google/gemini-pro-vision"]
+     }
+ }
+
+ # Default provider and model (can be None if no API key is set)
+ DEFAULT_PROVIDER = None
+ DEFAULT_MODEL = None
+
+ def get_api_key(provider):
+     """Get API key for specified provider"""
+     return os.getenv(LLM_PROVIDERS[provider]["api_key_env"])
+
+ def get_available_providers():
+     """Get list of providers with valid API keys"""
+     available = []
+     for provider_id, provider_info in LLM_PROVIDERS.items():
+         if get_api_key(provider_id):
+             available.append(provider_id)
+     return available
+
+ def get_models_for_provider(provider_id):
+     """Get available models for a specific provider"""
+     available_models = {}
+     for model_id, model_info in AVAILABLE_MODELS.items():
+         if provider_id in model_info.get("providers", []):
+             available_models[model_id] = model_info
+     return available_models
+
+ def get_default_provider_and_model():
+     """Get default provider and model based on available API keys"""
+     try:
+         available_providers = get_available_providers()
+
+         if not available_providers:
+             return None, None
+
+         # Prefer providers in order: anthropic, openrouter, google
+         preferred_order = ["anthropic", "openrouter", "google"]
+
+         selected_provider = None
+         for provider in preferred_order:
+             if provider in available_providers:
+                 selected_provider = provider
+                 break
+
+         if not selected_provider:
+             selected_provider = available_providers[0]
+
+         # Get a recommended model for this provider
+         available_models = get_models_for_provider(selected_provider)
+         recommended_models = {k: v for k, v in available_models.items() if v.get("recommended", False)}
+
+         if recommended_models:
+             selected_model = list(recommended_models.keys())[0]
+         elif available_models:
+             selected_model = list(available_models.keys())[0]
+         else:
+             selected_model = None
+
+         return selected_provider, selected_model
+     except Exception:
+         # If anything fails, return None values
+         return None, None
+
+ # Available models for soil analysis (recommended for structured outputs)
+ AVAILABLE_MODELS = {
+     # Claude Models (Excellent for technical analysis)
+     "anthropic/claude-sonnet-4": {
+         "name": "Claude-4.0 Sonnet",
+         "description": "Latest Claude model with superior reasoning and technical analysis",
+         "cost": "Medium",
+         "recommended": True,
+         "supports_images": True,
+         "providers": ["openrouter", "anthropic"]
+     },
+     "anthropic/claude-3.5-sonnet-20241022": {
+         "name": "Claude-3.5 Sonnet",
+         "description": "Previous Claude model, excellent reasoning and technical analysis",
+         "cost": "Medium",
+         "recommended": True,
+         "supports_images": True,
+         "providers": ["openrouter", "anthropic"]
+     },
+     "anthropic/claude-3-sonnet-20240229": {
+         "name": "Claude-3 Sonnet (Legacy)",
+         "description": "Previous version, balanced performance",
+         "cost": "Medium",
+         "recommended": False,
+         "supports_images": True,
+         "providers": ["openrouter", "anthropic"]
+     },
+     "anthropic/claude-3-haiku-20240307": {
+         "name": "Claude-3 Haiku",
+         "description": "Faster and cheaper, good for basic analysis",
+         "cost": "Low",
+         "recommended": False,
+         "supports_images": True,
+         "providers": ["openrouter", "anthropic"]
+     },
+     "anthropic/claude-3-opus-20240229": {
+         "name": "Claude-3 Opus",
+         "description": "Most capable legacy model, best for complex analysis",
+         "cost": "High",
+         "recommended": True,
+         "supports_images": True,
+         "providers": ["openrouter", "anthropic"]
+     },
+
+     # GPT Models (Good structured output)
+     "openai/gpt-4-turbo": {
+         "name": "GPT-4 Turbo",
+         "description": "Fast and capable, good JSON output",
+         "cost": "Medium",
+         "recommended": True,
+         "supports_images": True,
+         "providers": ["openrouter"]
+     },
+     "openai/gpt-3.5-turbo": {
+         "name": "GPT-3.5 Turbo",
+         "description": "Fast and cheap, basic analysis",
+         "cost": "Low",
+         "recommended": False,
+         "supports_images": False,
+         "providers": ["openrouter"]
+     },
+
+     # Specialized Models
+     "meta-llama/llama-3.1-70b-instruct": {
+         "name": "Llama-3.1 70B",
+         "description": "Open source, good performance",
+         "cost": "Low",
+         "recommended": False,
+         "supports_images": False,
+         "providers": ["openrouter"]
+     },
+     "mistralai/mixtral-8x7b-instruct": {
+         "name": "Mixtral 8x7B",
+         "description": "Good multilingual support",
+         "cost": "Low",
+         "recommended": False,
+         "supports_images": False,
+         "providers": ["openrouter"]
+     },
+
+     # xAI Models
+     "x-ai/grok-3-beta": {
+         "name": "xAI Grok 3",
+         "description": "Latest xAI model with advanced reasoning capabilities (text-only)",
+         "cost": "Medium",
+         "recommended": True,
+         "supports_images": False,
+         "providers": ["openrouter"]
+     },
+
+     # Google Models
+     "google/gemini-2.5-pro-preview-05-06": {
+         "name": "Gemini 2.5 Pro Preview",
+         "description": "Latest Google Gemini model with advanced multimodal capabilities",
+         "cost": "Medium",
+         "recommended": True,
+         "supports_images": True,
+         "providers": ["openrouter", "google"]
+     },
+     "google/gemini-pro-vision": {
+         "name": "Gemini Pro Vision",
+         "description": "Google's multimodal model optimized for vision tasks",
+         "cost": "Medium",
+         "recommended": False,
+         "supports_images": True,
+         "providers": ["openrouter", "google"]
+     }
+ }
+
+ SOIL_TYPES = {
+     "clay": ["soft clay", "medium clay", "stiff clay", "very stiff clay", "hard clay"],
+     "sand": ["loose sand", "medium dense sand", "dense sand", "very dense sand"],
+     "silt": ["soft silt", "medium silt", "stiff silt"],
+     "gravel": ["loose gravel", "dense gravel"],
+     "rock": ["weathered rock", "soft rock", "hard rock"]
+ }
+
+ STRENGTH_PARAMETERS = {
+     "clay": "Su (kPa)",
+     "sand": "SPT-N",
+     "silt": "SPT-N",
+     "gravel": "SPT-N",
+     "rock": "UCS (MPa)"
+ }
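+
+ # Illustrative usage only (not part of the original file; the values shown are
+ # hypothetical and depend on which API keys are set in the environment):
+ #
+ #   >>> get_available_providers()
+ #   ['openrouter']
+ #   >>> get_default_provider_and_model()
+ #   ('openrouter', 'anthropic/claude-sonnet-4')
+ #   >>> sorted(get_models_for_provider('google'))
+ #   ['google/gemini-2.5-pro-preview-05-06', 'google/gemini-pro-vision']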
crewai_agents.py ADDED
@@ -0,0 +1,553 @@
+ from crewai import Agent, Task, Crew, Process
+ from typing import Dict, Any, List
+ import json
+ import os
+ from llm_client import LLMClient
+ from soil_analyzer import SoilLayerAnalyzer
+ from pydantic import BaseModel, Field
+ from config import LLM_PROVIDERS, get_default_provider_and_model
+
+ class CrewAIGeotechSystem:
+     def __init__(self, model=None, api_key=None):
+         # Handle API key - if explicitly passed as empty string, use that (for mock mode)
+         if api_key == "":
+             self.api_key = ""
+         else:
+             self.api_key = api_key or ""
+
+         _, default_model = get_default_provider_and_model()
+         self.model = model or default_model
+
+         # Check if we have a valid API key
+         self.has_api_key = bool(self.api_key and self.api_key.strip())
+
+         if self.has_api_key:
+             # Initialize our working LLMClient for actual LLM calls
+             self.llm_client = LLMClient(model=self.model, api_key=self.api_key)
+
+             # We'll use direct LLM calls instead of CrewAI agents due to compatibility issues
+             self.geotech_agent = "Geotechnical Engineer (Direct LLM)"
+             self.senior_geotech_agent = "Senior Geotechnical Engineer (Direct LLM)"
+         else:
+             # No API key available - set to None to trigger mock mode
+             self.llm_client = None
+             self.geotech_agent = None
+             self.senior_geotech_agent = None
+
+     def _run_geotech_analysis_direct(self, soil_data: Dict[str, Any]) -> str:
+         """Run geotechnical analysis using direct LLM calls"""
+
+         analysis_prompt = f"""
+         As an experienced geotechnical engineer, analyze the following soil boring log data and provide a comprehensive geotechnical assessment:
+
+         DATA:
+         {json.dumps(soil_data, indent=2)}
+
+         CRITICAL UNIT CONVERSION REQUIREMENTS:
+         1. **Su (Undrained Shear Strength) Unit Conversions:**
+            - t/m² → kPa: MULTIPLY BY 9.81 (NOT 10!)
+            - ksc (kg/cm²) → kPa: multiply by 98.0
+            - psi → kPa: multiply by 6.895
+            - MPa → kPa: multiply by 1000
+            - tsf (tons/ft²) → kPa: multiply by 95.76
+
+         2. **Common Unit Errors to Check:**
+            - If Su values seem unusually high (>500 kPa for soft clay), check if units are incorrectly converted
+            - If Su values seem unusually low (<10 kPa for stiff clay), check if conversion factor was applied
+            - t/m² is commonly misconverted using factor of 10 instead of 9.81 - BE CAREFUL!
+
+         CRITICAL LAYER ANALYSIS REQUIREMENTS:
+         3. **Su Value Consistency Within Layers:**
+            - **EXAMINE each layer for Su value variations**
+            - If multiple Su values exist within a single layer, check for consistency
+            - **LAYER SPLITTING CRITERIA:**
+              * If Su values within a layer vary by >30%, consider splitting the layer
+              * If Su values show clear trend (increasing/decreasing), split at transition points
+              * If one Su value is >2x another Su value in same layer, MUST split the layer
+
+         4. **Layer Splitting Protocol:**
+            - Identify depth ranges where Su values are similar (within 20-30% variation)
+            - Create new layer boundaries at points where Su values change significantly
+            - Each new sub-layer should have consistent Su values (average or representative value)
+            - Update soil descriptions to reflect the new layer characteristics
+
+         5. **Su Value Assignment for Layers:**
+            - If Su values are consistent within layer: use average value
+            - If Su values vary significantly: split layer and assign representative values
+            - Document the original Su readings and how they were processed
+
+         Your responsibilities:
+         1. **CAREFULLY validate ALL unit conversions** - pay special attention to Su values
+         2. Check if any Su values are in t/m² and need conversion to kPa using factor 9.81
+         3. **ANALYZE Su value consistency within each layer**
+         4. **SPLIT layers where Su values vary significantly (>30% variation or >2x difference)**
+         5. Validate all geotechnical parameters for consistency and reasonableness
+         6. Check layer classifications and transitions
+         7. Verify strength parameter correlations (Su vs water content, SPT vs consistency)
+         8. Calculate layer statistics and identify any outliers
+         9. Ensure ALL unit conversions are correct (kPa, degrees, etc.)
+         10. Check depth continuity and layer boundaries
+
+         Focus on:
+         - **UNIT CONVERSION ACCURACY** (especially Su values from t/m² to kPa)
+         - **Su VALUE CONSISTENCY within layers** - split if values vary significantly
+         - Parameter consistency (e.g., soft clay should have low Su, high water content)
+         - Reasonable strength ranges for each soil type
+         - Proper calculation methodology
+         - Layer transition logic
+
+         LAYER SPLITTING DECISION TREE:
+         ✓ Check if multiple Su values exist in each layer
+         ✓ Calculate variation between Su values (max-min)/average
+         ✓ If variation >30% OR max/min ratio >2.0: SPLIT the layer
+         ✓ Create new layers with consistent Su values
+         ✓ Assign average Su value to each new sub-layer
+         ✓ Update layer descriptions and boundaries
+
+         UNIT CONVERSION VERIFICATION CHECKLIST:
+         ✓ Check if any Su values are in t/m² - if yes, multiply by 9.81 to get kPa
+         ✓ Verify Su values are reasonable for soil consistency (soft clay: 10-30 kPa, stiff clay: 50-100 kPa)
+         ✓ Check SPT N-values are reasonable for soil description
+         ✓ Ensure depth units are in meters
+
+         Provide detailed analysis with any concerns noted, especially unit conversion issues and layer splitting recommendations.
+         """
+
+         try:
+             # Use our working LLMClient
+             response = self.llm_client.client.chat.completions.create(
+                 model=self.model,
+                 messages=[
+                     {"role": "system", "content": "You are an experienced geotechnical engineer with expertise in soil mechanics and foundation design. You are particularly careful about unit conversions and have seen many errors caused by incorrect unit conversion factors."},
+                     {"role": "user", "content": analysis_prompt}
+                 ],
+                 temperature=0.1,
+                 max_tokens=2000
+             )
+
+             return response.choices[0].message.content
+
+         except Exception as e:
+             return f"Analysis failed: {str(e)}"
+
+     def _run_senior_review_direct(self, analysis_result: str) -> str:
+         """Run senior engineer review using direct LLM calls"""
+
+         review_prompt = f"""
+         As a senior geotechnical engineer with 20+ years of experience, review the following geotechnical analysis for consistency, accuracy, and engineering reasonableness:
+
+         ANALYSIS TO REVIEW:
+         {analysis_result}
+
+         CRITICAL REVIEW FOCUS - UNIT CONVERSIONS:
+         Your PRIMARY responsibility is to catch unit conversion errors that can lead to catastrophic design failures.
+
+         1. **Su (Undrained Shear Strength) CRITICAL CHECKS:**
+            - Are any Su values still in t/m²? If yes, they MUST be converted to kPa using factor 9.81
+            - Are Su values reasonable for the soil consistency described?
+            - Soft clay: 10-30 kPa | Medium clay: 30-60 kPa | Stiff clay: 60-120 kPa | Very stiff clay: >120 kPa
+            - If Su > 500 kPa for soft clay → MAJOR RED FLAG - likely unit error
+            - If Su < 10 kPa for stiff clay → MAJOR RED FLAG - likely unit error
+
+         2. **COMMON UNIT CONVERSION ERRORS TO IDENTIFY:**
+            - Using factor 10 instead of 9.81 for t/m² → kPa conversion
+            - Missing unit conversions (values still in original units)
+            - Wrong conversion factors applied
+
+         CRITICAL LAYER SPLITTING REVIEW:
+         3. **Su Value Consistency Within Layers:**
+            - **EXAMINE if layers with varying Su values were properly split**
+            - Check if any single layer contains Su values that vary by >30%
+            - **LAYER SPLITTING VALIDATION:**
+              * Were layers with Su variation >30% properly split?
+              * Were layers with Su ratio >2.0 (max/min) properly split?
+              * Are the new layer boundaries logical and well-defined?
+              * Does each sub-layer have consistent Su values?
+
+         4. **Layer Splitting Quality Control:**
+            - Verify that split layers have representative Su values (average or appropriate)
+            - Check that layer descriptions match the Su values assigned
+            - Ensure depth boundaries are clearly defined for split layers
+            - Validate that soil consistency matches the Su values in each sub-layer
+
+         Your senior engineer review responsibilities:
+
+         1. **UNIT CONVERSION VALIDATION (HIGHEST PRIORITY):**
+            - Su vs water content relationships for clay soils
+            - SPT N-values vs soil consistency correlations
+            - Strength ranges within expected bounds for each soil type
+            - Unit conversion accuracy (kPa, degrees, m) - ESPECIALLY t/m² to kPa using 9.81
+
+         2. **LAYER SPLITTING VALIDATION (HIGH PRIORITY):**
+            - Check if layers with varying Su values were appropriately split
+            - Verify consistency of Su values within each layer
+            - Validate that layer boundaries make engineering sense
+
+         3. **PARAMETER CONSISTENCY CHECKS:**
+            - Layer boundaries and transitions are logical
+            - Classification consistency across depth
+            - Parameter ranges match soil descriptions
+
+         4. **RED FLAGS TO IDENTIFY:**
+            - **CRITICAL:** Su values in wrong units (t/m² not converted to kPa)
+            - **CRITICAL:** Su values unreasonable for consistency (high Su with soft clay, low Su with stiff clay)
+            - **CRITICAL:** Single layer with highly variable Su values (>30% variation) not split
+            - High water content with very high Su (unusual for clay)
+            - Low water content with very low Su
+            - Soft consistency with high SPT N-values
+            - Hard consistency with low SPT N-values
+            - Strength values that appear incorrectly converted
+
+         5. **DECISION CRITERIA:**
+            - If you find ANY unit conversion errors: **REJECT and require re-investigation**
+            - If Su values are inconsistent with soil consistency: **REJECT and require re-investigation**
+            - If layers with varying Su values were not properly split: **REJECT and require re-investigation**
+            - If parameters look reasonable: Approve with confidence assessment
+            - If minor concerns exist: Approve with notes
+
+         LAYER SPLITTING REVIEW CHECKLIST:
+         ✓ Are there any layers with multiple different Su values?
+         ✓ Were layers with Su variation >30% properly split?
+         ✓ Were layers with Su ratio >2.0 properly split?
+         ✓ Do split layers have consistent Su values within each sub-layer?
+         ✓ Are layer descriptions updated to match the split layers?
+
+         UNIT CONVERSION VERIFICATION CHECKLIST FOR REVIEW:
+         ✓ Are ALL Su values properly converted to kPa?
+         ✓ Are Su values reasonable for the described soil consistency?
+         ✓ Has the correct factor (9.81) been used for t/m² → kPa conversion?
+         ✓ Are SPT N-values consistent with soil descriptions?
+
+         Provide your professional judgment on whether this analysis is acceptable or requires revision.
+         Be EXTREMELY specific about any unit conversion issues and layer splitting issues found and provide clear guidance for correction.
+
+         REMEMBER: Unit conversion errors and improper layer definition can lead to foundation failures. Be thorough.
+         """
+
+         try:
+             # Use our working LLMClient
+             response = self.llm_client.client.chat.completions.create(
+                 model=self.model,
+                 messages=[
+                     {"role": "system", "content": "You are a senior geotechnical engineer with extensive experience in complex foundation projects and rigorous quality control. You have seen foundation failures caused by unit conversion errors and are extremely vigilant about this issue."},
+                     {"role": "user", "content": review_prompt}
+                 ],
+                 temperature=0.1,
+                 max_tokens=2000
+             )
+
+             return response.choices[0].message.content
+
+         except Exception as e:
+             return f"Review failed: {str(e)}"
+
+     def _run_reinvestigation_direct(self, original_analysis: str, review_feedback: str) -> str:
+         """Run re-investigation using direct LLM calls"""
+
+         reinvestigation_prompt = f"""
+         Based on the senior engineer's review, re-investigate and address the following issues:
+
+         ORIGINAL ANALYSIS:
+         {original_analysis}
+
+         SENIOR REVIEW FEEDBACK:
+         {review_feedback}
+
+         RE-INVESTIGATION REQUIREMENTS:
+
+         **PRIORITY 1: UNIT CONVERSION CORRECTIONS**
+         If the senior engineer identified unit conversion issues:
+         1. **Su (Undrained Shear Strength) Corrections:**
+            - If values are in t/m², convert to kPa by multiplying by 9.81
+            - Double-check ALL conversion factors used
+            - Verify final Su values are reasonable for soil consistency
+
+         2. **Verify Conversion Factors:**
+            - t/m² → kPa: multiply by 9.81 (NOT 10)
+            - ksc → kPa: multiply by 98.0
+            - psi → kPa: multiply by 6.895
+            - MPa → kPa: multiply by 1000
+
+         **PRIORITY 2: LAYER SPLITTING CORRECTIONS**
+         If the senior engineer identified layer splitting issues:
+         1. **Su Value Variation Analysis:**
+            - Re-examine each layer for Su value consistency
+            - Calculate variation: (max Su - min Su) / average Su
+            - Calculate ratio: max Su / min Su
+
+         2. **Layer Splitting Protocol:**
+            - If Su variation >30% OR ratio >2.0: SPLIT the layer
+            - Create new layer boundaries at points where Su values change significantly
+            - Assign consistent Su values to each new sub-layer (use average within range)
+            - Update layer descriptions to reflect new boundaries
+
+         3. **New Layer Definition:**
+            - Each sub-layer should have Su values within 20-30% variation
+            - Update depth ranges for each new sub-layer
+            - Ensure soil consistency descriptions match Su values
+            - Verify layer transitions are logical
+
+         **GENERAL RE-INVESTIGATION:**
+         1. Address each specific concern raised by the senior engineer
+         2. Re-examine parameter relationships and correlations
+         3. Double-check ALL unit conversions and calculations
+         4. Provide revised analysis with explanations for changes
+         5. Justify any assumptions or interpretations made
+
+         **LAYER SPLITTING EXAMPLE:**
+         Original Layer: 2.0-6.0m, Clay, Su values: 15, 25, 45 kPa
+         Analysis: Variation = (45-15)/28 = 107% > 30%, Ratio = 45/15 = 3.0 > 2.0
+         Action: SPLIT INTO:
+         - Layer 2a: 2.0-4.0m, Soft Clay, Su = 20 kPa (average of 15, 25)
+         - Layer 2b: 4.0-6.0m, Medium Clay, Su = 45 kPa
+
+         **UNIT CONVERSION RE-CHECK PROTOCOL:**
+         ✓ Identify original units for each parameter
+         ✓ Apply correct conversion factors
+         ✓ Verify converted values are reasonable
+         ✓ Check consistency with soil descriptions
+
+         **LAYER SPLITTING RE-CHECK PROTOCOL:**
+         ✓ Check Su value variation within each layer
+         ✓ Split layers with >30% variation or >2.0 ratio
+         ✓ Assign representative Su values to sub-layers
+         ✓ Update layer descriptions and boundaries
+         ✓ Verify consistency between Su and soil consistency
+
+         Focus specifically on the issues identified in the review.
+         Provide a comprehensive revised analysis that addresses ALL concerns.
+
+         **Show your work:** For any corrections, clearly state:
+         - Original layer configuration
+         - Su values and their variation/ratio
+         - Splitting decision and rationale
+         - New layer boundaries and Su assignments
+         - Unit conversion details (original value, factor, final value)
+         - Verification that results are reasonable
+         """
+
+         try:
+             # Use our working LLMClient
+             response = self.llm_client.client.chat.completions.create(
+                 model=self.model,
+                 messages=[
+                     {"role": "system", "content": "You are an experienced geotechnical engineer conducting a thorough re-investigation based on senior engineer feedback. You are particularly focused on correcting any unit conversion errors identified."},
+                     {"role": "user", "content": reinvestigation_prompt}
+                 ],
+                 temperature=0.1,
+                 max_tokens=2000
+             )
+
+             return response.choices[0].message.content
+
+         except Exception as e:
+             return f"Re-investigation failed: {str(e)}"
+
+     def _run_final_review_direct(self, revised_analysis: str, previous_concerns: str) -> str:
+         """Run final review using direct LLM calls"""
+
+         final_review_prompt = f"""
+         Conduct final review of the re-investigated analysis:
+
+         REVISED ANALYSIS:
+         {revised_analysis}
+
+         PREVIOUS CONCERNS:
+         {previous_concerns}
+
+         Final validation requirements:
+         1. Confirm all previous concerns have been adequately addressed
+         2. Verify parameter consistency and engineering reasonableness
+         3. Check that explanations are technically sound
+         4. Provide final approval or additional guidance if still needed
+
+         Make final determination: APPROVED or REQUIRES FURTHER WORK
+         """
+
+         try:
+             # Use our working LLMClient
+             response = self.llm_client.client.chat.completions.create(
+                 model=self.model,
+                 messages=[
+                     {"role": "system", "content": "You are a senior geotechnical engineer conducting final validation with authority to approve or reject the analysis."},
+                     {"role": "user", "content": final_review_prompt}
+                 ],
+                 temperature=0.1,
+                 max_tokens=2000
+             )
+
+             return response.choices[0].message.content
+
+         except Exception as e:
+             return f"Final review failed: {str(e)}"
+
+     def run_geotechnical_analysis(self, soil_data: Dict[str, Any]) -> Dict[str, Any]:
+         """Run the complete two-agent geotechnical analysis workflow using direct LLM calls"""
+
+         # Handle case where no LLM is available (testing mode)
+         if not self.has_api_key:
+             return self._mock_analysis_for_testing(soil_data)
+
+         try:
+             # Run initial analysis using direct LLM call
+             analysis_result = self._run_geotech_analysis_direct(soil_data)
+
+             if "failed:" in analysis_result:
+                 return {
+                     "error": f"Initial analysis failed: {analysis_result}",
+                     "status": "error",
+                     "workflow": "failed"
+                 }
+
+             # Run review using direct LLM call
+             review_result = self._run_senior_review_direct(analysis_result)
+
+             if "failed:" in review_result:
+                 return {
+                     "error": f"Review failed: {review_result}",
+                     "status": "error",
+                     "workflow": "failed"
+                 }
+
+             # Check if re-investigation is needed based on review content
+             review_text = review_result.lower()
+             needs_reinvestigation = any(keyword in review_text for keyword in [
+                 "re-investigate", "reject", "inconsistent", "unusual", "verify", "additional testing",
+                 "requires revision", "not acceptable", "re-examination", "concerning", "requires further work"
+             ])
+
+             if needs_reinvestigation:
+                 # Run re-investigation
+                 final_analysis = self._run_reinvestigation_direct(analysis_result, review_result)
+
+                 if "failed:" in final_analysis:
+                     return {
+                         "error": f"Re-investigation failed: {final_analysis}",
+                         "status": "error",
+                         "workflow": "failed"
+                     }
+
+                 # Final review of re-investigation
+                 final_review = self._run_final_review_direct(final_analysis, review_result)
+
+                 if "failed:" in final_review:
+                     return {
+                         "error": f"Final review failed: {final_review}",
+                         "status": "error",
+                         "workflow": "failed"
+                     }
+
+                 return {
+                     "initial_analysis": analysis_result,
+                     "initial_review": review_result,
+                     "reinvestigation": final_analysis,
+                     "final_review": final_review,
+                     "status": "completed_with_revision",
+                     "workflow": "two_stage_review_direct_llm"
+                 }
+
+             else:
+                 return {
+                     "analysis": analysis_result,
+                     "review": review_result,
+                     "status": "approved",
+                     "workflow": "single_stage_review_direct_llm"
+                 }
+
+         except Exception as e:
+             return {
+                 "error": f"CrewAI analysis failed: {str(e)}",
+                 "status": "error",
+                 "workflow": "failed"
+             }
+
+     def _mock_analysis_for_testing(self, soil_data: Dict[str, Any]) -> Dict[str, Any]:
+         """Provides mock analysis for testing when no API key is available"""
+
+         # Simulate comprehensive analysis based on input data
+         project_name = soil_data.get('project_info', {}).get('project_name', 'Unknown Project')
+         boring_id = soil_data.get('project_info', {}).get('boring_id', 'Unknown Boring')
+         soil_layers = soil_data.get('soil_layers', [])
+
+         # Generate realistic mock analysis
+         mock_analysis = f"""
+         GEOTECHNICAL ANALYSIS REPORT - {project_name} ({boring_id})
+
+         EXECUTIVE SUMMARY:
+         Analyzed {len(soil_layers)} soil layers from the boring log data.
+         Overall soil conditions appear consistent with typical soil behavior patterns.
+
+         LAYER ANALYSIS:
+         """
+
+         for i, layer in enumerate(soil_layers, 1):
+             soil_type = layer.get('soil_type', 'Unknown')
+             consistency = layer.get('consistency', 'Unknown')
+             depth_from = layer.get('depth_from', 0)
+             depth_to = layer.get('depth_to', 0)
+             strength_value = layer.get('strength_value', 'N/A')
+             strength_unit = layer.get('strength_unit', '')
+
+             mock_analysis += f"""
+         Layer {i} ({depth_from}-{depth_to}m): {soil_type.title()}
+         - Consistency: {consistency}
+         - Strength: {strength_value} {strength_unit}
+         - Classification appears reasonable for {consistency} {soil_type}
+         - Depth continuity validated ✓
+         """
+
+         mock_analysis += """
+
+         VALIDATION CHECKS:
+         ✓ Layer depth continuity confirmed
+         ✓ Strength parameters within expected ranges
+         ✓ Soil classification consistency verified
+         ✓ Unit conversions validated
+
+         RECOMMENDATIONS:
+         - Data appears consistent and suitable for preliminary design
+         - Standard geotechnical correlations apply
+         - Consider additional testing for final design if required
+         """
+
+         mock_review = f"""
+         SENIOR ENGINEER REVIEW - {project_name}
+
+         I have reviewed the geotechnical analysis and findings:
+
+         TECHNICAL REVIEW:
+         - Analysis methodology is sound and follows standard practices
+         - Parameter correlations are reasonable and well-documented
+         - Soil classification is consistent with strength parameters
+         - Depth boundaries and layer transitions are appropriate
+
+         QUALITY ASSURANCE:
+         - All calculations have been verified
+         - Unit conversions are correct
+         - Data consistency checks passed
+         - Engineering correlations within acceptable ranges
+
+         APPROVAL STATUS: ✅ APPROVED
+
+         The analysis meets professional standards and is suitable for use in geotechnical design.
+         No re-investigation required at this time.
+
+         Senior Geotechnical Engineer Review Complete.
+         """
+
+         return {
+             "status": "approved",
+             "workflow": "two_agent_review",
+             "analysis": mock_analysis.strip(),
+             "review": mock_review.strip(),
+             "summary": f"Mock analysis completed for {len(soil_layers)} soil layers. All parameters validated.",
+             "timestamp": "2024-06-26T10:00:00Z",
+             "system": "CrewAI Mock System (No API Key)",
+             "reinvestigation_required": False
+         }
+
+ # Example usage function
+ def analyze_soil_with_crewai(soil_data: Dict[str, Any]) -> Dict[str, Any]:
+     """Main function to run CrewAI-based geotechnical analysis"""
+     system = CrewAIGeotechSystem()
+     return system.run_geotechnical_analysis(soil_data)
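+
+ # Minimal sketch (not part of the original file): the two numeric rules the
+ # prompts above insist on, written out as code. The names SU_TO_KPA,
+ # su_to_kpa, and should_split_layer are hypothetical helpers shown only to
+ # make the arithmetic concrete; the agents apply these rules via the prompts.
+ SU_TO_KPA = {"t/m2": 9.81, "ksc": 98.0, "psi": 6.895, "MPa": 1000.0, "tsf": 95.76}
+
+ def su_to_kpa(value: float, unit: str) -> float:
+     """Convert an Su reading to kPa, e.g. su_to_kpa(3.0, "t/m2") == 29.43."""
+     return value * SU_TO_KPA[unit]
+
+ def should_split_layer(su_values: List[float]) -> bool:
+     """Split when variation exceeds 30% or max/min exceeds 2.0.
+     e.g. [15, 25, 45] kPa: variation about 106%, ratio 3.0, so split."""
+     avg = sum(su_values) / len(su_values)
+     return (max(su_values) - min(su_values)) / avg > 0.30 or max(su_values) / min(su_values) > 2.0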
document_processor.py ADDED
@@ -0,0 +1,111 @@
+ import PyPDF2
+ from PIL import Image
+ import base64
+ import io
+ import streamlit as st
+
+ try:
+     from pdf2image import convert_from_path
+     PDF2IMAGE_AVAILABLE = True
+ except ImportError:
+     PDF2IMAGE_AVAILABLE = False
+     st.warning("⚠️ pdf2image not available. PDF to image conversion will be limited.")
+
+ class DocumentProcessor:
+     def __init__(self):
+         pass
+
+     def extract_text_from_pdf(self, pdf_file):
+         """Extract text content from PDF file"""
+         try:
+             pdf_reader = PyPDF2.PdfReader(pdf_file)
+             text = ""
+             for page in pdf_reader.pages:
+                 text += page.extract_text() + "\n"
+             return text
+         except Exception as e:
+             st.error(f"Error extracting text from PDF: {str(e)}")
+             return None
+
+     def convert_pdf_to_images(self, pdf_file):
+         """Convert PDF pages to images"""
+         if not PDF2IMAGE_AVAILABLE:
+             st.warning("PDF to image conversion not available. Install poppler-utils and pdf2image.")
+             return None
+
+         try:
+             images = convert_from_path(pdf_file, dpi=200)
+             return images
+         except Exception as e:
+             st.error(f"Error converting PDF to images: {str(e)}")
+             return None
+
+     def image_to_base64(self, image):
+         """Convert PIL image to base64 string for API"""
+         try:
+             if isinstance(image, str):
+                 with open(image, "rb") as img_file:
+                     return base64.b64encode(img_file.read()).decode('utf-8')
+             else:
+                 buffered = io.BytesIO()
+                 image.save(buffered, format="PNG")
+                 return base64.b64encode(buffered.getvalue()).decode('utf-8')
+         except Exception as e:
+             st.error(f"Error converting image to base64: {str(e)}")
+             return None
+
+     def process_uploaded_file(self, uploaded_file):
+         """Process uploaded file (PDF or image)"""
+         if uploaded_file is None:
+             return None, None, None
+
+         file_type = uploaded_file.type
+
+         if file_type == "application/pdf":
+             # Extract text
+             text_content = self.extract_text_from_pdf(uploaded_file)
+
+             # Convert to images for visual analysis (if available)
+             images = None
+             image_base64 = None
+
+             if PDF2IMAGE_AVAILABLE:
+                 try:
+                     import tempfile
+                     import os
+
+                     # Use temporary file to avoid conflicts
+                     with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as temp_pdf:
+                         temp_pdf.write(uploaded_file.getbuffer())
+                         temp_pdf_path = temp_pdf.name
+
+                     try:
+                         images = self.convert_pdf_to_images(temp_pdf_path)
+
+                         # Convert first page to base64 for LLM analysis
+                         if images and len(images) > 0:
+                             image_base64 = self.image_to_base64(images[0])
+                     finally:
+                         # Clean up temporary file
+                         if os.path.exists(temp_pdf_path):
+                             os.unlink(temp_pdf_path)
+
+                 except Exception as e:
+                     st.warning(f"PDF to image conversion failed: {str(e)}. Using text analysis only.")
+
+             return text_content, images, image_base64
+
+         elif file_type in ["image/jpeg", "image/png", "image/jpg"]:
+             # For image files
+             try:
+                 image = Image.open(uploaded_file)
+                 image_base64 = self.image_to_base64(image)
+
+                 return None, [image], image_base64
+             except Exception as e:
+                 st.error(f"Error processing image file: {str(e)}")
+                 return None, None, None
+
+         else:
+             st.error("Unsupported file type. Please upload PDF or image files.")
+             return None, None, None
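+
+ # Illustrative call pattern (hypothetical, mirroring how a Streamlit app
+ # would use this class):
+ #   processor = DocumentProcessor()
+ #   text, images, image_b64 = processor.process_uploaded_file(uploaded_file)
+ #   # PDFs yield (text, page_images, first_page_base64); images yield (None, [image], base64).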
langgraph_agent.py ADDED
@@ -0,0 +1,214 @@
+ from langgraph.graph import StateGraph, END
+ from langchain.schema import BaseMessage, HumanMessage, AIMessage
+ from typing import TypedDict, List, Dict, Any
+ import json
+ from llm_client import LLMClient
+ from soil_analyzer import SoilLayerAnalyzer
+
+ class AgentState(TypedDict):
+     messages: List[BaseMessage]
+     soil_data: Dict[str, Any]
+     analysis_results: Dict[str, Any]
+     user_feedback: str
+     current_task: str
+     iteration_count: int
+     text_content: str
+     image_base64: str
+
+ class SoilAnalysisAgent:
+     def __init__(self):
+         # Initialize with None client - will be set when needed
+         self.llm_client = None
+         self.soil_analyzer = SoilLayerAnalyzer()
+         self.graph = self._create_graph()
+
+     def _create_graph(self):
+         """Create the LangGraph workflow"""
+         workflow = StateGraph(AgentState)
+
+         # Add nodes
+         workflow.add_node("analyze_document", self._analyze_document)
+         workflow.add_node("validate_layers", self._validate_layers)
+         workflow.add_node("optimize_layers", self._optimize_layers)
+         workflow.add_node("generate_insights", self._generate_insights)
+         workflow.add_node("handle_feedback", self._handle_feedback)
+
+         # Add edges
+         workflow.add_edge("analyze_document", "validate_layers")
+         workflow.add_edge("validate_layers", "optimize_layers")
+         workflow.add_edge("optimize_layers", "generate_insights")
+         workflow.add_conditional_edges(
+             "generate_insights",
+             self._should_handle_feedback,
+             {
+                 "feedback": "handle_feedback",
+                 "end": END
+             }
+         )
+         workflow.add_edge("handle_feedback", "validate_layers")
+
+         # Set entry point
+         workflow.set_entry_point("analyze_document")
+
+         return workflow.compile()
+
+     def _analyze_document(self, state: AgentState) -> AgentState:
+         """Analyze the soil boring log document"""
+         # Extract document content from state
+         document_content = state.get("text_content")
+         image_content = state.get("image_base64")
+
+         # Analyze using LLM
+         soil_data = self.llm_client.analyze_soil_boring_log(
+             text_content=document_content,
+             image_base64=image_content
+         )
+
+         state["soil_data"] = soil_data
+         state["current_task"] = "document_analysis"
+         state["messages"].append(AIMessage(content="Document analysis completed"))
+
+         return state
+
+     def _validate_layers(self, state: AgentState) -> AgentState:
+         """Validate soil layer continuity and consistency"""
+         soil_data = state["soil_data"]
+
+         if "soil_layers" in soil_data:
+             # Validate layer continuity
+             validated_layers = self.soil_analyzer.validate_layer_continuity(
+                 soil_data["soil_layers"]
+             )
+
+             soil_data["soil_layers"] = validated_layers
+
+             # Calculate statistics
+             stats = self.soil_analyzer.calculate_layer_statistics(validated_layers)
+             state["analysis_results"] = {"validation_stats": stats}
+
+         state["current_task"] = "layer_validation"
+         state["messages"].append(AIMessage(content="Layer validation completed"))
+
+         return state
+
+     def _optimize_layers(self, state: AgentState) -> AgentState:
+         """Optimize layer division by merging/splitting as needed"""
+         soil_data = state["soil_data"]
+
+         if "soil_layers" in soil_data:
+             optimization_results = self.soil_analyzer.optimize_layer_division(
+                 soil_data["soil_layers"]
+             )
+
+             state["analysis_results"]["optimization"] = optimization_results
+
+         state["current_task"] = "layer_optimization"
+         state["messages"].append(AIMessage(content="Layer optimization completed"))
+
+         return state
+
+     def _generate_insights(self, state: AgentState) -> AgentState:
+         """Generate insights and recommendations"""
+         soil_data = state["soil_data"]
+         analysis_results = state["analysis_results"]
+
+         # Generate insights using LLM
+         insights_prompt = f"""
+         Based on the soil boring log analysis, provide geotechnical insights and recommendations:
+
+         Soil Data: {json.dumps(soil_data, indent=2)}
+         Analysis Results: {json.dumps(analysis_results, indent=2)}
+
+         Please provide:
+         1. Key geotechnical findings
+         2. Foundation recommendations
+         3. Construction considerations
+         4. Potential risks or concerns
+         5. Recommended additional testing
+         """
+
+         try:
+             response = self.llm_client.client.chat.completions.create(
+                 model=self.llm_client.model,
+                 messages=[{"role": "user", "content": insights_prompt}],
+                 max_tokens=1000,
+                 temperature=0.3
+             )
+
+             insights = response.choices[0].message.content
+             state["analysis_results"]["insights"] = insights
+
+         except Exception as e:
+             state["analysis_results"]["insights"] = f"Error generating insights: {str(e)}"
+
+         state["current_task"] = "insight_generation"
+         state["messages"].append(AIMessage(content="Insights generation completed"))
+
+         return state
+
+     def _handle_feedback(self, state: AgentState) -> AgentState:
+         """Handle user feedback and refine analysis"""
+         user_feedback = state.get("user_feedback", "")
+         soil_data = state["soil_data"]
+
+         if user_feedback:
+             # Refine soil layers based on feedback
+             refined_data = self.llm_client.refine_soil_layers(soil_data, user_feedback)
+
+             if "error" not in refined_data:
+                 state["soil_data"] = refined_data
+
+         state["current_task"] = "feedback_handling"
+         state["iteration_count"] = state.get("iteration_count", 0) + 1
+         state["messages"].append(AIMessage(content=f"Feedback processed (iteration {state['iteration_count']})"))
+
+         return state
+
+     def _should_handle_feedback(self, state: AgentState) -> str:
+         """Determine if feedback should be handled"""
+         if state.get("user_feedback") and state.get("iteration_count", 0) < 3:
+             return "feedback"
+         return "end"
+
+     def run_analysis(self, text_content=None, image_base64=None, user_feedback=None):
+         """Run the complete soil analysis workflow"""
+
+         # Prepare initial state - store content in state instead of message
+         initial_message = HumanMessage(content="Starting soil boring log analysis")
+
+         initial_state = {
+             "messages": [initial_message],
+             "soil_data": {},
+             "analysis_results": {},
+             "user_feedback": user_feedback or "",
+             "current_task": "initialization",
+             "iteration_count": 0,
+             "text_content": text_content,
+             "image_base64": image_base64
+         }
+
+         # Run the graph
+         result = self.graph.invoke(initial_state)
+
+         return {
+             "soil_data": result["soil_data"],
+             "analysis_results": result["analysis_results"],
+             "messages": result["messages"],
+             "current_task": result["current_task"],
+             "iteration_count": result["iteration_count"]
+         }
+
+     def process_feedback(self, current_state, feedback):
+         """Process user feedback and continue analysis"""
+         current_state["user_feedback"] = feedback
+
+         # Continue from feedback handling
+         result = self.graph.invoke(current_state, {"recursion_limit": 10})
+
+         return {
+             "soil_data": result["soil_data"],
+             "analysis_results": result["analysis_results"],
+             "messages": result["messages"],
+             "current_task": result["current_task"],
+             "iteration_count": result["iteration_count"]
+         }
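+
+ # Workflow shape compiled in _create_graph, for reference:
+ #   analyze_document -> validate_layers -> optimize_layers -> generate_insights
+ #   generate_insights -> handle_feedback -> validate_layers   (when user_feedback is set and iteration_count < 3)
+ #   generate_insights -> END                                  (otherwise)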
llm_client.py ADDED
@@ -0,0 +1,588 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import openai
2
+ import json
3
+ import streamlit as st
4
+ from config import LLM_PROVIDERS, AVAILABLE_MODELS, get_default_provider_and_model
5
+ from soil_calculations import SoilCalculations
6
+
7
+ class LLMClient:
8
+ def __init__(self, model=None, api_key=None, provider=None):
9
+ # Get defaults if not provided
10
+ if not provider or not model:
11
+ default_provider, default_model = get_default_provider_and_model()
12
+ self.provider = provider or default_provider
13
+ self.model = model or default_model
14
+ else:
15
+ self.provider = provider
16
+ self.model = model
17
+
18
+ self.api_key = api_key
19
+
20
+ # Only create client if we have API key and provider
21
+ if not self.api_key or not self.provider:
22
+ self.client = None
23
+ self.calculator = SoilCalculations()
24
+ return
25
+
26
+ # Get provider configuration
27
+ provider_config = LLM_PROVIDERS.get(self.provider, {})
28
+ base_url = provider_config.get("base_url", "https://openrouter.ai/api/v1")
29
+
30
+ self.client = openai.OpenAI(
31
+ base_url=base_url,
32
+ api_key=self.api_key,
33
+ )
34
+ self.calculator = SoilCalculations()
35
+
36
+ def _supports_images(self) -> bool:
37
+ """Check if the current model supports image inputs"""
38
+ model_info = AVAILABLE_MODELS.get(self.model, {})
39
+ return model_info.get('supports_images', False)
40
+
+     def analyze_soil_boring_log(self, text_content=None, image_base64=None):
+         """Analyze soil boring log using LLM"""
+
+         # Guard: fail fast when no API key/provider was configured
+         if self.client is None:
+             st.error("❌ No LLM client configured. Please provide an API key and provider in the sidebar.")
+             return {"error": "LLM client not configured"}
+
+         # Standardize units in text content before analysis
+         if text_content:
+             text_content, unit_conversions = self.calculator.standardize_units(text_content)
+             if unit_conversions:
+                 st.info(f"📏 Converted units: {', '.join([f'{k}→{v}' for k, v in unit_conversions.items()])}")
+
+         system_prompt = """You are an expert geotechnical engineer specializing in soil boring log interpretation.
+
+ IMPORTANT: You must respond with ONLY valid JSON data. Do not include any text before or after the JSON.
+
+ SAMPLE TYPE IDENTIFICATION (CRITICAL - FOLLOW EXACT ORDER):
+
+ **STEP 1 - FIRST COLUMN STRATIFICATION SYMBOLS (ABSOLUTE HIGHEST PRIORITY):**
+ ALWAYS look at the FIRST COLUMN of each layer for stratification symbols:
+
+ - **SS-1, SS-2, SS-18, SS18, SS-5** → SS (Split Spoon) sample
+ - **ST-1, ST-2, ST-5, ST5, ST-12** → ST (Shelby Tube) sample
+ - **SS1, SS2, SS3** (without dash) → SS sample
+ - **ST1, ST2, ST3** (without dash) → ST sample
+ - **Look for pattern: [SS|ST][-]?[0-9]+** in first column
+
+ **EXAMPLES of First Column Recognition:**
+ ```
+ SS-18 | Brown clay, N=8        → sample_type="SS" (SS-18 in first column)
+ ST-5  | Gray clay, Su=45 kPa   → sample_type="ST" (ST-5 in first column)
+ SS12  | Sandy clay, SPT test   → sample_type="SS" (SS12 in first column)
+ ST3   | Soft clay, unconfined  → sample_type="ST" (ST3 in first column)
+ ```
+
+ **STEP 2 - If NO first column symbols, then check description keywords:**
+ - SS indicators: "split spoon", "SPT", "standard penetration", "disturbed"
+ - ST indicators: "shelby", "tube", "undisturbed", "UT", "unconfined compression"
+
+ **STEP 3 - If still unclear, use strength parameter type:**
+ - SPT-N values present → likely SS sample
+ - Su values from unconfined test → likely ST sample
+
+ CRITICAL SOIL CLASSIFICATION RULES (MANDATORY):
+
+ **SAND LAYER CLASSIFICATION REQUIREMENTS:**
+ 1. **Sand layers MUST have sieve analysis evidence** - Look for:
+    - "Sieve #200: X% passing" or "#200 passing: X%"
+    - "Fines content: X%" (same as sieve #200)
+    - "Particle size analysis" or "gradation test"
+    - "% passing 0.075mm" (equivalent to #200 sieve)
+
+ 2. **Classification Rules**:
+    - Sieve #200 >50% passing → CLAY (fine-grained)
+    - Sieve #200 <50% passing → SAND/GRAVEL (coarse-grained)
+
+ 3. **NO SIEVE ANALYSIS = ASSUME CLAY (MANDATORY)**:
+    - If no sieve analysis data found → ALWAYS classify as CLAY
+    - Include note: "Assumed clay - no sieve analysis data available"
+    - Set sieve_200_passing: null (not a number)
+
+ **CRITICAL**: Never classify as sand/silt without explicit sieve analysis evidence
+ **CRITICAL**: Always look for sieve #200 data before classifying as sand
+
+ CRITICAL SS/ST SAMPLE RULES (MUST FOLLOW):
+
+ FOR SS (Split Spoon) SAMPLES:
+ 1. ALWAYS use RAW N-VALUE (not N-corrected, N-correction, or adjusted N)
+ 2. Look for: "N = 15", "SPT-N = 8", "raw N = 20", "field N = 12"
+ 3. IGNORE: "N-corrected = 25", "N-correction = 18", "adjusted N = 30"
+ 4. For clay: Use SPT-N parameter (will be converted to Su using Su=5*N)
+ 5. For sand/silt: Use SPT-N parameter (will be converted to friction angle)
+ 6. NEVER use unconfined compression Su values for SS samples - ONLY use N values
+
+ FOR ST (Shelby Tube) SAMPLES:
+ 1. ALWAYS USE DIRECT Su values from unconfined compression test
+ 2. If ST sample has Su value (e.g., "Su = 25 kPa"), use that EXACT value
+ 3. NEVER convert SPT-N to Su for ST samples when direct Su is available
+ 4. Priority: Direct Su measurement > any other value
+
+ EXTRACTION PRIORITY FOR SS SAMPLES:
+ 1. Raw N, Field N, Measured N (highest priority)
+ 2. N-value without "corrected" or "correction" terms
+ 3. General SPT-N value (lowest priority)
+ 4. NEVER use Su from unconfined compression for SS samples
+
+ CRITICAL UNIT CONVERSION REQUIREMENTS (MUST APPLY):
+
+ **MANDATORY SU UNIT CONVERSION - READ FROM IMAGE/FILE:**
+ When extracting Su values from images or text, you MUST convert to kPa BEFORE using the value:
+
+ 1. **ksc or kg/cm²**: Su_kPa = Su_ksc × 98.0
+    Example: "Su = 2.5 ksc" → strength_value: 245 (not 2.5)
+
+ 2. **t/m² (tonnes/m²)**: Su_kPa = Su_tonnes × 9.81
+    Example: "Su = 3.0 t/m²" → strength_value: 29.43 (not 3.0)
+
+ 3. **psi**: Su_kPa = Su_psi × 6.895
+    Example: "Su = 50 psi" → strength_value: 344.75 (not 50)
+
+ 4. **psf**: Su_kPa = Su_psf × 0.048
+    Example: "Su = 1000 psf" → strength_value: 48 (not 1000)
+
+ 5. **kPa**: Use directly (no conversion needed)
+    Example: "Su = 75 kPa" → strength_value: 75
+
+ 6. **MPa**: Su_kPa = Su_MPa × 1000
+    Example: "Su = 0.1 MPa" → strength_value: 100 (not 0.1)
+
+ **IMPORTANT**: Always include original unit in description for verification
+ **SPT-N values**: Keep as-is (no unit conversion needed)
+
+ CRITICAL SU-WATER CONTENT VALIDATION (MANDATORY):
+
+ **EXTRACT WATER CONTENT WHEN AVAILABLE:**
+ Always extract water content (w%) when mentioned in the description:
+ - "water content = 25%" → water_content: 25
+ - "w = 30%" → water_content: 30
+ - "moisture content 35%" → water_content: 35
+
+ **VALIDATE SU-WATER CONTENT CORRELATION:**
+ For clay layers, Su and water content should correlate reasonably:
+ - Very soft clay: Su < 25 kPa, w% > 40%
+ - Soft clay: Su 25-50 kPa, w% 30-40%
+ - Medium clay: Su 50-100 kPa, w% 20-30%
+ - Stiff clay: Su 100-200 kPa, w% 15-25%
+ - Very stiff clay: Su 200-400 kPa, w% 10-20%
+ - Hard clay: Su > 400 kPa, w% < 15%
+
+ **CRITICAL UNIT CHECK SCENARIOS:**
+ - If Su > 1000 kPa with w% > 20%: CHECK if Su is in wrong units (psi, psf?)
+ - If Su < 5 kPa with w% < 15%: CHECK if Su is in wrong units (MPa, bar?)
+ - If correlation seems very off: VERIFY unit conversion was applied correctly
+
+ CRITICAL OUTPUT FORMAT (MANDATORY):
+
+ You MUST respond with ONLY a valid JSON object. Do not include:
+ - Explanatory text before or after the JSON
+ - Markdown formatting (```json ```)
+ - Comments or notes
+ - Multiple JSON objects
+
+ Start your response directly with { and end with }
+
+ LAYER GROUPING REQUIREMENTS:
+ 1. MAXIMUM 7 LAYERS TOTAL - Group similar adjacent layers to achieve this limit
+ 2. CLAY AND SAND MUST BE SEPARATE - Never combine clay layers with sand layers
+ 3. Group adjacent layers with similar properties (same soil type and similar consistency)
+ 4. Prioritize engineering significance over minor variations
+
+ Analyze the provided soil boring log and extract the following information in this exact JSON format:
+
+ {
+     "project_info": {
+         "project_name": "string",
+         "boring_id": "string",
+         "location": "string",
+         "date": "string",
+         "depth_total": 10.0
+     },
+     "soil_layers": [
+         {
+             "layer_id": 1,
+             "depth_from": 0.0,
+             "depth_to": 2.5,
+             "soil_type": "clay",
+             "description": "Brown silty clay, ST sample, Su = 25 kPa",
+             "sample_type": "ST",
+             "strength_parameter": "Su",
+             "strength_value": 25,
+             "sieve_200_passing": 65,
+             "water_content": 35.5,
+             "color": "brown",
+             "moisture": "moist",
+             "consistency": "soft",
+             "su_source": "Unconfined Compression Test"
+         }
+     ],
+     "water_table": {
+         "depth": 3.0,
+         "date_encountered": "2024-01-01"
+     },
+     "notes": "Additional observations"
+ }
+
+ EXAMPLES OF CORRECT PROCESSING WITH UNIT CONVERSION AND SOIL CLASSIFICATION:
+
+ **SS SAMPLE EXAMPLES:**
+ 1. "SS-18: Clay layer, N = 8, Su = 45 kPa from unconfined test"
+    → Use: sample_type="SS", strength_parameter="SPT-N", strength_value=8
+    → IGNORE the Su=45 kPa value for SS samples
+
+ 2. "SS18: Soft clay, field N = 6, N-corrected = 10"
+    → Use: sample_type="SS", strength_parameter="SPT-N", strength_value=6 (raw N)
+    → IGNORE N-corrected value
+
+ **ST SAMPLE EXAMPLES WITH UNIT CONVERSION:**
+ 1. "ST-5: Stiff clay, Su = 85 kPa from unconfined compression"
+    → Use: sample_type="ST", strength_parameter="Su", strength_value=85
+
+ 2. "ST-12: Medium clay, Su = 2.5 ksc from unconfined test"
+    → Convert: 2.5 × 98 = 245 kPa
+    → Use: sample_type="ST", strength_parameter="Su", strength_value=245
+
+ 3. "ST sample: Clay, unconfined strength = 3.0 t/m²"
+    → Convert: 3.0 × 9.81 = 29.43 kPa
+    → Use: sample_type="ST", strength_parameter="Su", strength_value=29.43
+
+ **SOIL CLASSIFICATION EXAMPLES:**
+ 1. "Brown silty clay, no sieve analysis data"
+    → soil_type="clay", sieve_200_passing=null
+    → Note: "Assumed clay - no sieve analysis data available"
+
+ 2. "Sandy clay, sieve #200: 75% passing"
+    → soil_type="clay", sieve_200_passing=75
+    → Classification: Clay (>50% passing)
+
+ 3. "Medium sand, gradation test shows 25% passing #200"
+    → soil_type="sand", sieve_200_passing=25
+    → Classification: Sand (<50% passing)
+
+ 4. "Dense sand layer" (NO sieve data mentioned)
+    → soil_type="clay", sieve_200_passing=null
+    → Note: "Assumed clay - no sieve analysis data available"
+    → NEVER classify as sand without sieve data
+
+ CRITICAL LAYER GROUPING RULES:
+ 1. MAXIMUM 7 LAYERS - If you identify more than 7 distinct zones, group adjacent similar layers
+ 2. SEPARATE CLAY/SAND - Never group clay with sand, silt, or gravel layers
+ 3. Group similar adjacent layers:
+    - Combine "soft clay" + "soft clay" into one "soft clay" layer
+    - Combine "medium sand" + "medium sand" into one "medium sand" layer
+    - Combine layers with similar strength values (within 30% difference)
+ 4. Maintain engineering significance:
+    - Keep layers with significantly different strength parameters separate
+    - Preserve important transitions (e.g., clay to sand interface)
+    - Maintain water table interfaces as layer boundaries when significant
+
+ TECHNICAL RULES:
+ 1. All numeric values must be numbers, not strings
+ 2. For soil_type, use basic terms: "clay", "sand", "silt", "gravel" - do NOT include consistency
+ 3. Include sample_type field: "SS" (Split Spoon) or "ST" (Shelby Tube)
+ 4. Include sieve_200_passing field when available (percentage passing sieve #200)
+ 5. Include water_content field when available (percentage water content for clay consistency checks)
+ 6. Include su_source field: "Unconfined Compression Test" for direct measurements, or "Calculated from SPT-N" for conversions
+ 7. Strength parameters:
+    - SS samples: ALWAYS use "SPT-N" with RAW N-value (will be converted based on soil type)
+    - ST samples with clay: Use "Su" with DIRECT value in kPa from unconfined compression test
+    - For sand/gravel: Always use "SPT-N" with N-value
+    - NEVER use Su for SS samples, NEVER calculate Su from SPT-N for ST samples that have direct Su
+ 8. Put consistency separately in "consistency" field: "soft", "medium", "stiff", "loose", "dense", etc.
+ 9. Ensure continuous depths (no gaps or overlaps)
+ 10. All depths in meters, strength values as numbers
+ 11. Return ONLY the JSON object, no additional text
+
+ GROUPING EXAMPLES:
+ - Original: [0-2m soft clay, 2-4m soft clay, 4-6m medium sand, 6-8m medium sand]
+ - Grouped: [0-4m soft clay, 4-8m medium sand] (4 layers reduced to 2)
+
+ STRENGTH PARAMETER EXAMPLES:
+ - SS sample: "Clay, N = 8 blows, Su = 40 kPa unconfined" → Use SPT-N = 8 (IGNORE Su for SS)
+ - ST sample: "Clay, Su = 45 kPa from unconfined test" → Use Su = 45 (DIRECT measurement)
+ - SS sample: "Clay, field N = 12, N-corrected = 18" → Use SPT-N = 12 (raw N, IGNORE corrected)"""
+
+         messages = [{"role": "system", "content": system_prompt}]
+
+         # Check if model supports images
+         supports_images = self._supports_images()
+
+         if text_content:
+             messages.append({
+                 "role": "user",
+                 "content": f"Please analyze this soil boring log text:\n\n{text_content}"
+             })
+
+         if image_base64 and supports_images:
+             messages.append({
+                 "role": "user",
+                 "content": [
+                     {
+                         "type": "text",
+                         "text": "Please analyze this soil boring log image:"
+                     },
+                     {
+                         "type": "image_url",
+                         "image_url": {
+                             "url": f"data:image/png;base64,{image_base64}"
+                         }
+                     }
+                 ]
+             })
+         elif image_base64 and not supports_images:
+             # Model doesn't support images; notify the user and continue with text only
+             model_name = AVAILABLE_MODELS.get(self.model, {}).get('name', self.model)
+             st.warning(f"⚠️ {model_name} doesn't support image analysis. Using text content only.")
+             if not text_content:
+                 st.error("❌ No text content available for analysis. Please ensure your document has extractable text or use a model that supports images.")
+                 return {"error": "No text content available and model doesn't support images"}
+
+         try:
+             response = self.client.chat.completions.create(
+                 model=self.model,
+                 messages=messages,
+                 max_tokens=2000,
+                 temperature=0.1
+             )
+
+             content = response.choices[0].message.content
+
+             # Try to extract JSON from the response
+             try:
+                 json_str = content.strip()
+
+                 # Remove markdown code blocks if present
+                 if "```json" in json_str:
+                     json_start = json_str.find("```json") + 7
+                     json_end = json_str.find("```", json_start)
+                     json_str = json_str[json_start:json_end].strip()
+                 elif "```" in json_str:
+                     # Remove any code blocks
+                     json_start = json_str.find("```") + 3
+                     json_end = json_str.rfind("```")
+                     if json_end > json_start:
+                         json_str = json_str[json_start:json_end].strip()
+
+                 # Find JSON object boundaries
+                 if not json_str.startswith("{"):
+                     start_idx = json_str.find("{")
+                     if start_idx != -1:
+                         json_str = json_str[start_idx:]
+
+                 if not json_str.endswith("}"):
+                     end_idx = json_str.rfind("}")
+                     if end_idx != -1:
+                         json_str = json_str[:end_idx + 1]
+
+                 # Parse JSON
+                 result = json.loads(json_str)
+
+                 # Validate required structure
+                 if "soil_layers" not in result:
+                     result["soil_layers"] = []
+                 if "project_info" not in result:
+                     result["project_info"] = {}
+
+                 # Validate and enhance soil classification
+                 result = self.calculator.validate_soil_classification(result)
+
+                 # Enhance layers with calculated parameters
+                 if result["soil_layers"]:
+                     result["soil_layers"] = self.calculator.enhance_soil_layers(result["soil_layers"])
+
+                 # Process with SS/ST classification
+                 result = self.calculator.process_with_ss_st_classification(result)
+
+                 # Enforce 7-layer limit and clay/sand separation
+                 result["soil_layers"] = self._enforce_layer_grouping_rules(result["soil_layers"])
+
+                 return result
+
+             except json.JSONDecodeError as e:
+                 st.error(f"Failed to parse LLM response as JSON: {str(e)}")
+                 # Try to create a basic structure from the response
+                 return self._fallback_parse(content)
+
+         except Exception as e:
+             error_msg = str(e)
+
+             # Check for model availability error
+             if "not a valid model ID" in error_msg:
+                 st.error(f"❌ Model '{self.model}' is not available on OpenRouter")
+                 st.info("💡 Try switching to a different model in the sidebar (Claude-3.5 Sonnet or GPT-4 Turbo are recommended)")
+                 return {"error": f"Model not available: {self.model}"}
+             else:
+                 st.error(f"Error calling LLM API: {error_msg}")
+                 return {"error": error_msg}
+
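+     # Illustrative note on the extraction above: a response such as
+     #   ```json\n{"project_info": {...}, "soil_layers": [...]}\n```
+     # is reduced to the bare JSON object before json.loads is attempted.
+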
+     def _fallback_parse(self, content):
+         """Fallback parser when JSON parsing fails"""
+         try:
+             # Try to extract basic information using regex
+             layers = []
+
+             # Look for depth patterns like "0-2m", "2-5m", etc.
+             depth_pattern = r'(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)m?\s*[:|]?\s*([^,\n]+)'
+             matches = re.findall(depth_pattern, content, re.IGNORECASE)
+
+             for i, match in enumerate(matches):
+                 depth_from = float(match[0])
+                 depth_to = float(match[1])
+                 description = match[2].strip()
+
+                 # Extract soil type from description
+                 soil_type = "unknown"
+                 if "clay" in description.lower():
+                     if "soft" in description.lower():
+                         soil_type = "soft clay"
+                     elif "stiff" in description.lower():
+                         soil_type = "stiff clay"
+                     else:
+                         soil_type = "medium clay"
+                 elif "sand" in description.lower():
+                     if "loose" in description.lower():
+                         soil_type = "loose sand"
+                     elif "dense" in description.lower():
+                         soil_type = "dense sand"
+                     else:
+                         soil_type = "medium dense sand"
+
+                 layers.append({
+                     "layer_id": i + 1,
+                     "depth_from": depth_from,
+                     "depth_to": depth_to,
+                     "soil_type": soil_type,
+                     "description": description,
+                     "strength_parameter": "Su" if "clay" in soil_type else "SPT-N",
+                     "strength_value": 50,  # Default placeholder value
+                     "color": "unknown",
+                     "moisture": "unknown",
+                     "consistency": "unknown"
+                 })
+
+             return {
+                 "project_info": {
+                     "project_name": "Unknown",
+                     "boring_id": "Unknown",
+                     "location": "Unknown",
+                     "date": "Unknown",
+                     "depth_total": max([layer["depth_to"] for layer in layers]) if layers else 0
+                 },
+                 "soil_layers": layers,
+                 "water_table": {"depth": None, "date_encountered": None},
+                 "notes": "Parsed using fallback method - original response: " + content[:200] + "..."
+             }
+         except Exception as e:
+             return {"error": f"Fallback parsing failed: {str(e)}", "raw_response": content}
+
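+     # Illustrative: the fallback regex above matches lines such as
+     #   "0-2.5m: soft brown clay"
+     # yielding depth_from=0.0, depth_to=2.5, description="soft brown clay",
+     # which the keyword checks then classify as soil_type="soft clay".
+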
+     def _enforce_layer_grouping_rules(self, layers):
+         """Enforce 7-layer maximum and clay/sand separation rules"""
+
+         if not layers or len(layers) <= 7:
+             return layers
+
+         st.info(f"📊 Grouping layers: {len(layers)} layers found, grouping to meet 7-layer limit")
+
+         # Group similar adjacent layers to reduce the count to 7 or fewer
+         grouped_layers = []
+         i = 0
+
+         while i < len(layers) and len(grouped_layers) < 7:
+             current_layer = layers[i].copy()
+
+             # Check if we can group with the next layer
+             if i < len(layers) - 1 and len(grouped_layers) < 6:  # Leave room for at least one more layer
+                 next_layer = layers[i + 1]
+
+                 # Group if same soil type and similar consistency (but never clay with sand)
+                 can_group = (
+                     current_layer.get('soil_type') == next_layer.get('soil_type') and
+                     current_layer.get('consistency') == next_layer.get('consistency') and
+                     not (current_layer.get('soil_type') == 'clay' and next_layer.get('soil_type') == 'sand') and
+                     not (current_layer.get('soil_type') == 'sand' and next_layer.get('soil_type') == 'clay')
+                 )
+
+                 if can_group:
+                     # Merge the layers
+                     current_layer['depth_to'] = next_layer.get('depth_to', current_layer['depth_to'])
+                     current_layer['description'] = f"Grouped: {current_layer.get('description', '')} + {next_layer.get('description', '')}"
+
+                     # Average strength values
+                     curr_strength = current_layer.get('strength_value', 0) or 0
+                     next_strength = next_layer.get('strength_value', 0) or 0
+                     if curr_strength and next_strength:
+                         current_layer['strength_value'] = (curr_strength + next_strength) / 2
+                     elif next_strength:
+                         current_layer['strength_value'] = next_strength
+
+                     # Skip the next layer since it has been merged
+                     i += 2
+                 else:
+                     i += 1
+             else:
+                 i += 1
+
+             grouped_layers.append(current_layer)
+
+         # If still too many layers, fold the remaining similar layers into existing ones
+         if i < len(layers):
+             for remaining_layer in layers[i:]:
+                 # Find a compatible layer to merge with
+                 merged = False
+                 for existing_layer in grouped_layers:
+                     if (existing_layer.get('soil_type') == remaining_layer.get('soil_type') and
+                             existing_layer.get('consistency') == remaining_layer.get('consistency')):
+                         existing_layer['depth_to'] = max(existing_layer['depth_to'], remaining_layer.get('depth_to', 0))
+                         existing_layer['description'] += f" + {remaining_layer.get('description', '')}"
+                         merged = True
+                         break
+
+                 if not merged and len(grouped_layers) < 7:
+                     grouped_layers.append(remaining_layer)
+
+         # Update layer IDs
+         for idx, layer in enumerate(grouped_layers):
+             layer['layer_id'] = idx + 1
+
+         # Add a note about grouping
+         if len(grouped_layers) < len(layers):
+             st.success(f"✅ Grouped {len(layers)} layers into {len(grouped_layers)} layers (7-layer limit)")
+
+         return grouped_layers[:7]  # Ensure maximum 7 layers
+
+     def refine_soil_layers(self, soil_data, user_feedback):
+         """Refine soil layer interpretation based on user feedback"""
+
+         # Guard: fail fast when no API key/provider was configured
+         if self.client is None:
+             return {"error": "LLM client not configured"}
+
+         system_prompt = """You are an expert geotechnical engineer. The user has provided feedback on the initial soil boring log analysis.
+ Please refine the soil layer interpretation based on their input and return the updated JSON in the same format."""
+
+         messages = [
+             {"role": "system", "content": system_prompt},
+             {"role": "user", "content": f"Original analysis: {json.dumps(soil_data, indent=2)}"},
+             {"role": "user", "content": f"User feedback: {user_feedback}"}
+         ]
+
+         try:
+             response = self.client.chat.completions.create(
+                 model=self.model,
+                 messages=messages,
+                 max_tokens=2000,
+                 temperature=0.1
+             )
+
+             content = response.choices[0].message.content
+
+             try:
+                 if "```json" in content:
+                     json_start = content.find("```json") + 7
+                     json_end = content.find("```", json_start)
+                     json_str = content[json_start:json_end].strip()
+                 else:
+                     json_str = content
+
+                 return json.loads(json_str)
+             except json.JSONDecodeError:
+                 return {"error": "Invalid JSON response", "raw_response": content}
+
+         except Exception as e:
+             return {"error": str(e)}
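+
+ # Minimal usage sketch (illustrative only; not part of the app's entry point).
+ # The model ID, provider key, and boring-log text below are assumptions, and a
+ # real API key would be required for the call to succeed.
+ if __name__ == "__main__":
+     client = LLMClient(
+         model="anthropic/claude-3.5-sonnet",  # assumed model ID
+         api_key="sk-or-...",                  # placeholder key
+         provider="openrouter",                # assumed provider key in LLM_PROVIDERS
+     )
+     result = client.analyze_soil_boring_log(
+         text_content="SS-1 | 0-2.5m: Brown silty clay, N = 8\nST-2 | 2.5-5.0m: Gray clay, Su = 45 kPa"
+     )
+     print(json.dumps(result, indent=2))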
nearest_neighbor_grouping.py ADDED
@@ -0,0 +1,297 @@
+ import numpy as np
+ import pandas as pd
+ import streamlit as st
+ from sklearn.neighbors import NearestNeighbors
+ from sklearn.preprocessing import StandardScaler
+ from sklearn.cluster import DBSCAN
+ from sklearn.metrics.pairwise import euclidean_distances
+ from typing import List, Dict, Any, Tuple
+
+ class NearestNeighborGrouping:
+     def __init__(self):
+         self.scaler = StandardScaler()
+         self.feature_weights = {
+             'depth_mid': 0.05,            # Depth position (less important for similarity)
+             'thickness': 0.05,            # Layer thickness (less important)
+             'soil_type_encoded': 0.35,    # Soil type (most important)
+             'consistency_encoded': 0.30,  # Consistency/density (very important)
+             'strength_value': 0.15,       # Strength parameter
+             'moisture_encoded': 0.05,     # Moisture content
+             'color_encoded': 0.05         # Color
+         }
+
+     def encode_categorical_features(self, layers: List[Dict]) -> pd.DataFrame:
+         """Convert categorical features to numerical for clustering"""
+
+         # Create DataFrame from layers
+         df_data = []
+         for i, layer in enumerate(layers):
+             layer_data = {
+                 'layer_index': i,
+                 'layer_id': layer.get('layer_id', i + 1),
+                 'depth_from': layer.get('depth_from', 0),
+                 'depth_to': layer.get('depth_to', 0),
+                 'depth_mid': (layer.get('depth_from', 0) + layer.get('depth_to', 0)) / 2,
+                 'thickness': layer.get('depth_to', 0) - layer.get('depth_from', 0),
+                 'soil_type': layer.get('soil_type', 'unknown').lower(),
+                 'consistency': layer.get('consistency', 'unknown').lower(),
+                 'strength_value': layer.get('strength_value', 0) or layer.get('calculated_su', 0) or 0,
+                 'moisture': layer.get('moisture', 'unknown').lower(),
+                 'color': layer.get('color', 'unknown').lower(),
+                 'description': layer.get('description', '')
+             }
+             df_data.append(layer_data)
+
+         df = pd.DataFrame(df_data)
+
+         # Encode soil types
+         soil_type_mapping = {
+             'clay': 1, 'silt': 2, 'sand': 3, 'gravel': 4, 'rock': 5, 'unknown': 0
+         }
+         df['soil_type_encoded'] = df['soil_type'].map(soil_type_mapping).fillna(0)
+
+         # Encode consistency/density
+         consistency_mapping = {
+             'very soft': 1, 'soft': 2, 'medium': 3, 'stiff': 4, 'very stiff': 5, 'hard': 6,
+             'very loose': 1, 'loose': 2, 'medium dense': 3, 'dense': 4, 'very dense': 5,
+             'unknown': 0
+         }
+         df['consistency_encoded'] = df['consistency'].map(consistency_mapping).fillna(0)
+
+         # Encode moisture
+         moisture_mapping = {
+             'dry': 1, 'moist': 2, 'wet': 3, 'saturated': 4, 'unknown': 0
+         }
+         df['moisture_encoded'] = df['moisture'].map(moisture_mapping).fillna(0)
+
+         # Encode colors (simplified)
+         color_mapping = {
+             'brown': 1, 'gray': 2, 'black': 3, 'red': 4, 'yellow': 5, 'white': 6, 'unknown': 0
+         }
+         df['color_encoded'] = df['color'].map(color_mapping).fillna(0)
+
+         return df
+
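+     # Illustrative encoding (values follow the mappings above): a layer such as
+     #   {"soil_type": "clay", "consistency": "stiff", "moisture": "moist", "color": "gray"}
+     # becomes soil_type_encoded=1, consistency_encoded=4, moisture_encoded=2, color_encoded=2.
+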
+     def calculate_layer_similarity(self, df: pd.DataFrame) -> Tuple[np.ndarray, np.ndarray]:
+         """Calculate the similarity matrix between layers using weighted features;
+         returns (similarity_matrix, scaled_feature_matrix)"""
+
+         # Select features for similarity calculation
+         feature_columns = [
+             'depth_mid', 'thickness', 'soil_type_encoded',
+             'consistency_encoded', 'strength_value', 'moisture_encoded', 'color_encoded'
+         ]
+
+         # Prepare feature matrix
+         features = df[feature_columns].copy()
+
+         # Handle missing values
+         features = features.fillna(0)
+
+         # Apply feature weights
+         for col in feature_columns:
+             if col in self.feature_weights:
+                 features[col] = features[col] * self.feature_weights[col]
+
+         # Standardize features
+         features_scaled = self.scaler.fit_transform(features)
+
+         # Convert pairwise euclidean distance to a similarity in (0, 1]
+         distance_matrix = euclidean_distances(features_scaled)
+         similarity_matrix = 1 / (1 + distance_matrix)
+
+         return similarity_matrix, features_scaled
+
+     def find_nearest_neighbors(self, df: pd.DataFrame, k: int = 3) -> List[Dict]:
+         """Find k nearest neighbors for each soil layer"""
+
+         similarity_matrix, features_scaled = self.calculate_layer_similarity(df)
+
+         # Use NearestNeighbors to find the k nearest neighbors
+         nn_model = NearestNeighbors(n_neighbors=min(k + 1, len(df)), metric='euclidean')
+         nn_model.fit(features_scaled)
+
+         distances, indices = nn_model.kneighbors(features_scaled)
+
+         nearest_neighbors = []
+         for i, (layer_distances, layer_indices) in enumerate(zip(distances, indices)):
+             neighbors = []
+             for dist, idx in zip(layer_distances[1:], layer_indices[1:]):  # Skip self
+                 neighbor_info = {
+                     'neighbor_index': int(idx),
+                     'neighbor_id': df.iloc[idx]['layer_id'],
+                     'distance': float(dist),
+                     'similarity_score': float(similarity_matrix[i, idx]),
+                     'soil_type': df.iloc[idx]['soil_type'],
+                     'consistency': df.iloc[idx]['consistency'],
+                     'depth_range': f"{df.iloc[idx]['depth_from']:.1f}-{df.iloc[idx]['depth_to']:.1f}m"
+                 }
+                 neighbors.append(neighbor_info)
+
+             layer_nn = {
+                 'layer_index': i,
+                 'layer_id': df.iloc[i]['layer_id'],
+                 'soil_type': df.iloc[i]['soil_type'],
+                 'consistency': df.iloc[i]['consistency'],
+                 'depth_range': f"{df.iloc[i]['depth_from']:.1f}-{df.iloc[i]['depth_to']:.1f}m",
+                 'nearest_neighbors': neighbors
+             }
+             nearest_neighbors.append(layer_nn)
+
+         return nearest_neighbors
+
+     def group_similar_layers(self, df: pd.DataFrame, similarity_threshold: float = 0.7) -> Tuple[List[List[int]], np.ndarray]:
+         """Group layers using DBSCAN clustering based on similarity;
+         returns (multi-layer groups, per-layer cluster labels)"""
+
+         similarity_matrix, features_scaled = self.calculate_layer_similarity(df)
+
+         # Convert similarity to distance for DBSCAN
+         distance_matrix = 1 - similarity_matrix
+
+         # Use DBSCAN for clustering
+         eps = 1 - similarity_threshold  # Convert similarity threshold to distance
+         clustering = DBSCAN(eps=eps, min_samples=1, metric='precomputed')
+         cluster_labels = clustering.fit_predict(distance_matrix)
+
+         # Group layers by cluster
+         clusters = {}
+         for i, label in enumerate(cluster_labels):
+             if label not in clusters:
+                 clusters[label] = []
+             clusters[label].append(i)
+
+         # Convert to a list of groups, keeping only multi-layer groups
+         layer_groups = []
+         for cluster_id, layer_indices in clusters.items():
+             if len(layer_indices) > 1:
+                 layer_groups.append(layer_indices)
+
+         return layer_groups, cluster_labels
+
+     def analyze_group_properties(self, df: pd.DataFrame, group_indices: List[int]) -> Dict:
+         """Analyze properties of a group of similar layers"""
+
+         group_layers = df.iloc[group_indices]
+
+         analysis = {
+             'group_size': len(group_indices),
+             'depth_range': {
+                 'min': group_layers['depth_from'].min(),
+                 'max': group_layers['depth_to'].max(),
+                 'total_thickness': group_layers['thickness'].sum()
+             },
+             'soil_types': group_layers['soil_type'].value_counts().to_dict(),
+             'consistencies': group_layers['consistency'].value_counts().to_dict(),
+             'strength_stats': {
+                 'mean': group_layers['strength_value'].mean(),
+                 'min': group_layers['strength_value'].min(),
+                 'max': group_layers['strength_value'].max(),
+                 'std': group_layers['strength_value'].std()
+             },
+             'layer_ids': group_layers['layer_id'].tolist(),
+             'depth_ranges': [f"{row['depth_from']:.1f}-{row['depth_to']:.1f}m"
+                              for _, row in group_layers.iterrows()]
+         }
+
+         return analysis
+
+     def suggest_layer_merging(self, layers: List[Dict], similarity_threshold: float = 0.8) -> Dict:
+         """Suggest which layers should be merged based on nearest neighbor analysis"""
+
+         if len(layers) < 2:
+             return {"groups": [], "recommendations": []}
+
+         # Encode features
+         df = self.encode_categorical_features(layers)
+
+         # Find similar layer groups
+         layer_groups, cluster_labels = self.group_similar_layers(df, similarity_threshold)
+
+         # Analyze each group
+         group_analyses = []
+         recommendations = []
+
+         for i, group_indices in enumerate(layer_groups):
+             group_analysis = self.analyze_group_properties(df, group_indices)
+             group_analysis['group_id'] = i + 1
+             group_analyses.append(group_analysis)
+
+             # Check if the layers are adjacent or close
+             group_df = df.iloc[group_indices].sort_values('depth_from')
+             is_adjacent = self._check_adjacency(group_df)
+
+             if is_adjacent:
+                 dominant_soil_type = max(group_analysis['soil_types'].items(), key=lambda x: x[1])[0]
+                 dominant_consistency = max(group_analysis['consistencies'].items(), key=lambda x: x[1])[0]
+
+                 recommendation = {
+                     'group_id': i + 1,
+                     'action': 'merge',
+                     'reason': f'Similar {dominant_consistency} {dominant_soil_type} layers in adjacent depths',
+                     'layer_ids': group_analysis['layer_ids'],
+                     'depth_ranges': group_analysis['depth_ranges'],
+                     'merged_properties': {
+                         'soil_type': dominant_soil_type,
+                         'consistency': dominant_consistency,
+                         'depth_from': group_analysis['depth_range']['min'],
+                         'depth_to': group_analysis['depth_range']['max'],
+                         'thickness': group_analysis['depth_range']['total_thickness'],
+                         'avg_strength': group_analysis['strength_stats']['mean']
+                     }
+                 }
+                 recommendations.append(recommendation)
+
+         return {
+             'groups': group_analyses,
+             'recommendations': recommendations,
+             'cluster_labels': cluster_labels.tolist()
+         }
+
+     def _check_adjacency(self, group_df: pd.DataFrame, max_gap: float = 0.5) -> bool:
+         """Check if the layers in a group are adjacent or nearly adjacent"""
+
+         if len(group_df) <= 1:
+             return True
+
+         # Sort by depth
+         sorted_df = group_df.sort_values('depth_from')
+
+         # Check gaps between consecutive layers
+         for i in range(len(sorted_df) - 1):
+             current_end = sorted_df.iloc[i]['depth_to']
+             next_start = sorted_df.iloc[i + 1]['depth_from']
+             gap = next_start - current_end
+
+             if gap > max_gap:
+                 return False
+
+         return True
+
+     def get_layer_neighbors_report(self, layers: List[Dict], k: int = 3) -> str:
+         """Generate a detailed report of the nearest neighbors for each layer"""
+
+         if len(layers) < 2:
+             return "Insufficient layers for neighbor analysis."
+
+         df = self.encode_categorical_features(layers)
+         nearest_neighbors = self.find_nearest_neighbors(df, k)
+
+         report_lines = [
+             "NEAREST NEIGHBOR ANALYSIS REPORT",
+             "=" * 50,
+             ""
+         ]
+
+         for layer_info in nearest_neighbors:
+             report_lines.append(f"Layer {layer_info['layer_id']}: {layer_info['consistency']} {layer_info['soil_type']} ({layer_info['depth_range']})")
+             report_lines.append("  Nearest Neighbors:")
+
+             for i, neighbor in enumerate(layer_info['nearest_neighbors'][:k], 1):
+                 similarity_pct = neighbor['similarity_score'] * 100
+                 report_lines.append(
+                     f"    {i}. Layer {neighbor['neighbor_id']}: {neighbor['consistency']} {neighbor['soil_type']} "
+                     f"({neighbor['depth_range']}) - Similarity: {similarity_pct:.1f}%"
+                 )
+
+             report_lines.append("")
+
+         return "\n".join(report_lines)
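+
+ # Minimal usage sketch (illustrative; the two layers below are made up):
+ if __name__ == "__main__":
+     demo_layers = [
+         {"layer_id": 1, "depth_from": 0.0, "depth_to": 2.0, "soil_type": "clay",
+          "consistency": "soft", "strength_value": 30, "color": "brown", "moisture": "moist"},
+         {"layer_id": 2, "depth_from": 2.0, "depth_to": 4.0, "soil_type": "clay",
+          "consistency": "soft", "strength_value": 35, "color": "brown", "moisture": "moist"},
+     ]
+     nng = NearestNeighborGrouping()
+     print(nng.get_layer_neighbors_report(demo_layers, k=1))
+     print(nng.suggest_layer_merging(demo_layers, similarity_threshold=0.8))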
packages.txt ADDED
@@ -0,0 +1,4 @@
+ poppler-utils
+ tesseract-ocr
+ libgl1-mesa-glx
+ libglib2.0-0
requirements.txt CHANGED
@@ -1,3 +1,19 @@
- altair
- pandas
- streamlit
+ streamlit>=1.28.0
+ openai>=1.3.0
+ PyPDF2>=3.0.0
+ pdf2image>=1.16.0
+ Pillow>=10.0.0
+ matplotlib>=3.8.0
+ plotly>=5.17.0
+ pandas>=2.1.0
+ numpy>=1.24.0
+ langgraph>=0.0.20
+ langchain>=0.1.0
+ langchain-core>=0.1.0
+ langchain-openai>=0.0.5
+ python-dotenv>=1.0.0
+ scikit-learn>=1.3.0
+ crewai>=0.22.0
+ crewai-tools>=0.4.0
+ typing-extensions>=4.8.0
+ pydantic>=2.0.0
soil_analyzer.py ADDED
@@ -0,0 +1,305 @@
+ import numpy as np
+ import streamlit as st
+ from typing import List, Dict, Any
+ from nearest_neighbor_grouping import NearestNeighborGrouping
+
+ class SoilLayerAnalyzer:
+     def __init__(self):
+         self.consistency_mapping = {
+             "soft": 1, "loose": 1,
+             "medium": 2, "medium dense": 2,
+             "stiff": 3, "dense": 3,
+             "very stiff": 4, "very dense": 4,
+             "hard": 5
+         }
+         self.nn_grouping = NearestNeighborGrouping()
+
+     def validate_layer_continuity(self, layers: List[Dict]) -> List[Dict]:
+         """Validate and fix layer depth continuity"""
+         if not layers:
+             return layers
+
+         # Sort layers by depth_from
+         sorted_layers = sorted(layers, key=lambda x: x.get("depth_from", 0))
+
+         validated_layers = []
+         for i, layer in enumerate(sorted_layers):
+             if i == 0:
+                 # First layer starts from the ground surface
+                 layer["depth_from"] = 0
+             else:
+                 # Each layer starts where the previous one ends
+                 layer["depth_from"] = validated_layers[-1]["depth_to"]
+
+             validated_layers.append(layer)
+
+         return validated_layers
+
+     def identify_similar_layers(self, layers: List[Dict], similarity_threshold: float = 0.8) -> List[List[int]]:
+         """Identify layers that could potentially be grouped together"""
+         similar_groups = []
+
+         for i, layer1 in enumerate(layers):
+             for j, layer2 in enumerate(layers[i + 1:], i + 1):
+                 similarity_score = self._calculate_layer_similarity(layer1, layer2)
+
+                 if similarity_score >= similarity_threshold:
+                     # Check if either layer is already in a group
+                     group_found = False
+                     for group in similar_groups:
+                         if i in group:
+                             if j not in group:
+                                 group.append(j)
+                             group_found = True
+                             break
+                         elif j in group:
+                             if i not in group:
+                                 group.append(i)
+                             group_found = True
+                             break
+
+                     if not group_found:
+                         similar_groups.append([i, j])
+
+         return similar_groups
+
+     def _calculate_layer_similarity(self, layer1: Dict, layer2: Dict) -> float:
+         """Calculate a weighted similarity score between two layers.
+         Every applicable criterion adds to the total weight whether or not it
+         matches, so mismatches actually lower the score."""
+         score = 0.0
+         total_weight = 0.0
+
+         # Soil type similarity (weight: 0.4)
+         total_weight += 0.4
+         if layer1.get("soil_type", "").lower() == layer2.get("soil_type", "").lower():
+             score += 0.4
+
+         # Strength parameter similarity (weight: 0.3), only when both values exist
+         strength1 = layer1.get("strength_value")
+         strength2 = layer2.get("strength_value")
+         if strength1 is not None and strength2 is not None and max(strength1, strength2) > 0:
+             total_weight += 0.3
+             if abs(strength1 - strength2) / max(strength1, strength2) < 0.3:
+                 score += 0.3
+
+         # Consistency similarity (weight: 0.2)
+         total_weight += 0.2
+         consistency1 = self._extract_consistency(layer1.get("soil_type", ""))
+         consistency2 = self._extract_consistency(layer2.get("soil_type", ""))
+         if consistency1 == consistency2:
+             score += 0.2
+
+         # Color similarity (weight: 0.1)
+         total_weight += 0.1
+         color1 = layer1.get("color") or ""
+         color2 = layer2.get("color") or ""
+         if color1.lower() == color2.lower():
+             score += 0.1
+
+         return score / total_weight if total_weight > 0 else 0.0
+
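+     # Worked example (illustrative): two "soft clay" layers with strengths 40 and
+     # 50 kPa (|40-50|/50 = 0.2 < 0.3) and matching color score
+     # (0.4 + 0.3 + 0.2 + 0.1) / 1.0 = 1.0; if only the colors differ, the score
+     # drops to 0.9 / 1.0 = 0.9.
+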
+     def _extract_consistency(self, soil_type: str) -> str:
+         """Extract consistency from a soil type description"""
+         soil_type_lower = soil_type.lower()
+         # Check longer terms first so "very stiff" is not matched as "stiff"
+         for consistency in sorted(self.consistency_mapping, key=len, reverse=True):
+             if consistency in soil_type_lower:
+                 return consistency
+         return ""
+
+     def suggest_layer_merging(self, layers: List[Dict]) -> Dict[str, Any]:
+         """Suggest which layers could be merged"""
+         similar_groups = self.identify_similar_layers(layers)
+         suggestions = []
+
+         for group in similar_groups:
+             if len(group) >= 2:
+                 group_layers = [layers[i] for i in group]
+
+                 # Check if the layers are adjacent or close
+                 depths = [(layer["depth_from"], layer["depth_to"]) for layer in group_layers]
+                 depths.sort()
+
+                 # Check for adjacency
+                 is_adjacent = True
+                 for i in range(len(depths) - 1):
+                     if abs(depths[i][1] - depths[i + 1][0]) > 0.5:  # 0.5 m tolerance
+                         is_adjacent = False
+                         break
+
+                 if is_adjacent:
+                     suggestions.append({
+                         "layer_indices": group,
+                         "reason": "Similar soil properties and adjacent depths",
+                         "merged_layer": self._create_merged_layer(group_layers)
+                     })
+
+         return {"suggestions": suggestions}
+
+     def _create_merged_layer(self, layers: List[Dict]) -> Dict:
+         """Create a merged layer from multiple similar layers"""
+         if not layers:
+             return {}
+
+         merged = {
+             "layer_id": f"merged_{layers[0]['layer_id']}_{layers[-1]['layer_id']}",
+             "depth_from": min(layer["depth_from"] for layer in layers),
+             "depth_to": max(layer["depth_to"] for layer in layers),
+             "soil_type": layers[0]["soil_type"],  # Use the first layer's type
+             "description": f"Merged layer: {', '.join([layer.get('description', '') for layer in layers])}",
+             "strength_parameter": layers[0].get("strength_parameter", ""),
+             "strength_value": np.mean([layer.get("strength_value", 0) for layer in layers if layer.get("strength_value") is not None]),
+             "color": layers[0].get("color", ""),
+             "moisture": layers[0].get("moisture", ""),
+             "consistency": layers[0].get("consistency", "")
+         }
+
+         return merged
+
+     def suggest_layer_splitting(self, layers: List[Dict]) -> Dict[str, Any]:
+         """Suggest which layers should be split based on thickness and variability"""
+         suggestions = []
+
+         for i, layer in enumerate(layers):
+             thickness = layer["depth_to"] - layer["depth_from"]
+
+             # Suggest splitting very thick layers (>5 m)
+             if thickness > 5.0:
+                 suggested_splits = int(thickness / 2.5)  # Split into ~2.5 m sublayers
+
+                 suggestions.append({
+                     "layer_index": i,
+                     "reason": f"Layer is very thick ({thickness:.1f}m) - consider splitting into {suggested_splits} sublayers",
+                     "suggested_depths": np.linspace(layer["depth_from"], layer["depth_to"], suggested_splits + 1).tolist()
+                 })
+
+             # Check for indications of significant strength variation
+             description = layer.get("description", "").lower()
+             if any(word in description for word in ["varying", "variable", "interbedded", "alternating"]):
+                 suggestions.append({
+                     "layer_index": i,
+                     "reason": "Description indicates variable conditions - consider splitting based on detailed log",
+                     "suggested_depths": [layer["depth_from"], (layer["depth_from"] + layer["depth_to"]) / 2, layer["depth_to"]]
+                 })
+
+         return {"suggestions": suggestions}
+
+     def optimize_layer_division(self, layers: List[Dict], merge_similar=True, split_thick=True) -> Dict[str, Any]:
+         """Optimize layer division by merging similar layers and splitting thick ones"""
+         optimized_layers = layers.copy()
+         changes_made = []
+
+         # Traditional merge suggestions
+         merge_suggestions = {"suggestions": []}
+         if merge_similar:
+             merge_suggestions = self.suggest_layer_merging(optimized_layers)
+             for suggestion in merge_suggestions["suggestions"]:
+                 changes_made.append(f"Merged layers {suggestion['layer_indices']}: {suggestion['reason']}")
+
+         # Nearest neighbor analysis
+         nn_analysis = self.analyze_nearest_neighbors(optimized_layers)
+
+         # Split suggestions
+         split_suggestions = {"suggestions": []}
+         if split_thick:
+             split_suggestions = self.suggest_layer_splitting(optimized_layers)
+             for suggestion in split_suggestions["suggestions"]:
+                 changes_made.append(f"Suggested splitting layer {suggestion['layer_index']}: {suggestion['reason']}")
+
+         return {
+             "optimized_layers": optimized_layers,
+             "changes_made": changes_made,
+             "merge_suggestions": merge_suggestions,
+             "split_suggestions": split_suggestions,
+             "nearest_neighbor_analysis": nn_analysis
+         }
+
+     def analyze_nearest_neighbors(self, layers: List[Dict], k: int = 3, similarity_threshold: float = 0.55) -> Dict[str, Any]:
+         """Perform nearest neighbor analysis on soil layers"""
+
+         if len(layers) < 2:
+             return {"message": "Insufficient layers for neighbor analysis"}
+
+         try:
+             # Get nearest neighbor analysis
+             nn_suggestions = self.nn_grouping.suggest_layer_merging(layers, similarity_threshold)
+
+             # Get detailed neighbor report
+             neighbor_report = self.nn_grouping.get_layer_neighbors_report(layers, k)
+
+             return {
+                 "neighbor_groups": nn_suggestions.get("groups", []),
+                 "merge_recommendations": nn_suggestions.get("recommendations", []),
+                 "cluster_labels": nn_suggestions.get("cluster_labels", []),
+                 "neighbor_report": neighbor_report,
+                 "analysis_parameters": {
+                     "similarity_threshold": similarity_threshold,
+                     "k_neighbors": k,
+                     "total_layers": len(layers)
+                 }
+             }
+
+         except Exception as e:
+             st.error(f"Error in nearest neighbor analysis: {str(e)}")
+             return {"error": str(e)}
+
+     def get_grouping_summary(self, layers: List[Dict]) -> Dict[str, Any]:
+         """Get a comprehensive summary of the layer grouping analysis"""
+
+         nn_analysis = self.analyze_nearest_neighbors(layers)
+
+         if "error" in nn_analysis:
+             return nn_analysis
+
+         summary = {
+             "total_layers": len(layers),
+             "identified_groups": len(nn_analysis.get("neighbor_groups", [])),
+             "merge_recommendations": len(nn_analysis.get("merge_recommendations", [])),
+             "group_details": []
+         }
+
+         # Add details for each group
+         for i, group in enumerate(nn_analysis.get("neighbor_groups", [])):
+             group_detail = {
+                 "group_id": group.get("group_id", i + 1),
+                 "layers_in_group": group.get("group_size", 0),
+                 "depth_range": f"{group.get('depth_range', {}).get('min', 0):.1f}-{group.get('depth_range', {}).get('max', 0):.1f}m",
+                 "total_thickness": group.get('depth_range', {}).get('total_thickness', 0),
+                 "dominant_soil_type": max(group.get('soil_types', {}).items(), key=lambda x: x[1])[0] if group.get('soil_types') else "unknown",
+                 "layer_ids": group.get("layer_ids", [])
+             }
+             summary["group_details"].append(group_detail)
+
+         return summary
+
+     def calculate_layer_statistics(self, layers: List[Dict]) -> Dict[str, Any]:
+         """Calculate statistics for the soil profile"""
+         if not layers:
+             return {}
+
+         total_depth = max(layer["depth_to"] for layer in layers)
+         layer_count = len(layers)
+
+         # Soil type distribution by thickness
+         soil_types = {}
+         for layer in layers:
+             soil_type = layer.get("soil_type", "unknown")
+             thickness = layer["depth_to"] - layer["depth_from"]
+             if soil_type in soil_types:
+                 soil_types[soil_type] += thickness
+             else:
+                 soil_types[soil_type] = thickness
+
+         # Convert to percentages
+         soil_type_percentages = {k: (v / total_depth) * 100 for k, v in soil_types.items()}
+
+         # Average layer thickness
+         thicknesses = [layer["depth_to"] - layer["depth_from"] for layer in layers]
+         avg_thickness = np.mean(thicknesses)
+
+         return {
+             "total_depth": total_depth,
+             "layer_count": layer_count,
+             "average_layer_thickness": avg_thickness,
+             "soil_type_distribution": soil_type_percentages,
+             "thickest_layer": max(thicknesses),
+             "thinnest_layer": min(thicknesses)
+         }
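+
+ # Minimal usage sketch (illustrative; the layers below are made up):
+ if __name__ == "__main__":
+     analyzer = SoilLayerAnalyzer()
+     layers = [
+         {"layer_id": 1, "depth_from": 0.0, "depth_to": 3.0, "soil_type": "soft clay",
+          "strength_value": 30, "color": "brown", "description": "soft brown clay"},
+         {"layer_id": 2, "depth_from": 3.0, "depth_to": 9.0, "soil_type": "medium sand",
+          "strength_value": 15, "color": "gray", "description": "medium gray sand"},
+     ]
+     print(analyzer.calculate_layer_statistics(layers))
+     print(analyzer.suggest_layer_splitting(layers))  # flags the 6 m sand layer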
soil_boring_analyzer_hf_ready.zip ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0945b4363e97b930f6cbaa9913888b5d63329830f6a6c05d3aa18be194e92f3d
+ size 69885
soil_calculations.py ADDED
@@ -0,0 +1,350 @@
+ import numpy as np
+ import re
+ import streamlit as st
+ from typing import Dict, List, Any, Tuple
+
+ class SoilCalculations:
+     def __init__(self):
+         # Peck correlation coefficients for friction angle calculation
+         self.peck_coefficients = {
+             "fine_sand": {"a": 27.1, "b": 0.3},
+             "medium_sand": {"a": 27.1, "b": 0.3},
+             "coarse_sand": {"a": 27.1, "b": 0.3},
+             "silty_sand": {"a": 25.4, "b": 0.3},
+             "clayey_sand": {"a": 25.4, "b": 0.3}
+         }
+
+     def calculate_su_from_n(self, n_value: float, correlation_factor: float = 5.0) -> float:
+         """Calculate undrained shear strength from SPT-N value for clay
+         Su = correlation_factor * N (typically 5-7 for clay)"""
+         if n_value is None or n_value <= 0:
+             return None
+         return correlation_factor * n_value
+
+     def calculate_friction_angle_peck(self, n_value: float, sand_type: str = "medium_sand",
+                                       effective_stress: float = 100.0) -> float:
+         """Calculate friction angle using the Peck, Hanson & Thornburn correlation:
+         φ ≈ a + b·N1(60) - 0.00054·N1(60)², with a = 27.1 and b = 0.3 for clean sand"""
+         if n_value is None or n_value <= 0:
+             return None
+
+         # Overburden correction (Liao & Whitman form): CN = (100 / σ'v) ** 0.5,
+         # so the corrected N decreases with increasing effective stress
+         n60_corrected = n_value * (100.0 / max(effective_stress, 1.0)) ** 0.5
+         n60_corrected = min(n60_corrected, 50)  # Cap at a reasonable value
+
+         coeffs = self.peck_coefficients.get(sand_type, self.peck_coefficients["medium_sand"])
+         friction_angle = coeffs["a"] + coeffs["b"] * n60_corrected - 0.00054 * n60_corrected ** 2
+
+         return min(friction_angle, 45)  # Cap at a reasonable maximum
+
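+     # Worked example (illustrative, using the correlation above): N = 20 at
+     # σ'v = 100 kPa gives N1(60) = 20 × (100/100)**0.5 = 20 and
+     # φ ≈ 27.1 + 0.3×20 - 0.00054×20² ≈ 32.9°.
+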
+     def classify_soil_consistency(self, soil_type: str, n_value: float = None, su_value: float = None) -> str:
+         """Classify soil consistency based on strength parameters"""
+
+         if "clay" in soil_type.lower() or "silt" in soil_type.lower():
+             # Use Su for clay classification
+             if su_value is not None:
+                 if su_value < 25:
+                     return "very soft"
+                 elif su_value < 50:
+                     return "soft"
+                 elif su_value < 100:
+                     return "medium"
+                 elif su_value < 200:
+                     return "stiff"
+                 elif su_value < 400:
+                     return "very stiff"
+                 else:
+                     return "hard"
+             # Use the N-value for clay if Su is not available
+             elif n_value is not None:
+                 if n_value < 2:
+                     return "very soft"
+                 elif n_value < 4:
+                     return "soft"
+                 elif n_value < 8:
+                     return "medium"
+                 elif n_value < 15:
+                     return "stiff"
+                 elif n_value < 30:
+                     return "very stiff"
+                 else:
+                     return "hard"
+
+         elif "sand" in soil_type.lower() or "gravel" in soil_type.lower():
+             # Use the N-value for sand classification
+             if n_value is not None:
+                 if n_value < 4:
+                     return "very loose"
+                 elif n_value < 10:
+                     return "loose"
+                 elif n_value < 30:
+                     return "medium dense"
+                 elif n_value < 50:
+                     return "dense"
+                 else:
+                     return "very dense"
+
+         return "unknown"
+
+     def standardize_units(self, text: str) -> Tuple[str, Dict[str, str]]:
+         """Standardize units in soil boring log text before LLM processing"""
+
+         unit_conversions = {}
+         standardized_text = text
+
+         def convert(pattern: str, factor: float, unit_suffix: str):
+             # Replace each full match (number plus unit, including any spacing),
+             # so conversions are not silently skipped when a space separates them
+             nonlocal standardized_text
+             for m in list(re.finditer(pattern, standardized_text, re.IGNORECASE)):
+                 value = float(m.group(1))
+                 new_text = f"{value * factor:.1f}{unit_suffix}"
+                 standardized_text = standardized_text.replace(m.group(0), new_text)
+                 unit_conversions[m.group(0)] = new_text
+
+         # Standardize depth ranges first, so "5 - 10 ft" becomes "1.5-3.0m"
+         # instead of being half-converted by the single-value feet rule below
+         depth_pattern = r'(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)\s*(?:ft|feet|\')'
+         standardized_text = re.sub(depth_pattern,
+                                    lambda m: f"{float(m.group(1))*0.3048:.1f}-{float(m.group(2))*0.3048:.1f}m",
+                                    standardized_text, flags=re.IGNORECASE)
+
+         # Convert feet to meters
+         convert(r'(\d+(?:\.\d+)?)\s*(?:ft|feet|\')', 0.3048, "m")
+         # Convert psf to kPa
+         convert(r'(\d+(?:\.\d+)?)\s*(?:psf|lbs?/ft²?)', 0.047880259, "kPa")
+         # Convert psi to kPa
+         convert(r'(\d+(?:\.\d+)?)\s*(?:psi|lbs?/in²?)', 6.89476, "kPa")
+         # Convert ksc (kg/cm²) to kPa
+         convert(r'(\d+(?:\.\d+)?)\s*(?:ksc|kg/cm²?|kg/cm2)', 98.0, "kPa")
+         # Convert t/m² (tonnes per square meter) to kPa
+         convert(r'(\d+(?:\.\d+)?)\s*(?:t/m²?|ton/m²?|tonnes?/m²?)', 9.81, "kPa")
+
+         return standardized_text, unit_conversions
+
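+     # Illustrative behavior of the method above:
+     #   standardize_units("Su = 2.5 ksc at 16.4 ft")
+     # returns ("Su = 245.0kPa at 5.0m",
+     #          {"2.5 ksc": "245.0kPa", "16.4 ft": "5.0m"}).
+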
158
+ def enhance_soil_layers(self, soil_layers: List[Dict]) -> List[Dict]:
159
+ """Enhance soil layers with calculated parameters"""
160
+
161
+ enhanced_layers = []
162
+
163
+ for layer in soil_layers:
164
+ enhanced_layer = layer.copy()
165
+
166
+ # Extract values
167
+ n_value = layer.get("strength_value") if layer.get("strength_parameter") == "SPT-N" else None
168
+ su_value = layer.get("strength_value") if layer.get("strength_parameter") == "Su" else None
169
+ soil_type = layer.get("soil_type", "").lower()
170
+ depth_from = layer.get("depth_from", 0)
171
+ depth_to = layer.get("depth_to", 0)
172
+ sample_type = layer.get("sample_type", "")
173
+ su_source = layer.get("su_source", "")
174
+
175
+ # CRITICAL RULE: For SS samples, ALWAYS use Su=5*N calculation, IGNORE unconfined compression Su
176
+ if sample_type == "SS" and "clay" in soil_type:
177
+ # For SS samples, we MUST use N-value to calculate Su, regardless of any other Su data
178
+ if n_value is not None:
179
+ calculated_su = self.calculate_su_from_n(n_value)
180
+ enhanced_layer["strength_parameter"] = "Su"
181
+ enhanced_layer["strength_value"] = calculated_su
182
+ enhanced_layer["su_source"] = f"SS Sample: Calculated from raw N={n_value} (Su=5*N)"
183
+ enhanced_layer["original_spt"] = n_value
184
+
185
+ # Override any existing Su values for SS samples
186
+ if su_value is not None and su_value != calculated_su:
187
+ enhanced_layer["ignored_unconfined_su"] = su_value
188
+ st.warning(f"⚠️ SS Sample Layer {enhanced_layer.get('layer_id', 'Unknown')}: Ignored unconfined Su={su_value:.0f}, using calculated Su={calculated_su:.0f} kPa from N={n_value}")
189
+
190
+ st.success(f"βœ… SS Sample Layer {enhanced_layer.get('layer_id', 'Unknown')}: Su = 5 Γ— {n_value} = {calculated_su:.0f} kPa")
191
+ else:
192
+ st.error(f"❌ SS Sample Layer {enhanced_layer.get('layer_id', 'Unknown')}: No N-value found for Su calculation")
193
+
194
+ # For ST samples, preserve direct Su measurements
195
+ elif sample_type == "ST" and su_value is not None:
196
+ enhanced_layer["su_source"] = su_source or "ST Sample: Direct measurement from Unconfined Compression Test"
197
+ st.success(f"βœ… ST Sample Layer {enhanced_layer.get('layer_id', 'Unknown')}: Using direct Su={su_value:.0f} kPa")
198
+
199
+ # For other cases (no sample type specified), use previous logic but prioritize sample identification
200
+ else:
201
+ # Try to identify sample type from available data
202
+ if n_value is not None and su_value is None and "clay" in soil_type:
203
+ # Only calculate Su from N-value if no direct Su available (likely SS sample)
204
+ calculated_su = self.calculate_su_from_n(n_value)
205
+ enhanced_layer["calculated_su"] = calculated_su
206
+ enhanced_layer["su_source"] = f"Calculated from N={n_value} (Su=5*N) - assumed SS sample"
207
+ st.info(f"πŸ”¬ Layer {enhanced_layer.get('layer_id', 'Unknown')}: Calculated Su={calculated_su:.0f} kPa from N={n_value} (assumed SS)")
208
+ elif su_value is not None:
209
+ # Preserve direct Su values (likely ST sample)
210
+ enhanced_layer["su_source"] = su_source or "Direct measurement - assumed ST sample"
211
+ st.success(f"βœ… Layer {enhanced_layer.get('layer_id', 'Unknown')}: Using direct Su={su_value:.0f} kPa (assumed ST)")
212
+
213
+ # Handle sand/silt friction angle calculation
214
+ if "sand" in soil_type and n_value is not None:
215
+ # Calculate friction angle for sand
216
+ mid_depth = (depth_from + depth_to) / 2
217
+ effective_stress = 20 * mid_depth # Approximate effective stress (kPa)
218
+
219
+ sand_type_classification = "medium_sand"
220
+ if "fine" in soil_type:
221
+ sand_type_classification = "fine_sand"
222
+ elif "coarse" in soil_type:
223
+ sand_type_classification = "coarse_sand"
224
+ elif "silt" in soil_type:
225
+ sand_type_classification = "silty_sand"
226
+
227
+ friction_angle = self.calculate_friction_angle_peck(
228
+ n_value, sand_type_classification, effective_stress
229
+ )
230
+ enhanced_layer["friction_angle"] = friction_angle
231
+ enhanced_layer["friction_angle_source"] = f"Peck method from raw N={n_value}"
232
+
233
+ if sample_type == "SS":
234
+ st.success(f"βœ… SS Sample Layer {enhanced_layer.get('layer_id', 'Unknown')}: Ο† = {friction_angle:.1f}Β° from N={n_value}")
235
+ else:
236
+ st.info(f"πŸ“Š Layer {enhanced_layer.get('layer_id', 'Unknown')}: Ο† = {friction_angle:.1f}Β° from N={n_value}")
237
+
238
+ # Update consistency classification
239
+ consistency = self.classify_soil_consistency(soil_type, n_value, su_value)
240
+ if consistency != "unknown":
241
+ enhanced_layer["consistency"] = consistency
242
+
243
+ # Keep soil_type as basic type (clay, sand, silt)
244
+ base_soil = "clay" if "clay" in soil_type else \
245
+ "sand" if "sand" in soil_type else \
246
+ "silt" if "silt" in soil_type else \
247
+ "gravel" if "gravel" in soil_type else soil_type
248
+
249
+ # Remove any existing consistency terms from soil_type
250
+ for consistency_term in ["very soft", "soft", "medium", "stiff", "very stiff", "hard",
251
+ "very loose", "loose", "medium dense", "dense", "very dense"]:
252
+ base_soil = base_soil.replace(consistency_term, "").strip()
253
+
254
+ enhanced_layer["soil_type"] = base_soil
255
+
256
+ enhanced_layers.append(enhanced_layer)
257
+
258
+ return enhanced_layers
259
+
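The SS override above is the crux of this method: for split-spoon samples in clay, Su is always recomputed as 5·N and any lab-reported Su is ignored. A minimal standalone sketch of that decision rule (plain Python, no Streamlit; field names are assumed to match the app's layer dicts):

```python
# Illustrative sketch only - mirrors the SS/ST decision above without Streamlit
def resolve_su(layer: dict):
    """Return (su_kpa, source_note) for a clay layer, or None if no usable data."""
    n = layer.get("strength_value") if layer.get("strength_parameter") == "SPT-N" else None
    su = layer.get("strength_value") if layer.get("strength_parameter") == "Su" else None
    if layer.get("sample_type") == "SS" and n is not None:
        # SS rule: calculated Su = 5*N always wins over any lab Su
        return 5.0 * n, f"calculated from N={n} (Su=5*N)"
    if su is not None:
        return su, "direct measurement (ST / unconfined compression)"
    return None

print(resolve_su({"sample_type": "SS", "strength_parameter": "SPT-N", "strength_value": 12}))
# (60.0, 'calculated from N=12 (Su=5*N)')
```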
260
+ def validate_soil_classification(self, soil_data: Dict) -> Dict:
261
+ """Validate and improve soil classification"""
262
+
263
+ if "soil_layers" not in soil_data:
264
+ return soil_data
265
+
266
+ layers = soil_data["soil_layers"]
267
+ validated_layers = []
268
+
269
+ for layer in layers:
270
+ validated_layer = layer.copy()
271
+
272
+ # Check consistency between soil type and strength parameters
273
+ soil_type = layer.get("soil_type", "").lower()
274
+ strength_param = layer.get("strength_parameter", "")
275
+ strength_value = layer.get("strength_value")
276
+
277
+ # Fix parameter mismatches
278
+ if "clay" in soil_type and strength_param == "SPT-N" and strength_value:
279
+ # Clay should use Su, but if only N is available, calculate Su
280
+ calculated_su = self.calculate_su_from_n(strength_value)
281
+ validated_layer["calculated_su"] = calculated_su
282
+ validated_layer["su_source"] = f"Calculated from N={strength_value}"
283
+
284
+ elif "sand" in soil_type and strength_param == "Su":
285
+ # Sand should not have Su parameter
286
+ validated_layer["strength_parameter"] = "SPT-N"
287
+ validated_layer["parameter_note"] = "Corrected from Su to SPT-N for sand"
288
+
289
+ # Validate depth ranges
290
+ if validated_layer.get("depth_from") >= validated_layer.get("depth_to"):
291
+ # Fix invalid depth ranges
292
+ depth_from = validated_layer.get("depth_from", 0)
293
+ validated_layer["depth_to"] = depth_from + 1.0 # Default 1m thickness
294
+ validated_layer["depth_note"] = "Corrected invalid depth range"
295
+
296
+ validated_layers.append(validated_layer)
297
+
298
+ soil_data["soil_layers"] = validated_layers
299
+ return soil_data
300
+
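The depth check above defaults an invalid range to a 1 m thick layer; a quick standalone illustration with a hypothetical layer dict:

```python
def fix_depth_range(layer: dict, default_thickness: float = 1.0) -> dict:
    # Mirrors the validation above: enforce depth_to > depth_from
    if layer.get("depth_from", 0) >= layer.get("depth_to", 0):
        layer["depth_to"] = layer.get("depth_from", 0) + default_thickness
        layer["depth_note"] = "Corrected invalid depth range"
    return layer

print(fix_depth_range({"depth_from": 3.0, "depth_to": 3.0}))
# {'depth_from': 3.0, 'depth_to': 4.0, 'depth_note': 'Corrected invalid depth range'}
```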
301
+ def process_with_ss_st_classification(self, soil_data: Dict[str, Any]) -> Dict[str, Any]:
302
+ """
303
+ Process soil data with SS/ST sample classification
304
+ """
305
+ try:
306
+ from soil_classification import SoilClassificationProcessor
307
+
308
+ if "soil_layers" not in soil_data:
309
+ return soil_data
310
+
311
+ # Initialize the enhanced processor
312
+ processor = SoilClassificationProcessor()
313
+
314
+ # Process layers with SS/ST classification
315
+ enhanced_layers = processor.process_soil_layers(soil_data["soil_layers"])
316
+
317
+ # Update soil data
318
+ soil_data["soil_layers"] = enhanced_layers
319
+
320
+ # Add processing summary
321
+ processing_summary = processor.get_processing_summary(enhanced_layers)
322
+ soil_data["processing_summary"] = processing_summary
323
+
324
+ # Display processing summary
325
+ st.subheader("πŸ“Š SS/ST Processing Summary")
326
+ col1, col2, col3, col4 = st.columns(4)
327
+
328
+ with col1:
329
+ st.metric("Total Layers", processing_summary['total_layers'])
330
+ st.metric("ST Samples", processing_summary['st_samples'])
331
+
332
+ with col2:
333
+ st.metric("SS Samples", processing_summary['ss_samples'])
334
+ st.metric("Clay Layers", processing_summary['clay_layers'])
335
+
336
+ with col3:
337
+ st.metric("Sand/Silt Layers", processing_summary['sand_layers'])
338
+ st.metric("Su Calculated", processing_summary['su_calculated'])
339
+
340
+ with col4:
341
+ st.metric("Ο† Calculated", processing_summary['phi_calculated'])
342
+
343
+ return soil_data
344
+
345
+ except ImportError as e:
346
+ st.warning(f"⚠️ Enhanced SS/ST classification not available: {str(e)}")
347
+ return soil_data
348
+ except Exception as e:
349
+ st.error(f"❌ Error in SS/ST processing: {str(e)}")
350
+ return soil_data
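A hypothetical driver for the method above, assuming `soil_classification.py` from this commit is on the import path (the Streamlit calls inside the processor simply print to the console when run outside an app):

```python
from soil_classification import SoilClassificationProcessor

# Hypothetical layers shaped like the app's extraction output
layers = [
    {"description": "SS-18 | grey clay, N=12, w=35%", "depth_from": 0, "depth_to": 2},
    {"description": "ST-3 | stiff clay, Su=120 kPa, w=22%", "depth_from": 2, "depth_to": 4},
]
processor = SoilClassificationProcessor()
enhanced = processor.process_soil_layers(layers)  # SS/ST classification + SI conversion
for layer in enhanced:
    print(layer["sample_type"], layer.get("strength_parameter"), layer.get("strength_value"))
```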
soil_classification.py ADDED
@@ -0,0 +1,1434 @@
1
+ import re
2
+ import numpy as np
3
+ import streamlit as st
4
+ from typing import Dict, List, Any, Tuple, Optional
5
+
6
+ class SoilClassificationProcessor:
7
+ """
8
+ Advanced soil classification processor that handles SS and ST samples
9
+ with proper unit conversions and soil parameter calculations
10
+ """
11
+
12
+ def __init__(self):
13
+ # Enhanced unit conversion factors to SI units
14
+ self.unit_conversions = {
15
+ # Pressure/Stress units to kPa
16
+ 'psi': 6.895,
17
+ 'psf': 0.04788,
18
+ 'kpa': 1.0,
19
+ 'kn/m2': 1.0,
20
+ 'kn/mΒ²': 1.0,
21
+ 'knm2': 1.0,
22
+ 'mpa': 1000.0,
23
+ 'pa': 0.001,
24
+ 'n/m2': 0.001,
25
+ 'n/mΒ²': 0.001,
26
+ 'nm2': 0.001,
27
+ 'ksf': 47.88,
28
+ 'tsf': 95.76,
29
+ 'kg/cm2': 98.0,
30
+ 'kg/cmΒ²': 98.0,
31
+ 'kgcm2': 98.0,
32
+ 'ksc': 98.0, # kilograms per square centimeter (same as kg/cmΒ²)
33
+ 'bar': 100.0,
34
+ 'atm': 101.325, # atmosphere to kPa
35
+ 'mmhg': 0.133322, # mmHg to kPa
36
+ 'inhg': 3.386, # inHg to kPa
37
+
38
+ # Enhanced tonnes/tons per square meter conversions
39
+ 't/m2': 9.81, # tonnes per square meter to kPa
40
+ 't/mΒ²': 9.81, # tonnes per square meter to kPa
41
+ 'tm2': 9.81, # tm2 variant
42
+ 'ton/m2': 9.81, # ton per square meter to kPa
43
+ 'ton/mΒ²': 9.81, # ton per square meter to kPa
44
+ 'tonm2': 9.81, # tonm2 variant
45
+ 'tonnes/m2': 9.81, # tonnes per square meter to kPa
46
+ 'tonnes/mΒ²': 9.81, # tonnes per square meter to kPa
47
+ 'tonnesm2': 9.81, # tonnesm2 variant
48
+ 'tonne/m2': 9.81, # tonne per square meter to kPa
49
+ 'tonne/mΒ²': 9.81, # tonne per square meter to kPa
50
+ 'tonnem2': 9.81, # tonnem2 variant
51
+
52
+ # Additional international pressure units
53
+ 'kgf/cm2': 98.0, # kilogram-force per cmΒ²
54
+ 'kgf/cmΒ²': 98.0, # kilogram-force per cmΒ²
55
+ 'kgfcm2': 98.0, # variant without symbols
56
+ 'lbf/in2': 6.895, # pound-force per square inch (same as psi)
57
+ 'lbf/ft2': 0.04788, # pound-force per square foot (same as psf)
58
+ 'lbfin2': 6.895, # variant without symbols
59
+ 'lbfft2': 0.04788, # variant without symbols
60
+
61
+ # Length units to meters (enhanced)
62
+ 'ft': 0.3048,
63
+ 'feet': 0.3048,
64
+ 'foot': 0.3048,
65
+ "'": 0.3048, # foot symbol
66
+ 'in': 0.0254,
67
+ 'inch': 0.0254,
68
+ 'inches': 0.0254,
69
+ '"': 0.0254, # inch symbol
70
+ 'cm': 0.01,
71
+ 'mm': 0.001,
72
+ 'km': 1000.0,
73
+ 'm': 1.0,
74
+ 'meter': 1.0,
75
+ 'metre': 1.0,
76
+ 'meters': 1.0,
77
+ 'metres': 1.0,
78
+ 'yd': 0.9144, # yard to meters
79
+ 'yard': 0.9144,
80
+ 'yards': 0.9144,
81
+
82
+ # Weight/Force units (for completeness)
83
+ 'n': 1.0, # Newton (SI base)
84
+ 'kn': 1000.0, # kilonewton to Newton
85
+ 'kgf': 9.81, # kilogram-force to Newton
86
+ 'lbf': 4.448, # pound-force to Newton
87
+ 'lb': 4.448, # pound (assuming force context)
88
+ 'kg': 9.81, # kilogram (assuming force context, kg*g)
89
+ }
90
+
91
+ # Soil classification criteria
92
+ self.sieve_200_threshold = 50.0 # % passing sieve #200 for clay classification
93
+
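The conversion table above follows standard engineering conventions: 1 t/m² is one tonne-force per square metre = 9.80665 kN/m² ≈ 9.81 kPa, and 1 ksc = 1 kgf/cm² ≈ 98.07 kPa. A quick sanity check of a few entries (factors copied from the table, values approximate):

```python
# Approximate conversion factors to kPa, as used in the table above
KPA_PER = {"psi": 6.895, "ksc": 98.0, "t/m2": 9.81, "tsf": 95.76, "bar": 100.0}
for unit, factor in KPA_PER.items():
    print(f"1 {unit} = {factor} kPa ;  2.5 {unit} = {2.5 * factor:.1f} kPa")
```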
94
+ def process_soil_layers(self, layers: List[Dict]) -> List[Dict]:
95
+ """
96
+ Process soil layers with SS/ST sample classification and parameter calculation
97
+ """
98
+ processed_layers = []
99
+
100
+ st.info("πŸ”¬ Processing soil layers with SS/ST sample classification...")
101
+
102
+ for i, layer in enumerate(layers):
103
+ processed_layer = layer.copy()
104
+
105
+ # Step 1: Identify sample type (SS or ST)
106
+ sample_type = self._identify_sample_type(layer)
107
+ processed_layer['sample_type'] = sample_type
108
+
109
+ # Step 2: Classify soil type if not already classified
110
+ soil_type = self._classify_soil_type(layer)
111
+ processed_layer['soil_type'] = soil_type
112
+
113
+ # Step 3: Process based on sample type
114
+ if sample_type == 'ST':
115
+ processed_layer = self._process_st_sample(processed_layer)
116
+ elif sample_type == 'SS':
117
+ processed_layer = self._process_ss_sample(processed_layer)
118
+ else:
119
+ # Default processing for unidentified samples
120
+ processed_layer = self._process_default_sample(processed_layer)
121
+
122
+ # Step 4: Ensure all units are in SI
123
+ processed_layer = self._convert_to_si_units(processed_layer)
124
+
125
+ # Step 5: Validate and add engineering parameters
126
+ processed_layer = self._add_engineering_parameters(processed_layer)
127
+
128
+ # Step 6: Check clay consistency (water content vs Su)
129
+ processed_layer = self._check_clay_consistency(processed_layer)
130
+
131
+ processed_layers.append(processed_layer)
132
+
133
+ # Progress feedback
134
+ st.write(f" βœ… Layer {i+1}: {sample_type} sample, {soil_type} - {processed_layer.get('strength_parameter', 'N/A')}")
135
+
136
+ st.success(f"βœ… Processed {len(processed_layers)} soil layers with SS/ST classification")
137
+ return processed_layers
138
+
139
+ def _identify_sample_type(self, layer: Dict) -> str:
140
+ """
141
+ Identify if sample is Split Spoon (SS) or Shelby Tube (ST)
142
+ CRITICAL: Look at FIRST COLUMN stratification symbols with ABSOLUTE HIGHEST PRIORITY
143
+ """
144
+ description = layer.get('description', '').lower()
145
+
146
+ # ABSOLUTE HIGHEST PRIORITY: Check for first column stratification symbols
147
+ # Patterns for first column recognition: SS-18, ST-5, SS18, ST3, etc.
148
+ first_column_patterns = [
149
+ # High precision patterns for first column symbols
150
+ r'^[^|]*\b(ss[-]?\d+)\b', # SS-18, SS18 at start or before pipe
151
+ r'^[^|]*\b(st[-]?\d+)\b', # ST-5, ST5 at start or before pipe
152
+ r'^\s*(ss[-]?\d+)', # SS-number at very beginning
153
+ r'^\s*(st[-]?\d+)', # ST-number at very beginning
154
+ r'\|(.*?)(ss[-]?\d+)', # After pipe separator
155
+ r'\|(.*?)(st[-]?\d+)', # After pipe separator
156
+ r'\b(ss[-]?\d+)\s*[|:]', # SS-number followed by pipe or colon
157
+ r'\b(st[-]?\d+)\s*[|:]', # ST-number followed by pipe or colon
158
+ ]
159
+
160
+ for pattern in first_column_patterns:
161
+ match = re.search(pattern, description, re.IGNORECASE)
162
+ if match:
163
+ # Get the SS/ST part (could be in different groups)
164
+ matched_groups = [g for g in match.groups() if g and ('ss' in g.lower() or 'st' in g.lower())]
165
+ if matched_groups:
166
+ matched_text = matched_groups[0].lower().strip()
167
+ if matched_text.startswith('ss'):
168
+ st.success(f"🎯 FIRST COLUMN DETECTED: {matched_text.upper()} β†’ SS sample (HIGHEST PRIORITY)")
169
+ return 'SS'
170
+ elif matched_text.startswith('st'):
171
+ st.success(f"🎯 FIRST COLUMN DETECTED: {matched_text.upper()} β†’ ST sample (HIGHEST PRIORITY)")
172
+ return 'ST'
173
+
174
+ # FALLBACK: Check for standalone SS/ST symbols (lower priority)
175
+ standalone_patterns = [
176
+ r'\bss\b(?!\w)', # Just SS (not part of another word)
177
+ r'\bst\b(?!\w)' # Just ST (not part of another word)
178
+ ]
179
+
180
+ for pattern in standalone_patterns:
181
+ match = re.search(pattern, description, re.IGNORECASE)
182
+ if match:
183
+ matched_text = match.group(0).lower()
184
+ if matched_text == 'ss':
185
+ st.info(f"πŸ“Š Standalone symbol detected: SS β†’ SS sample")
186
+ return 'SS'
187
+ elif matched_text == 'st':
188
+ st.info(f"πŸ“Š Standalone symbol detected: ST β†’ ST sample")
189
+ return 'ST'
190
+
191
+ # SECOND: Check for keywords in description
192
+ # Keywords for ST samples
193
+ st_keywords = ['shelby', 'tube', 'undisturbed', 'ut', 'unconfined', 'uu test', 'ucs']
194
+
195
+ # Keywords for SS samples
196
+ ss_keywords = ['split spoon', 'spt', 'standard penetration', 'disturbed', 'n-value']
197
+
198
+ # Check for ST indicators
199
+ if any(keyword in description for keyword in st_keywords):
200
+ return 'ST'
201
+
202
+ # Check for SS indicators
203
+ if any(keyword in description for keyword in ss_keywords):
204
+ return 'SS'
205
+
206
+ # THIRD: Check strength parameter types
207
+ # Check if SPT-N value is present (indicates SS)
208
+ if layer.get('strength_parameter') == 'SPT-N' or 'spt' in description:
209
+ return 'SS'
210
+
211
+ # Check if Su value is present (could indicate ST)
212
+ if layer.get('strength_parameter') == 'Su' or 'su' in description.lower():
213
+ return 'ST'
214
+
215
+ # FOURTH: Default assumption based on available data
216
+ if layer.get('strength_value') and layer.get('strength_value') > 50:
217
+ return 'SS' # High values typically SPT-N
218
+ else:
219
+ return 'ST' # Lower values typically Su
220
+
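A compact illustration of the first-column detection logic above, using a simplified form of the patterns (the example strings are hypothetical):

```python
import re

samples = ["SS-18 | brown silty clay", "ST-5: soft grey clay", "clay with shells, split spoon, N=9"]
first_col = re.compile(r'^\s*(ss|st)[-]?\d+', re.IGNORECASE)  # reduced version of the patterns above
for text in samples:
    m = first_col.match(text)
    print(f"{text!r} -> {m.group(1).upper() if m else 'no symbol; fall back to keyword search'}")
```

The third string has no first-column symbol, so in the real method the keyword fallback ("split spoon") would classify it as SS.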
221
+ def _classify_soil_type(self, layer: Dict) -> str:
222
+ """
223
+ Enhanced soil type classification with MANDATORY sieve analysis requirement for sand
224
+ CRITICAL: Sand layers MUST have sieve analysis evidence - otherwise assume clay
225
+ """
226
+ # Check if soil type is already specified and validate it
227
+ existing_type = layer.get('soil_type', '').lower()
228
+ if existing_type and existing_type != 'unknown':
229
+ # If it's sand/gravel, verify sieve analysis exists
230
+ if existing_type in ['sand', 'silt', 'gravel']:
231
+ sieve_200_passing = self._extract_sieve_200_data(layer)
232
+ if sieve_200_passing is None:
233
+ st.warning(f"⚠️ '{existing_type}' classification without sieve analysis data. OVERRIDING to 'clay' per requirements.")
234
+ layer['classification_override'] = f"Changed from '{existing_type}' to 'clay' - no sieve analysis data"
235
+ return 'clay'
236
+ else:
237
+ st.success(f"βœ… '{existing_type}' classification confirmed with sieve #200: {sieve_200_passing}% passing")
238
+ return existing_type
239
+ else:
240
+ return existing_type
241
+
242
+ description = layer.get('description', '').lower()
243
+
244
+ # CRITICAL: Check for sieve analysis data FIRST before any classification
245
+ sieve_200_passing = self._extract_sieve_200_data(layer)
246
+
247
+ if sieve_200_passing is not None:
248
+ # Sieve analysis data available - use it for classification
249
+ if sieve_200_passing > self.sieve_200_threshold:
250
+ classification = 'clay' # Fine-grained soil
251
+ st.success(f"βœ… Classified as CLAY: {sieve_200_passing}% passing #200 (>50%)")
252
+ else:
253
+ classification = 'sand' # Coarse-grained soil
254
+ st.success(f"βœ… Classified as SAND: {sieve_200_passing}% passing #200 (<50%)")
255
+
256
+ layer['sieve_200_passing'] = sieve_200_passing
257
+ layer['classification_basis'] = f"Sieve analysis: {sieve_200_passing}% passing #200"
258
+ return classification
259
+
260
+ # NO SIEVE ANALYSIS DATA - Check for explicit mentions but apply strict rules
261
+ potential_classifications = []
262
+
263
+ if any(clay_word in description for clay_word in ['clay', 'clayey', 'ch', 'cl']):
264
+ potential_classifications.append('clay')
265
+
266
+ if any(sand_word in description for sand_word in ['sand', 'sandy', 'sp', 'sw', 'sm', 'sc']):
267
+ potential_classifications.append('sand')
268
+
269
+ if any(silt_word in description for silt_word in ['silt', 'silty', 'ml', 'mh']):
270
+ potential_classifications.append('silt')
271
+
272
+ if any(gravel_word in description for gravel_word in ['gravel', 'gp', 'gw', 'gm', 'gc']):
273
+ potential_classifications.append('gravel')
274
+
275
+ # ENFORCE MANDATORY RULE: No sand/silt/gravel without sieve analysis
276
+ if any(coarse_type in potential_classifications for coarse_type in ['sand', 'silt', 'gravel']):
277
+ st.error(f"❌ CRITICAL: Found potential {potential_classifications} classification but NO sieve analysis data!")
278
+ st.warning(f"πŸ”§ ENFORCING RULE: Classifying as 'clay' - sand/silt/gravel requires sieve analysis evidence")
279
+ layer['classification_override'] = f"Forced clay classification - found {potential_classifications} terms but no sieve data"
280
+ layer['sieve_200_passing'] = None
281
+ layer['classification_basis'] = "Assumed clay - no sieve analysis data available (mandatory requirement)"
282
+ return 'clay'
283
+
284
+ # Default to clay if only clay terms found or no clear classification
285
+ if 'clay' in potential_classifications or not potential_classifications:
286
+ st.info(f"πŸ’‘ Classified as CLAY: {potential_classifications if potential_classifications else 'No explicit soil type found'}")
287
+ layer['sieve_200_passing'] = None
288
+ layer['classification_basis'] = "Assumed clay - no sieve analysis data available"
289
+ return 'clay'
290
+
291
+ # Final fallback - should not reach here
292
+ st.warning(f"⚠️ Unclear classification. Defaulting to 'clay' per mandatory requirements.")
293
+ layer['sieve_200_passing'] = None
294
+ layer['classification_basis'] = "Default clay classification - unclear soil type and no sieve data"
295
+ return 'clay'
296
+
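Once sieve data exists, the rule above reduces to a single threshold; a sketch of just that decision (threshold value taken from `self.sieve_200_threshold`):

```python
def classify_by_fines(percent_passing_200, threshold=50.0):
    """Fine-grained (clay) if more than `threshold` % passes the #200 sieve, else coarse (sand).
    Per the mandatory rule above, missing sieve data always means 'clay'."""
    if percent_passing_200 is None:
        return "clay"  # no sieve evidence -> assume clay
    return "clay" if percent_passing_200 > threshold else "sand"

for fines in (None, 72.0, 35.0):
    print(fines, "->", classify_by_fines(fines))
```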
297
+ def _extract_sieve_200_data(self, layer: Dict) -> Optional[float]:
298
+ """
299
+ Enhanced sieve #200 passing percentage extraction with comprehensive pattern recognition
300
+ """
301
+ description = layer.get('description', '')
302
+
303
+ # Enhanced patterns to catch all possible sieve analysis formats
304
+ patterns = [
305
+ # Standard #200 sieve patterns
306
+ r'#200[:\s]*(\d+(?:\.\d+)?)%',
307
+ r'sieve\s*#?200[:\s]*(\d+(?:\.\d+)?)%',
308
+ r'no\.?\s*200[:\s]*(\d+(?:\.\d+)?)%',
309
+ r'passing\s*#?200[:\s]*(\d+(?:\.\d+)?)%',
310
+ r'(\d+(?:\.\d+)?)%\s*passing\s*#?200',
311
+
312
+ # Fines content (equivalent to #200 passing)
313
+ r'fines[:\s]*(\d+(?:\.\d+)?)%',
314
+ r'fine[s]?\s*content[:\s]*(\d+(?:\.\d+)?)%',
315
+ r'(\d+(?:\.\d+)?)%\s*fines',
316
+
317
+ # 0.075mm equivalent (same as #200)
318
+ r'0\.075\s*mm[:\s]*(\d+(?:\.\d+)?)%\s*passing',
319
+ r'(\d+(?:\.\d+)?)%\s*passing\s*0\.075\s*mm',
320
+ r'0\.075[:\s]*(\d+(?:\.\d+)?)%',
321
+
322
+ # Particle size analysis patterns
323
+ r'particle\s*size[:\s]*(\d+(?:\.\d+)?)%\s*fines',
324
+ r'gradation[:\s]*(\d+(?:\.\d+)?)%\s*passing\s*#?200',
325
+ r'grain\s*size[:\s]*(\d+(?:\.\d+)?)%\s*fines',
326
+
327
+ # Sieve analysis results patterns
328
+ r'sieve\s*analysis[:\s].*?(\d+(?:\.\d+)?)%\s*passing\s*#?200',
329
+ r'sieve\s*analysis[:\s].*?#?200[:\s]*(\d+(?:\.\d+)?)%',
330
+
331
+ # ASTM/Standard method references
332
+ r'astm\s*d422[:\s].*?(\d+(?:\.\d+)?)%\s*passing\s*#?200',
333
+ r'astm\s*d6913[:\s].*?(\d+(?:\.\d+)?)%\s*passing\s*#?200',
334
+
335
+ # Alternative formats
336
+ r'(\d+(?:\.\d+)?)%\s*<\s*0\.075\s*mm', # Percent less than 0.075mm
337
+ r'minus\s*#?200[:\s]*(\d+(?:\.\d+)?)%', # Minus #200
338
+ r'(\d+(?:\.\d+)?)%\s*minus\s*#?200', # Percent minus #200
339
+ ]
340
+
341
+ for pattern in patterns:
342
+ match = re.search(pattern, description, re.IGNORECASE)
343
+ if match:
344
+ percentage = float(match.group(1))
345
+ st.success(f"βœ… Found sieve #200 data: {percentage}% passing from '{match.group(0)}'")
346
+
347
+ # Validate percentage range
348
+ if 0 <= percentage <= 100:
349
+ return percentage
350
+ else:
351
+ st.warning(f"⚠️ Invalid percentage ({percentage}%) found. Should be 0-100%.")
352
+ return None
353
+
354
+ # Check if explicitly mentioned in layer data
355
+ if 'sieve_200_passing' in layer and layer['sieve_200_passing'] is not None:
356
+ percentage = float(layer['sieve_200_passing'])
357
+ st.success(f"βœ… Found sieve #200 data in layer field: {percentage}% passing")
358
+ return percentage
359
+
360
+ # Check for related field names
361
+ for field_name in ['fines_content', 'percent_fines', 'fine_content', 'passing_200']:
362
+ if field_name in layer and layer[field_name] is not None:
363
+ percentage = float(layer[field_name])
364
+ st.success(f"βœ… Found sieve #200 equivalent in '{field_name}': {percentage}% passing")
365
+ return percentage
366
+
367
+ # Log that no sieve analysis was found
368
+ st.info(f"πŸ” No sieve #200 analysis data found in layer description or fields")
369
+ return None
370
+
371
+ def _process_st_sample(self, layer: Dict) -> Dict:
372
+ """
373
+ Process Shelby Tube (ST) sample - use unconfined compression test (Su) values
374
+ """
375
+ layer['processing_method'] = 'ST - Unconfined Compression Test'
376
+
377
+ # Look for Su values in the data
378
+ su_value = self._extract_su_value(layer)
379
+
380
+ if su_value is not None:
381
+ layer['strength_parameter'] = 'Su'
382
+ layer['strength_value'] = su_value
383
+ layer['su_source'] = 'Unconfined Compression Test'
384
+ else:
385
+ # If no Su value found, check for SPT and convert
386
+ spt_value = self._extract_spt_value(layer)
387
+ if spt_value is not None:
388
+ su_calculated = self._convert_spt_to_su(spt_value)
389
+ layer['strength_parameter'] = 'Su'
390
+ layer['strength_value'] = su_calculated
391
+ layer['su_source'] = f'Calculated from SPT-N={spt_value} (Su=5*N)'
392
+ layer['original_spt'] = spt_value
393
+
394
+ return layer
395
+
396
+ def _process_ss_sample(self, layer: Dict) -> Dict:
397
+ """
398
+ Process Split Spoon (SS) sample - ALWAYS use SPT values and convert to Su using Su=5*N
399
+ FOR SS SAMPLES: IGNORE any unconfined compression test Su values, ONLY use calculated Su=5*N
400
+ """
401
+ layer['processing_method'] = 'SS - SPT Conversion (Su=5*N)'
402
+
403
+ # CRITICAL: For SS samples, extract the raw SPT-N value and calculate Su from it
404
+ spt_value = self._extract_spt_value(layer)
405
+ soil_type = layer.get('soil_type', 'clay')
406
+
407
+ if spt_value is not None:
408
+ if soil_type == 'clay':
409
+ # MANDATORY: Convert SPT to undrained shear strength using Su = 5*N
410
+ # IGNORE any existing Su values from unconfined compression tests
411
+ calculated_su = self._convert_spt_to_su(spt_value)
412
+
413
+ # Override any existing Su values for SS samples
414
+ layer['strength_parameter'] = 'Su'
415
+ layer['strength_value'] = calculated_su
416
+ layer['su_source'] = f'Calculated from raw N={spt_value} (Su=5*N) - SS Sample'
417
+ layer['original_spt'] = spt_value
418
+
419
+ # Clear any conflicting unconfined compression data for SS samples
420
+ if 'unconfined_su' in layer:
421
+ layer['unconfined_su_ignored'] = layer.pop('unconfined_su')
422
+ st.warning(f"⚠️ SS Sample: Ignored unconfined compression Su, using calculated Su={calculated_su:.0f} kPa from N={spt_value}")
423
+
424
+ st.success(f"βœ… SS Sample: Su = 5 Γ— {spt_value} = {calculated_su:.0f} kPa")
425
+
426
+ elif soil_type in ['sand', 'silt']:
427
+ # Convert SPT to friction angle for granular soils
428
+ phi_value = self._convert_spt_to_friction_angle(spt_value)
429
+ layer['strength_parameter'] = 'Ο†'
430
+ layer['strength_value'] = phi_value
431
+ layer['friction_angle'] = phi_value
432
+ layer['phi_source'] = f'Calculated from raw N={spt_value} (Peck method) - SS Sample'
433
+ layer['original_spt'] = spt_value
434
+
435
+ st.success(f"βœ… SS Sample: Ο† = {phi_value:.1f}Β° from N={spt_value}")
436
+
437
+ else:
438
+ # Keep SPT value for other soil types
439
+ layer['strength_parameter'] = 'SPT-N'
440
+ layer['strength_value'] = spt_value
441
+ layer['original_spt'] = spt_value
442
+
443
+ st.info(f"πŸ“Š SS Sample: Using raw N={spt_value} for {soil_type}")
444
+
445
+ else:
446
+ st.error(f"❌ SS Sample: No SPT-N value found in layer data")
447
+
448
+ return layer
449
+
450
+ def _process_default_sample(self, layer: Dict) -> Dict:
451
+ """
452
+ Process sample with unknown type - use available data intelligently
453
+ """
454
+ layer['processing_method'] = 'Default - Based on available data'
455
+
456
+ # Try to identify and process based on existing parameters
457
+ existing_param = layer.get('strength_parameter', '').lower()
458
+
459
+ if 'su' in existing_param:
460
+ # Already has Su value
461
+ return self._process_st_sample(layer)
462
+ elif 'spt' in existing_param or 'n' in existing_param:
463
+ # Has SPT value
464
+ return self._process_ss_sample(layer)
465
+ else:
466
+ # Make best guess based on strength value
467
+ strength_val = layer.get('strength_value', 0)
468
+ if strength_val and strength_val > 50:
469
+ # Likely SPT value
470
+ layer['strength_parameter'] = 'SPT-N'
471
+ return self._process_ss_sample(layer)
472
+ else:
473
+ # Likely Su value
474
+ layer['strength_parameter'] = 'Su'
475
+ return self._process_st_sample(layer)
476
+
477
+ def _extract_su_value(self, layer: Dict) -> Optional[float]:
478
+ """
479
+ Enhanced Su (undrained shear strength) extraction with MANDATORY unit conversion checking
480
+ CRITICAL: All Su values must be converted to kPa before processing
481
+ """
482
+ # Check direct Su field first - but validate units
483
+ if layer.get('strength_parameter') == 'Su' and layer.get('strength_value') is not None:
484
+ su_value = float(layer['strength_value'])
485
+ # Check if this value needs unit conversion (warn if suspiciously low/high)
486
+ if su_value < 5:
487
+ st.warning(f"⚠️ Su value {su_value} seems low - verify it's in kPa, not MPa or other units")
488
+ elif su_value > 2000:
489
+ st.warning(f"⚠️ Su value {su_value} seems high - verify it's in kPa, not psi or other units")
490
+ return su_value
491
+
492
+ # Look in description for Su values with enhanced unit detection
493
+ description = layer.get('description', '')
494
+
495
+ # CRITICAL: Enhanced patterns with explicit unit capture for conversion
496
+ patterns = [
497
+ # Direct Su values with units - CAPTURE UNITS EXPLICITLY
498
+ r'su[:\s=]*(\d+(?:\.\d+)?)\s*(kpa|kn/m2|kn/mΒ²|psi|psf|ksc|kg/cm2|kg/cmΒ²|t/m2|t/mΒ²|ton/m2|ton/mΒ²|tonnes?/m2|tonnes?/mΒ²|mpa)',
499
+ r'undrained[:\s]*shear[:\s]*strength[:\s]*(\d+(?:\.\d+)?)\s*(kpa|kn/m2|kn/mΒ²|psi|psf|ksc|kg/cm2|kg/cmΒ²|t/m2|t/mΒ²|ton/m2|ton/mΒ²|tonnes?/m2|tonnes?/mΒ²|mpa)',
500
+ r'shear\s*strength[:\s]*(\d+(?:\.\d+)?)\s*(kpa|kn/m2|kn/mΒ²|psi|psf|ksc|kg/cm2|kg/cmΒ²|t/m2|t/mΒ²|ton/m2|ton/mΒ²|tonnes?/m2|tonnes?/mΒ²|mpa)',
501
+ r'ucs[:\s]*(\d+(?:\.\d+)?)\s*(kpa|kn/m2|kn/mΒ²|psi|psf|ksc|kg/cm2|kg/cmΒ²|t/m2|t/mΒ²|ton/m2|ton/mΒ²|tonnes?/m2|tonnes?/mΒ²|mpa)',
502
+ r'unconfined[:\s]*compression[:\s]*(\d+(?:\.\d+)?)\s*(kpa|kn/m2|kn/mΒ²|psi|psf|ksc|kg/cm2|kg/cmΒ²|t/m2|t/mΒ²|ton/m2|ton/mΒ²|tonnes?/m2|tonnes?/mΒ²|mpa)',
503
+
504
+ # Equation-style patterns
505
+ r'su\s*=\s*(\d+(?:\.\d+)?)\s*(kpa|kn/m2|kn/mΒ²|psi|psf|ksc|kg/cm2|kg/cmΒ²|t/m2|t/mΒ²|ton/m2|ton/mΒ²|tonnes?/m2|tonnes?/mΒ²|mpa)',
506
+ r'strength\s*=\s*(\d+(?:\.\d+)?)\s*(kpa|kn/m2|kn/mΒ²|psi|psf|ksc|kg/cm2|kg/cmΒ²|t/m2|t/mΒ²|ton/m2|ton/mΒ²|tonnes?/m2|tonnes?/mΒ²|mpa)',
507
+
508
+ # Embedded unit patterns
509
+ r'(\d+(?:\.\d+)?)\s*(kpa|kn/m2|kn/mΒ²)\s*(?:su|strength)',
510
+ r'(\d+(?:\.\d+)?)\s*(ksc|kg/cm2|kg/cmΒ²)\s*(?:su|strength)',
511
+ r'(\d+(?:\.\d+)?)\s*(t/m2|t/mΒ²|ton/m2|ton/mΒ²|tonnes?/m2|tonnes?/mΒ²)\s*(?:su|strength)',
512
+ r'(\d+(?:\.\d+)?)\s*(psi|psf)\s*(?:su|strength)',
513
+ r'(\d+(?:\.\d+)?)\s*(mpa)\s*(?:su|strength)',
514
+
515
+ # Common non-SI units that need conversion
516
+ r'(\d+(?:\.\d+)?)\s*ksc\b', # ksc without explicit "su"
517
+ r'(\d+(?:\.\d+)?)\s*t/mΒ²?\b', # tonnes/mΒ²
518
+ r'(\d+(?:\.\d+)?)\s*psi\b', # psi
519
+ ]
520
+
521
+ for pattern in patterns:
522
+ match = re.search(pattern, description, re.IGNORECASE)
523
+ if match:
524
+ value = float(match.group(1))
525
+ unit = match.group(2).lower() if len(match.groups()) > 1 and match.group(2) else 'kpa'
526
+
527
+ # CRITICAL: Alert if unit conversion is needed
528
+ if unit != 'kpa':
529
+ st.warning(f"πŸ”§ UNIT CONVERSION REQUIRED: Found Su = {value} {unit.upper()}")
530
+
531
+ # Convert to kPa with detailed logging
532
+ converted_value = self._convert_pressure_to_kpa(value, unit)
533
+
534
+ # Store original values for verification
535
+ layer['original_su_value'] = value
536
+ layer['original_su_unit'] = unit.upper()
537
+ layer['converted_su_note'] = f"Converted from {value} {unit.upper()} to {converted_value:.1f} kPa"
538
+
539
+ # Enhanced validation with context-aware warnings
540
+ if converted_value < 1:
541
+ st.error(f"❌ Very low Su = {converted_value:.3f} kPa after conversion. Check original value: {value} {unit}")
542
+ elif converted_value > 2000:
543
+ st.warning(f"⚠️ Very high Su = {converted_value:.0f} kPa after conversion from {value} {unit}. Verify this is correct.")
544
+ elif 1 <= converted_value <= 1000:
545
+ st.success(f"βœ… Su = {converted_value:.1f} kPa (converted from {value} {unit.upper()})")
546
+ else:
547
+ st.info(f"πŸ“Š Su = {converted_value:.1f} kPa (converted from {value} {unit.upper()}) - unusual but accepted")
548
+
549
+ return converted_value
550
+
551
+ # Check for unitless Su values (assume kPa but warn)
552
+ unitless_patterns = [
553
+ r'su[:\s=]*(\d+(?:\.\d+)?)\b(?!\s*[a-zA-Z])', # Su value not followed by units
554
+ r'shear\s*strength[:\s]*(\d+(?:\.\d+)?)\b(?!\s*[a-zA-Z])',
555
+ r'unconfined[:\s]*(\d+(?:\.\d+)?)\b(?!\s*[a-zA-Z])',
556
+ ]
557
+
558
+ for pattern in unitless_patterns:
559
+ match = re.search(pattern, description, re.IGNORECASE)
560
+ if match:
561
+ value = float(match.group(1))
562
+ st.warning(f"⚠️ Found Su = {value} WITHOUT UNITS! Assuming kPa - please verify.")
563
+ layer['assumed_unit_warning'] = f"Assumed {value} is in kPa (no units specified)"
564
+ return value
565
+
566
+ # Check for explicit Su field in layer data
567
+ if 'su_value' in layer and layer['su_value'] is not None:
568
+ value = float(layer['su_value'])
569
+ st.info(f"πŸ“Š Using Su = {value:.1f} from field 'su_value' (assumed kPa)")
570
+ return value
571
+
572
+ # Check for other strength-related fields that might contain Su
573
+ for field_name in ['undrained_strength', 'unconfined_strength', 'cohesion']:
574
+ if field_name in layer and layer[field_name] is not None:
575
+ value = float(layer[field_name])
576
+ st.info(f"πŸ“Š Using Su = {value:.1f} kPa from field '{field_name}' (assumed kPa)")
577
+ return value
578
+
579
+ return None
580
+
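The unit-capturing patterns above boil down to "number + pressure unit, converted to kPa". A reduced sketch of that extraction step (the pattern and unit table are simplified from the full lists):

```python
import re

KPA_PER = {"kpa": 1.0, "ksc": 98.0, "t/m2": 9.81, "psi": 6.895, "mpa": 1000.0}
su_pattern = re.compile(r'su[:\s=]*(\d+(?:\.\d+)?)\s*(kpa|ksc|t/m2|psi|mpa)', re.IGNORECASE)

for text in ("Su = 3.2 t/m2", "su: 120 kPa", "stiff clay, Su=1.5 ksc"):
    m = su_pattern.search(text)
    value, unit = float(m.group(1)), m.group(2).lower()
    print(f"{text!r} -> {value * KPA_PER[unit]:.1f} kPa")
```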
581
+ def _extract_spt_value(self, layer: Dict) -> Optional[float]:
582
+ """
583
+ Enhanced SPT-N value extraction for SS samples - USE RAW N VALUE ONLY, NOT N-CORRECTED
584
+ Improved pattern matching for better SS layer division
585
+ """
586
+ # Check direct SPT field
587
+ if layer.get('strength_parameter') == 'SPT-N' and layer.get('strength_value'):
588
+ return float(layer['strength_value'])
589
+
590
+ # Look in description for SPT values - PRIORITIZE RAW N VALUES
591
+ description = layer.get('description', '')
592
+
593
+ # ENHANCED: Look for raw N value patterns with better precision
594
+ raw_n_patterns = [
595
+ # High priority patterns for raw N values
596
+ r'\braw[:\s]*n[:\s=]*(\d+(?:\.\d+)?)', # Raw N value
597
+ r'\bfield[:\s]*n[:\s=]*(\d+(?:\.\d+)?)', # Field N value
598
+ r'\bmeasured[:\s]*n[:\s=]*(\d+(?:\.\d+)?)', # Measured N value
599
+ r'\bactual[:\s]*n[:\s=]*(\d+(?:\.\d+)?)', # Actual N value
600
+ r'\bobserved[:\s]*n[:\s=]*(\d+(?:\.\d+)?)', # Observed N value
601
+
602
+ # Standard N patterns NOT followed by correction terms
603
+ r'\bn[:\s=]*(\d+(?:\.\d+)?)\b(?!\s*[-]?(?:corr|correct|adj|adjust))', # N value NOT corrected
604
+ r'\bspt[:\s]*n[:\s=]*(\d+(?:\.\d+)?)\b(?!\s*[-]?(?:corr|correct|adj|adjust))', # SPT-N NOT corrected
605
+ r'\bn[-\s]?value[:\s=]*(\d+(?:\.\d+)?)\b(?!\s*[-]?(?:corr|correct|adj|adjust))', # N-value NOT corrected
606
+ r'\bn\s*=\s*(\d+(?:\.\d+)?)\b(?!\s*[-]?(?:corr|correct|adj|adjust))', # N = value NOT corrected
607
+
608
+ # Blow count patterns
609
+ r'\bblow[s]?[:\s]*count[:\s=]*(\d+(?:\.\d+)?)\b(?!\s*[-]?(?:corr|correct|adj|adjust))',
610
+ r'\bblows[:\s]*per[:\s]*foot[:\s=]*(\d+(?:\.\d+)?)',
611
+ r'\bblow[s]?[:\s=]*(\d+(?:\.\d+)?)\b(?!\s*[-]?(?:corr|correct|adj|adjust))',
612
+
613
+ # SS sample specific patterns
614
+ r'\bss[-\s]*\d*[:\s]*n[:\s=]*(\d+(?:\.\d+)?)', # SS sample with N
615
+ r'\bsplit[:\s]*spoon[:\s]*n[:\s=]*(\d+(?:\.\d+)?)', # Split spoon N
616
+ ]
617
+
618
+ # First try to find raw N values with enhanced logging
619
+ for i, pattern in enumerate(raw_n_patterns):
620
+ match = re.search(pattern, description, re.IGNORECASE)
621
+ if match:
622
+ n_value = float(match.group(1))
623
+ pattern_type = ["Raw N", "Field N", "Measured N", "Actual N", "Observed N",
624
+ "Standard N", "SPT-N", "N-value", "N=", "Blow count",
625
+ "Blows/ft", "Blows", "SS N", "Split spoon N"][min(i, 13)]
626
+ st.success(f"βœ… SS Sample: Using {pattern_type} = {n_value} from: '{match.group(0)}'")
627
+
628
+ # Additional validation for SS samples
629
+ if n_value > 100:
630
+ st.warning(f"⚠️ Very high N value ({n_value}) detected. Please verify this is correct.")
631
+ elif n_value == 0:
632
+ st.warning(f"⚠️ Zero N value detected. May indicate very soft soil or measurement issue.")
633
+
634
+ return n_value
635
+
636
+ # Enhanced fallback patterns with warnings
637
+ fallback_patterns = [
638
+ r'\bn[:\s=]*(\d+(?:\.\d+)?)',
639
+ r'\bspt[:\s]*(\d+(?:\.\d+)?)',
640
+ r'(\d+(?:\.\d+)?)\s*(?:blow|n)',
641
+ r'penetration[:\s]*(\d+(?:\.\d+)?)',
642
+ r'resistance[:\s]*(\d+(?:\.\d+)?)'
643
+ ]
644
+
645
+ for pattern in fallback_patterns:
646
+ match = re.search(pattern, description, re.IGNORECASE)
647
+ if match:
648
+ n_value = float(match.group(1))
649
+
650
+ # Enhanced warnings for SS samples
651
+ warning_indicators = ['corr', 'correct', 'adj', 'adjust', 'modified', 'norm']
652
+ has_correction_indicator = any(indicator in description.lower() for indicator in warning_indicators)
653
+
654
+ if has_correction_indicator:
655
+ st.error(f"❌ SS Sample: Found N = {n_value} but description contains correction terms. This may be corrected N, not raw N!")
656
+ st.info("πŸ’‘ For SS samples, use only raw field N values (not corrected). Check original field logs.")
657
+ # Still return the value but flag it
658
+ layer['n_value_warning'] = f"Potentially corrected N value: {n_value}"
659
+ else:
660
+ st.info(f"πŸ“Š SS Sample: Using N = {n_value} from: '{match.group(0)}' (fallback pattern)")
661
+
662
+ return n_value
663
+
664
+ # If no N value found, provide specific guidance for SS samples
665
+ st.error(f"❌ SS Sample: No SPT-N value found in layer data")
666
+ st.info("πŸ’‘ SS samples require SPT-N values. Look for: N=X, SPT-N=X, raw N=X, field N=X, or blow count.")
667
+
668
+ return None
669
+
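The key behaviour above is the negative lookahead that rejects corrected N values. A minimal demonstration on hypothetical log strings:

```python
import re

# Accept "N = 12" but not "N = 15 corrected" (simplified from the patterns above)
raw_n = re.compile(r'\bn\s*=\s*(\d+(?:\.\d+)?)\b(?!\s*[-]?(?:corr|correct|adj|adjust))', re.IGNORECASE)

for text in ("grey clay, N = 12", "N = 15 corrected for overburden"):
    m = raw_n.search(text)
    print(f"{text!r} -> {m.group(1) if m else 'rejected (looks corrected)'}")
```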
670
+ def _convert_spt_to_su(self, spt_n: float) -> float:
671
+ """
672
+ Convert SPT-N to undrained shear strength (Su) using Su = 5*N correlation
673
+ Enhanced for SS samples with validation
674
+ """
675
+ if spt_n <= 0:
676
+ st.warning(f"⚠️ Invalid N value ({spt_n}) for Su calculation. Using N=1 as minimum.")
677
+ spt_n = 1.0
678
+
679
+ su_calculated = 5.0 * spt_n
680
+
681
+ # Add validation and guidance for SS clay samples
682
+ if su_calculated < 10:
683
+ st.info(f"πŸ’‘ Very low Su = {su_calculated:.0f} kPa from N={spt_n}. Indicates very soft clay.")
684
+ elif su_calculated > 500:
685
+ st.warning(f"⚠️ Very high Su = {su_calculated:.0f} kPa from N={spt_n}. Verify N value is raw (not corrected).")
686
+
687
+ return su_calculated
688
+
689
+ def _convert_spt_to_friction_angle(self, spt_n: float) -> float:
690
+ """
691
+ Enhanced SPT-N to friction angle conversion for sand/silt layers in SS samples
692
+ Uses improved Peck method with soil type considerations
693
+ """
694
+ if spt_n <= 0:
695
+ st.warning(f"⚠️ Invalid N value ({spt_n}) for friction angle calculation. Using N=1 as minimum.")
696
+ spt_n = 1.0
697
+
698
+ # Enhanced Peck correlation with improvements:
699
+ # Ο† = 27.1 + 0.3 * N - 0.00054 * NΒ² (for fine to medium sand)
700
+ # Valid for N up to 50, with adjustments for different sand types
701
+
702
+ n_limited = min(spt_n, 50) # Cap at 50 for correlation validity
703
+
704
+ # Base Peck correlation
705
+ phi = 27.1 + 0.3 * n_limited - 0.00054 * (n_limited ** 2)
706
+
707
+ # Ensure reasonable minimum
708
+ phi_final = max(phi, 28) # Minimum reasonable friction angle for sand
709
+ phi_final = min(phi_final, 45) # Maximum reasonable friction angle
710
+
711
+ # Add validation and guidance for SS sand samples
712
+ if phi_final < 30:
713
+ st.info(f"πŸ’‘ Low Ο† = {phi_final:.1f}Β° from N={spt_n}. Indicates loose sand or silty sand.")
714
+ elif phi_final > 40:
715
+ st.info(f"πŸ’‘ High Ο† = {phi_final:.1f}Β° from N={spt_n}. Indicates dense, well-graded sand.")
716
+
717
+ # Special handling for very low or high N values
718
+ if spt_n < 4:
719
+ st.warning(f"⚠️ Very low N={spt_n} for sand. May indicate loose sand or silt. Consider checking soil classification.")
720
+ elif spt_n > 40:
721
+ st.info(f"πŸ’‘ Very high N={spt_n} for sand. Indicates very dense sand or possible gravel content.")
722
+
723
+ return phi_final
724
+
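The correlation above is a quadratic fit, φ ≈ 27.1 + 0.3·N − 0.00054·N², with N capped at 50 and φ clamped to the 28–45° range. Tabulating a few values makes the behaviour obvious:

```python
def peck_phi(n: float) -> float:
    """Peck-style SPT-N to friction angle correlation, as implemented above."""
    n = max(min(n, 50), 1)                      # clamp N to the correlation's valid range
    phi = 27.1 + 0.3 * n - 0.00054 * n ** 2
    return max(28.0, min(phi, 45.0))            # keep within reasonable sand limits

for n in (2, 5, 10, 20, 30, 50):
    print(f"N={n:>2} -> phi = {peck_phi(n):.1f} deg")
# N= 2 -> 28.0, N=10 -> 30.0, N=30 -> 35.6, N=50 -> 40.8
```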
725
+ def _convert_pressure_to_kpa(self, value: float, unit: str) -> float:
726
+ """
727
+ Enhanced pressure value conversion to kPa with comprehensive unit support
728
+ """
729
+ if not unit or unit.lower() in ['', 'none', 'null']:
730
+ return value # Assume already in kPa if no unit specified
731
+
732
+ # Normalize unit string for better matching
733
+ unit_clean = unit.lower().replace('/', '').replace(' ', '').replace('Β²', '2').replace('Β³', '3')
734
+
735
+ # Remove common punctuation and extra characters
736
+ unit_clean = unit_clean.replace('.', '').replace('-', '').replace('_', '')
737
+
738
+ # Handle specific variations that need special processing
739
+ special_cases = {
740
+ # Tonne/ton variations
741
+ 'tm2': 9.81, 'tonm2': 9.81, 'tonnesm2': 9.81, 'tonnem2': 9.81,
742
+ # kg/cmΒ² variations
743
+ 'kgcm2': 98.0, 'kgfcm2': 98.0,
744
+ # kN/mΒ² variations
745
+ 'knm2': 1.0,
746
+ # Other common variations
747
+ 'psig': 6.895, # psi gauge
748
+ 'psia': 6.895, # psi absolute
749
+ 'psfa': 0.04788, # psf absolute
750
+ 'torr': 0.133322, # torr (same as mmHg)
751
+ }
752
+
753
+ # Check special cases first
754
+ if unit_clean in special_cases:
755
+ conversion_factor = special_cases[unit_clean]
756
+ else:
757
+ # Standard conversion using enhanced dictionary
758
+ conversion_factor = self.unit_conversions.get(unit_clean, None)
759
+
760
+ # If no exact match found, try intelligent partial matching
761
+ if conversion_factor is None:
762
+ for known_unit, factor in self.unit_conversions.items():
763
+ # Try various normalization approaches
764
+ known_normalized = known_unit.replace('/', '').replace('Β²', '2').replace(' ', '')
765
+ if known_normalized == unit_clean:
766
+ conversion_factor = factor
767
+ break
768
+
769
+ # Check if the unit string contains a known unit (compound units); short keys such as 'm' or 'n' can false-match here, so the exact matches above take precedence
770
+ if known_unit != unit_clean and known_unit in unit_clean:
771
+ conversion_factor = factor
772
+ break
773
+
774
+ # Final fallback - assume kPa if still no match found
775
+ if conversion_factor is None:
776
+ st.warning(f"⚠️ Unknown pressure unit '{unit}'. Assuming kPa - please verify.")
777
+ conversion_factor = 1.0
778
+
779
+ converted_value = value * conversion_factor
780
+
781
+ # Enhanced logging with validation
782
+ if conversion_factor != 1.0:
783
+ st.success(f"πŸ”§ Unit conversion: {value} {unit} = {converted_value:.1f} kPa (Γ—{conversion_factor})")
784
+
785
+ # Add validation warnings for unusual results
786
+ if converted_value > 10000:
787
+ st.warning(f"⚠️ Very high pressure result ({converted_value:.0f} kPa). Please verify unit conversion.")
788
+ elif converted_value < 0.1 and value > 0:
789
+ st.warning(f"⚠️ Very low pressure result ({converted_value:.3f} kPa). Please verify unit conversion.")
790
+
791
+ return converted_value
792
+
793
+ def _convert_to_si_units(self, layer: Dict) -> Dict:
794
+ """
795
+ Convert all measurements to SI units
796
+ """
797
+ # Convert depths to meters
798
+ for depth_field in ['depth_from', 'depth_to']:
799
+ if depth_field in layer:
800
+ depth_val, depth_unit = self._extract_value_and_unit(
801
+ str(layer[depth_field]), default_unit='m'
802
+ )
803
+ layer[depth_field] = self._convert_length_to_meters(depth_val, depth_unit)
804
+
805
+ # Convert strength values to appropriate SI units
806
+ if 'strength_value' in layer and 'strength_parameter' in layer:
807
+ param = layer['strength_parameter'].lower()
808
+
809
+ if param == 'su':
810
+ # Convert Su to kPa
811
+ strength_val, strength_unit = self._extract_value_and_unit(
812
+ str(layer['strength_value']), default_unit='kpa'
813
+ )
814
+ layer['strength_value'] = self._convert_pressure_to_kpa(strength_val, strength_unit)
815
+ layer['strength_unit'] = 'kPa'
816
+
817
+ # Validate Su value against water content if available
818
+ validation_result = self._validate_su_with_water_content(layer)
819
+ if validation_result.get('needs_unit_check'):
820
+ st.warning(f"⚠️ Su-water content validation: {validation_result['message']}")
821
+ layer['unit_validation_warning'] = validation_result['message']
822
+ if validation_result['recommendations']:
823
+ st.info("πŸ’‘ Recommendations: " + "; ".join(validation_result['recommendations']))
824
+
825
+ elif param in ['Ο†', 'phi', 'friction_angle']:
826
+ # Friction angle should be in degrees (already SI)
827
+ layer['strength_unit'] = 'degrees'
828
+
829
+ elif param == 'spt-n':
830
+ # SPT-N is dimensionless
831
+ layer['strength_unit'] = 'blows/30cm'
832
+
833
+ return layer
834
+
835
+ def _extract_value_and_unit(self, value_str: str, default_unit: str = '') -> Tuple[float, str]:
836
+ """
837
+ Extract numeric value and unit from a string
838
+ """
839
+ # Remove extra spaces and convert to lowercase
840
+ clean_str = value_str.strip().lower()
841
+
842
+ # Pattern to match number followed by optional unit
843
+ pattern = r'(\d+(?:\.\d+)?)\s*([a-zA-Z/Β²]+)?'
844
+ match = re.search(pattern, clean_str)
845
+
846
+ if match:
847
+ value = float(match.group(1))
848
+ unit = match.group(2) if match.group(2) else default_unit
849
+ return value, unit
850
+
851
+ try:
852
+ return float(clean_str), default_unit
853
+ except ValueError:
854
+ return 0.0, default_unit
855
+
856
+ def _convert_length_to_meters(self, value: float, unit: str) -> float:
857
+ """
858
+ Convert length value to meters
859
+ """
860
+ unit_clean = unit.lower().replace(' ', '')
861
+ conversion_factor = self.unit_conversions.get(unit_clean, 1.0)
862
+ return value * conversion_factor
863
+
864
+ def _detect_t_m2_unit_error(self, layer: Dict) -> Dict:
865
+ """
866
+ Detect if LLM failed to convert t/mΒ² units to kPa
867
+ This is the most common unit conversion error
868
+ """
869
+ result = {"needs_conversion": False, "critical_error": False}
870
+
871
+ # Only check layers with Su values
872
+ if layer.get("strength_parameter") != "Su" or not layer.get("strength_value"):
873
+ return result
874
+
875
+ su = float(layer["strength_value"])
876
+ wc = layer.get("water_content", 0)
877
+ description = layer.get("description", "")
878
+
879
+ # Critical detection: Su values that are likely t/mΒ² but not converted
880
+ # Typical t/mΒ² values are 1-8, typical kPa values are 10-400 for clay
881
+
882
+ # Pattern 1: Su 1-8 with reasonable water content (15-50%)
883
+ if 1.0 <= su <= 8.0 and 15 <= wc <= 50:
884
+ converted_su = su * 9.81
885
+ result.update({
886
+ "needs_conversion": True,
887
+ "critical_error": True,
888
+ "original_su": su,
889
+ "converted_su": converted_su,
890
+ "unit_error": "t/mΒ²",
891
+ "message": f"⚠️ CRITICAL: Su={su:.2f} appears to be in t/m² units, should be {converted_su:.1f} kPa",
892
+ "correction": f"{su:.2f} t/mΒ² Γ— 9.81 = {converted_su:.1f} kPa"
893
+ })
894
+
895
+ # Pattern 2: Very low Su (<5) with low water content - could be t/mΒ²
896
+ elif su < 5.0 and wc > 0 and wc < 25:
897
+ converted_su = su * 9.81
898
+ result.update({
899
+ "needs_conversion": True,
900
+ "critical_error": True,
901
+ "original_su": su,
902
+ "converted_su": converted_su,
903
+ "unit_error": "t/mΒ²",
904
+ "message": f"⚠️ POSSIBLE: Su={su:.2f} might be in t/m² units, check if should be {converted_su:.1f} kPa",
905
+ "correction": f"{su:.2f} t/mΒ² Γ— 9.81 = {converted_su:.1f} kPa"
906
+ })
907
+
908
+ # Pattern 3: Check description for t/mΒ² mentions
909
+ if any(unit in description.lower() for unit in ['t/mΒ²', 't/m2', 'ton/mΒ²', 'ton/m2', 'tonnes/mΒ²']):
910
+ if su < 10: # If description mentions t/mΒ² but Su is low, likely not converted
911
+ converted_su = su * 9.81
912
+ result.update({
913
+ "needs_conversion": True,
914
+ "critical_error": True,
915
+ "original_su": su,
916
+ "converted_su": converted_su,
917
+ "unit_error": "t/mΒ² (found in description)",
918
+ "message": f"⚠️ CRITICAL: Description mentions t/m² but Su={su:.2f} appears unconverted, should be {converted_su:.1f} kPa",
919
+ "correction": f"{su:.2f} t/mΒ² Γ— 9.81 = {converted_su:.1f} kPa"
920
+ })
921
+
922
+ return result
923
+
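The heuristic above flags Su values that are numerically plausible in t/m² but implausibly low in kPa for the reported water content. A reduced sketch of the core check:

```python
def flag_tm2(su_kpa: float, water_content: float) -> str:
    """Flag a clay Su that was probably reported in t/m2 and never converted."""
    if 1.0 <= su_kpa <= 8.0 and 15 <= water_content <= 50:
        return f"suspect t/m2: {su_kpa:.2f} t/m2 x 9.81 = {su_kpa * 9.81:.1f} kPa"
    return "looks consistent"

print(flag_tm2(3.5, 32))   # suspect t/m2: 3.50 t/m2 x 9.81 = 34.3 kPa
print(flag_tm2(85.0, 28))  # looks consistent
```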
924
+ def _validate_su_with_water_content(self, layer: Dict) -> Dict:
925
+ """
926
+ ENHANCED Su-water content validation with comprehensive unit checking
927
+
928
+ Standard correlations for clay (empirical relationships):
929
+ - Very soft clay: Su < 25 kPa, w% > 40%
930
+ - Soft clay: Su 25-50 kPa, w% 30-40%
931
+ - Medium clay: Su 50-100 kPa, w% 20-30%
932
+ - Stiff clay: Su 100-200 kPa, w% 15-25%
933
+ - Very stiff clay: Su 200-400 kPa, w% 10-20%
934
+ - Hard clay: Su > 400 kPa, w% < 15%
935
+
936
+ Key unit conversions to check:
937
+ - t/mΒ² β†’ kPa: Γ—9.81 (CRITICAL)
938
+ - ksc β†’ kPa: Γ—98.0
939
+ - psi β†’ kPa: Γ—6.895
940
+ - MPa β†’ kPa: Γ—1000
941
+ """
942
+ validation_result = {
943
+ 'valid': True,
944
+ 'needs_unit_check': False,
945
+ 'critical_unit_error': False,
946
+ 'suggested_conversion': None,
947
+ 'message': '',
948
+ 'recommendations': [],
949
+ 'recheck_image': False
950
+ }
951
+
952
+ su_value = layer.get('strength_value')
953
+ water_content = layer.get('water_content')
954
+ soil_type = layer.get('soil_type', '')
955
+ description = layer.get('description', '')
956
+
957
+ # Only validate for clay layers with both Su and water content
958
+ if soil_type != 'clay' or not su_value or not water_content:
959
+ return validation_result
960
+
961
+ try:
962
+ su = float(su_value)
963
+ wc = float(water_content)
964
+
965
+ # STEP 1: Check for t/mΒ² unit errors first (most common issue)
966
+ t_m2_check = self._detect_t_m2_unit_error(layer)
967
+ if t_m2_check.get('critical_error'):
968
+ validation_result.update({
969
+ 'critical_unit_error': True,
970
+ 'needs_conversion': True,
971
+ 'original_value': t_m2_check['original_su'],
972
+ 'suggested_value': t_m2_check['converted_su'],
973
+ 'unit_error_type': t_m2_check['unit_error'],
974
+ 'suggested_conversion': t_m2_check['correction'],
975
+ 'message': t_m2_check['message'],
976
+ 'recheck_image': True,
977
+ 'reload_picture': True
978
+ })
979
+ return validation_result
980
+
981
+ # STEP 2: Check for other unit conversion errors
982
+ unit_check_results = self._check_su_unit_conversions(su, wc, description)
983
+ if unit_check_results['needs_conversion']:
984
+ validation_result.update(unit_check_results)
985
+ validation_result['critical_unit_error'] = True
986
+ validation_result['recheck_image'] = True
987
+ return validation_result
988
+
989
+ # STEP 3: Detailed correlation analysis
990
+ inconsistencies = []
991
+ correlation_score = self._calculate_correlation_score(su, wc)
992
+
993
+ # Very specific clay consistency checks
994
+ if su < 25 and wc < 30:
995
+ inconsistencies.append(f"Very soft clay (Su={su:.0f}kPa) typically has w%>30%, found {wc:.1f}%")
996
+ if wc < 20:
997
+ validation_result['recheck_image'] = True
998
+ inconsistencies.append("VERIFY: Water content seems too low for very soft clay")
999
+
1000
+ if su > 400 and wc > 30:
1001
+ inconsistencies.append(f"Hard clay (Su={su:.0f}kPa) typically has w%<20%, found {wc:.1f}%")
1002
+ validation_result['recheck_image'] = True
1003
+ inconsistencies.append("VERIFY: Water content seems too high for hard clay")
1004
+
1005
+ # Medium-range mismatches
1006
+ if 50 <= su <= 200 and (wc > 45 or wc < 10):
1007
+ inconsistencies.append(f"Medium-stiff clay (Su={su:.0f}kPa) with unusual w%={wc:.1f}%")
1008
+ validation_result['recheck_image'] = True
1009
+
1010
+ # STEP 4: Empirical correlation bounds (Terzaghi-Peck relationships)
1011
+ expected_su_range = self._get_expected_su_range(wc)
1012
+ if su < expected_su_range['min'] * 0.2 or su > expected_su_range['max'] * 5:
1013
+ validation_result['needs_unit_check'] = True
1014
+ validation_result['recheck_image'] = True
1015
+ inconsistencies.append(f"Su-w% correlation severely off: Expected {expected_su_range['min']:.0f}-{expected_su_range['max']:.0f}kPa for w%={wc:.1f}%, got {su:.0f}kPa")
1016
+
1017
+ # STEP 5: Finalize results
1018
+ if inconsistencies:
1019
+ validation_result['valid'] = False
1020
+ validation_result['message'] = '; '.join(inconsistencies)
1021
+
1022
+ # Enhanced recommendations
1023
+ if validation_result['needs_unit_check']:
1024
+ validation_result['recommendations'].extend([
1025
+ "⚠️ CRITICAL: Check Su unit conversion carefully",
1026
+ "t/mΒ² β†’ kPa: multiply by 9.81",
1027
+ "ksc β†’ kPa: multiply by 98.0",
1028
+ "psi β†’ kPa: multiply by 6.895",
1029
+ "MPa β†’ kPa: multiply by 1000",
1030
+ "πŸ” Re-examine the original image/document"
1031
+ ])
1032
+
1033
+ if validation_result['recheck_image']:
1034
+ validation_result['recommendations'].extend([
1035
+ "πŸ“· RECHECK IMAGE: Values seem inconsistent",
1036
+ "πŸ”„ Consider reloading the image",
1037
+ "πŸ“‹ Verify both Su and water content readings"
1038
+ ])
1039
+ else:
1040
+ validation_result['message'] = f"Su-water content correlation acceptable (score: {correlation_score:.1f})"
1041
+
1042
+ except (ValueError, TypeError) as e:
1043
+ validation_result['valid'] = False
1044
+ validation_result['message'] = f"Could not validate Su-water content: {str(e)}"
1045
+ validation_result['recheck_image'] = True
1046
+
1047
+ return validation_result
1048
+
1049
+ def _check_su_unit_conversions(self, su: float, wc: float, description: str) -> Dict:
1050
+ """Check for specific unit conversion errors"""
1051
+ result = {
1052
+ 'needs_conversion': False,
1053
+ 'suggested_conversion': None,
1054
+ 'critical_unit_error': False,
1055
+ 'message': ''
1056
+ }
1057
+
1058
+ # Check for t/mΒ² that wasn't converted (very common error)
1059
+ if 2 <= su <= 10 and 15 <= wc <= 40:
1060
+ suggested_su = su * 9.81
1061
+ result.update({
1062
+ 'needs_conversion': True,
1063
+ 'suggested_conversion': f"{su} t/mΒ² β†’ {suggested_su:.1f} kPa (Γ—9.81)",
1064
+ 'critical_unit_error': True,
1065
+ 'message': f"CRITICAL: Su={su:.1f} appears to be in t/mΒ² (should be {suggested_su:.1f} kPa)"
1066
+ })
1067
+ return result
1068
+
1069
+ # Check for ksc that wasn't converted
1070
+ if 0.5 <= su <= 5 and 15 <= wc <= 50:
1071
+ suggested_su = su * 98.0
1072
+ result.update({
1073
+ 'needs_conversion': True,
1074
+ 'suggested_conversion': f"{su} ksc β†’ {suggested_su:.1f} kPa (Γ—98)",
1075
+ 'critical_unit_error': True,
1076
+ 'message': f"CRITICAL: Su={su:.1f} appears to be in ksc (should be {suggested_su:.1f} kPa)"
1077
+ })
1078
+ return result
1079
+
1080
+ # Check for psi that wasn't converted (high values)
1081
+ if 50 <= su <= 500 and 10 <= wc <= 35:
1082
+ suggested_su = su * 6.895
1083
+ result.update({
1084
+ 'needs_conversion': True,
1085
+ 'suggested_conversion': f"{su} psi β†’ {suggested_su:.1f} kPa (Γ—6.895)",
1086
+ 'critical_unit_error': True,
1087
+ 'message': f"CRITICAL: Su={su:.0f} appears to be in psi (should be {suggested_su:.1f} kPa)"
1088
+ })
1089
+ return result
1090
+
1091
+ # Check for MPa that wasn't converted (very low values)
1092
+ if 0.01 <= su <= 0.5 and 10 <= wc <= 40:
1093
+ suggested_su = su * 1000
1094
+ result.update({
1095
+ 'needs_conversion': True,
1096
+ 'suggested_conversion': f"{su} MPa β†’ {suggested_su:.1f} kPa (Γ—1000)",
1097
+ 'critical_unit_error': True,
1098
+ 'message': f"CRITICAL: Su={su:.2f} appears to be in MPa (should be {suggested_su:.1f} kPa)"
1099
+ })
1100
+ return result
1101
+
1102
+ return result
1103
+
1104
+ def _get_expected_su_range(self, water_content: float) -> Dict[str, float]:
1105
+ """Get expected Su range based on water content (empirical correlations)"""
1106
+ wc = water_content
1107
+
1108
+ # Conservative empirical relationships
1109
+ if wc >= 50:
1110
+ return {'min': 5, 'max': 20} # Very soft clay
1111
+ elif wc >= 40:
1112
+ return {'min': 10, 'max': 35} # Soft clay
1113
+ elif wc >= 30:
1114
+ return {'min': 20, 'max': 60} # Medium clay
1115
+ elif wc >= 20:
1116
+ return {'min': 40, 'max': 150} # Stiff clay
1117
+ elif wc >= 15:
1118
+ return {'min': 80, 'max': 250} # Very stiff clay
1119
+ else:
1120
+ return {'min': 150, 'max': 500} # Hard clay
1121
+
1122
+ def _calculate_correlation_score(self, su: float, wc: float) -> float:
1123
+ """Calculate correlation score (0-10, higher is better)"""
1124
+ # Simple scoring based on typical relationships
1125
+ expected_range = self._get_expected_su_range(wc)
1126
+
1127
+ if expected_range['min'] <= su <= expected_range['max']:
1128
+ return 10.0 # Perfect correlation
1129
+ elif expected_range['min'] * 0.5 <= su <= expected_range['max'] * 2:
1130
+ return 7.0 # Good correlation
1131
+ elif expected_range['min'] * 0.2 <= su <= expected_range['max'] * 5:
1132
+ return 4.0 # Acceptable correlation
1133
+ else:
1134
+ return 1.0 # Poor correlation
1135
+
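Together, `_get_expected_su_range` and `_calculate_correlation_score` grade how well Su matches the water content; a standalone sketch of the same banding and scoring:

```python
def expected_su(wc: float):
    # Same water-content bands as above (Su range in kPa)
    bands = [(50, (5, 20)), (40, (10, 35)), (30, (20, 60)),
             (20, (40, 150)), (15, (80, 250)), (0, (150, 500))]
    return next(rng for lo, rng in bands if wc >= lo)

def score(su: float, wc: float) -> float:
    lo, hi = expected_su(wc)
    if lo <= su <= hi:            return 10.0  # perfect correlation
    if lo * 0.5 <= su <= hi * 2:  return 7.0   # good
    if lo * 0.2 <= su <= hi * 5:  return 4.0   # acceptable
    return 1.0                                  # poor - likely a unit error

print(expected_su(35), score(45, 35))   # (20, 60) 10.0
print(expected_su(22), score(3.5, 22))  # (40, 150) 1.0 -> suspect unconverted t/m2
```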
+     def _add_engineering_parameters(self, layer: Dict) -> Dict:
+         """
+         Add additional engineering parameters based on soil properties
+         """
+         soil_type = layer.get('soil_type', '')
+
+         # Add typical engineering properties based on soil type and strength
+         if soil_type == 'clay':
+             su_value = layer.get('strength_value', 0)
+             if su_value > 0:
+                 # Estimate consistency based on Su
+                 if su_value < 25:
+                     layer['consistency'] = 'very soft'
+                 elif su_value < 50:
+                     layer['consistency'] = 'soft'
+                 elif su_value < 100:
+                     layer['consistency'] = 'medium'
+                 elif su_value < 200:
+                     layer['consistency'] = 'stiff'
+                 elif su_value < 400:
+                     layer['consistency'] = 'very stiff'
+                 else:
+                     layer['consistency'] = 'hard'
+
+                 # Estimate unit weight (kN/m³)
+                 layer['unit_weight'] = 16 + su_value / 50  # Empirical correlation
+                 layer['unit_weight_unit'] = 'kN/m³'
+
+         elif soil_type in ['sand', 'silt']:
+             # For sand/silt, use SPT-N or friction angle
+             if 'original_spt' in layer:
+                 spt_n = layer['original_spt']
+                 # Estimate relative density based on SPT-N
+                 if spt_n < 4:
+                     layer['consistency'] = 'very loose'
+                 elif spt_n < 10:
+                     layer['consistency'] = 'loose'
+                 elif spt_n < 30:
+                     layer['consistency'] = 'medium dense'
+                 elif spt_n < 50:
+                     layer['consistency'] = 'dense'
+                 else:
+                     layer['consistency'] = 'very dense'
+
+                 # Estimate unit weight (kN/m³)
+                 layer['unit_weight'] = 14 + spt_n / 5  # Empirical correlation
+                 layer['unit_weight_unit'] = 'kN/m³'
+
+         return layer
+
+     def _check_clay_consistency(self, layer: Dict) -> Dict:
+         """
+         Check consistency between water content and Su for clay soils
+         """
+         soil_type = layer.get('soil_type', '')
+         if soil_type != 'clay':
+             return layer
+
+         su_value = layer.get('strength_value')
+         water_content = self._extract_water_content(layer)
+
+         if su_value and water_content:
+             # Perform consistency check
+             consistency_result = self._validate_clay_water_content_su_relationship(
+                 water_content, su_value
+             )
+
+             layer['water_content'] = water_content
+             layer['water_content_unit'] = '%'
+             layer['clay_consistency_check'] = consistency_result
+
+             # Add consistency notes
+             if consistency_result['is_consistent']:
+                 layer['consistency_note'] = f"✅ Water content ({water_content}%) consistent with Su ({su_value} kPa)"
+             else:
+                 layer['consistency_note'] = f"⚠️ {consistency_result['warning']}"
+
+         return layer
+
+     def _extract_water_content(self, layer: Dict) -> Optional[float]:
+         """
+         Extract water content from layer data
+         """
+         # Check if water content is directly specified
+         if 'water_content' in layer:
+             return float(layer['water_content'])
+
+         # Look in description for water content values
+         description = layer.get('description', '')
+
+         patterns = [
+             r'w[:\s=]*(\d+(?:\.\d+)?)\s*%',
+             r'water\s*content[:\s]*(\d+(?:\.\d+)?)\s*%',
+             r'moisture\s*content[:\s]*(\d+(?:\.\d+)?)\s*%',
+             r'wc[:\s=]*(\d+(?:\.\d+)?)\s*%',
+             r'(\d+(?:\.\d+)?)\s*%\s*moisture',
+             r'(\d+(?:\.\d+)?)\s*%\s*water'
+         ]
+
+         for pattern in patterns:
+             match = re.search(pattern, description, re.IGNORECASE)
+             if match:
+                 return float(match.group(1))
+
+         return None
+
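A minimal demonstration of the first regex pattern above against a typical boring-log description (the description string is made up for illustration):

```python
import re

desc = "Soft gray CLAY, trace silt, w = 42.5 %, Su = 18 kPa"
m = re.search(r'w[:\s=]*(\d+(?:\.\d+)?)\s*%', desc, re.IGNORECASE)
print(float(m.group(1)))   # 42.5 - the captured water content in percent
```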
+     def _validate_clay_water_content_su_relationship(self, water_content: float, su_value: float) -> Dict:
+         """
+         Validate the relationship between water content and undrained shear strength for clay
+
+         Enhanced analysis for ST layer soil division based on water content and unconfined test results:
+         - Higher water content generally corresponds to lower Su
+         - Different clay types have different relationships
+         - Consider stress history and plasticity effects
+         """
+
+         # Enhanced empirical relationships for clay consistency with expanded ranges
+         consistency_ranges = {
+             'very_soft': {'w_range': (40, 150), 'su_range': (0, 25), 'description': 'High plasticity, organic clays'},
+             'soft': {'w_range': (25, 70), 'su_range': (25, 50), 'description': 'Normally consolidated clays'},
+             'medium': {'w_range': (18, 40), 'su_range': (50, 100), 'description': 'Lightly overconsolidated clays'},
+             'stiff': {'w_range': (12, 28), 'su_range': (100, 200), 'description': 'Overconsolidated clays'},
+             'very_stiff': {'w_range': (8, 20), 'su_range': (200, 400), 'description': 'Heavily overconsolidated clays'},
+             'hard': {'w_range': (5, 15), 'su_range': (400, 1000), 'description': 'Desiccated or cemented clays'}
+         }
+
+         # Determine expected consistency based on Su
+         su_consistency = None
+         for consistency, ranges in consistency_ranges.items():
+             if ranges['su_range'][0] <= su_value <= ranges['su_range'][1]:
+                 su_consistency = consistency
+                 break
+
+         # Determine expected consistency based on water content
+         w_consistency = None
+         for consistency, ranges in consistency_ranges.items():
+             if ranges['w_range'][0] <= water_content <= ranges['w_range'][1]:
+                 w_consistency = consistency
+                 break
+
+         # Check consistency
+         result = {
+             'water_content': water_content,
+             'su_value': su_value,
+             'w_consistency': w_consistency,
+             'su_consistency': su_consistency,
+             'is_consistent': False,
+             'warning': '',
+             'note': ''
+         }
+
+         if su_consistency and w_consistency:
+             if su_consistency == w_consistency:
+                 result['is_consistent'] = True
+                 result['note'] = f"Water content and Su both indicate {su_consistency.replace('_', ' ')} clay"
+             else:
+                 result['warning'] = f"Inconsistent: Water content suggests {w_consistency.replace('_', ' ')} clay, but Su suggests {su_consistency.replace('_', ' ')} clay"
+         elif su_consistency and not w_consistency:
+             if water_content > 60:
+                 result['warning'] = f"Very high water content ({water_content}%) for Su = {su_value} kPa. Check if clay is highly plastic or organic."
+             elif water_content < 10:
+                 result['warning'] = f"Very low water content ({water_content}%) for clay. Check if sample was dried or is highly over-consolidated."
+             else:
+                 result['note'] = f"Water content outside typical ranges but Su indicates {su_consistency.replace('_', ' ')} clay"
+         elif w_consistency and not su_consistency:
+             result['warning'] = f"Su value ({su_value} kPa) outside typical ranges for clay with {water_content}% water content"
+         else:
+             result['warning'] = f"Both water content ({water_content}%) and Su ({su_value} kPa) outside typical clay ranges"
+
+         # Enhanced empirical correlation checks for ST layer division
+         if water_content and su_value:
+             # Advanced correlation analysis for ST samples
+
+             # Check for high plasticity clay indicators
+             if water_content > 80:
+                 if su_value < 25:
+                     result['note'] = f"High plasticity clay indicated: w={water_content}%, Su={su_value} kPa. Possible CH or organic clay."
+                 elif su_value > 50:
+                     result['warning'] = f"Inconsistent: Very high water content ({water_content}%) with moderate/high Su ({su_value} kPa). Check sample integrity or clay type."
+
+             # Check for low plasticity clay indicators
+             elif water_content < 15:
+                 if su_value > 200:
+                     result['note'] = f"Low plasticity, overconsolidated clay: w={water_content}%, Su={su_value} kPa. Possible CL or aged clay."
+                 elif su_value < 100:
+                     result['warning'] = f"Low water content ({water_content}%) with low Su ({su_value} kPa). Unusual - check if sample was dried."
+
+             # Check stress history indicators
+             ocr_estimate = self._estimate_overconsolidation_ratio(water_content, su_value)
+             if ocr_estimate > 1.5:
+                 result['note'] = result.get('note', '') + f" Estimated OCR ≈ {ocr_estimate:.1f} (overconsolidated)"
+             elif ocr_estimate < 0.8:
+                 result['note'] = result.get('note', '') + f" Estimated OCR ≈ {ocr_estimate:.1f} (possibly underconsolidated)"
+
+             # Soil division recommendations for ST samples
+             result['st_division_recommendation'] = self._recommend_st_layer_division(water_content, su_value)
+
+         return result
+
+     def _estimate_overconsolidation_ratio(self, water_content: float, su_value: float) -> float:
+         """
+         Estimate overconsolidation ratio (OCR) from water content and Su
+         Based on empirical correlations for ST samples
+         """
+         # Simplified correlation: OCR ≈ (Su_measured / Su_normally_consolidated)
+         # For normally consolidated clays: Su ≈ 0.22 * σ'v
+         # Approximate σ'v from water content using typical correlations
+
+         if water_content > 50:
+             # High water content suggests normally consolidated or slightly overconsolidated
+             expected_su_nc = max(15, 100 - water_content)  # Simplified correlation
+         else:
+             # Lower water content suggests overconsolidation
+             expected_su_nc = max(50, 150 - 2 * water_content)
+
+         ocr_estimate = su_value / expected_su_nc if expected_su_nc > 0 else 1.0
+         return max(0.5, min(ocr_estimate, 10.0))  # Reasonable bounds
+
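Tracing the arithmetic above with sample numbers (values chosen only to illustrate the branches):

```python
# Worked example of the simplified OCR estimate:
wc, su = 35.0, 120.0                       # water content %, Su in kPa
expected_su_nc = max(50, 150 - 2 * wc)     # wc <= 50 branch: max(50, 80) = 80 kPa
ocr = su / expected_su_nc                  # 120 / 80 = 1.5
# 1.5 > 1.5 is False here, but anything above that threshold is flagged overconsolidated
```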
+     def _recommend_st_layer_division(self, water_content: float, su_value: float) -> Dict:
+         """
+         Recommend layer division strategy for ST samples based on water content and Su results
+         """
+         recommendation = {
+             'division_strategy': 'single_layer',
+             'reason': 'Uniform properties',
+             'subdivision_criteria': []
+         }
+
+         # Check for significant property variations that suggest subdivision
+         if water_content > 60 and su_value > 75:
+             recommendation['division_strategy'] = 'check_variation'
+             recommendation['reason'] = 'Conflicting water content and strength - check for property variations'
+             recommendation['subdivision_criteria'].append('Water content variation > 10%')
+             recommendation['subdivision_criteria'].append('Su variation > 30%')
+
+         elif water_content < 20 and su_value < 80:
+             recommendation['division_strategy'] = 'check_variation'
+             recommendation['reason'] = 'Both low water content and Su - check for soil type variations'
+             recommendation['subdivision_criteria'].append('Plasticity index variations')
+             recommendation['subdivision_criteria'].append('Sieve analysis variations')
+
+         elif abs(water_content - 30) > 20 or su_value > 300:
+             recommendation['division_strategy'] = 'subdivide_recommended'
+             recommendation['reason'] = 'Extreme properties suggest heterogeneous layer'
+             recommendation['subdivision_criteria'].append('Test at multiple depths')
+             recommendation['subdivision_criteria'].append('Check for interbedded materials')
+
+         return recommendation
+
+     def get_processing_summary(self, layers: List[Dict]) -> Dict[str, Any]:
+         """
+         Generate a summary of the soil layer processing
+         """
+         summary = {
+             'total_layers': len(layers),
+             'st_samples': 0,
+             'ss_samples': 0,
+             'clay_layers': 0,
+             'sand_layers': 0,
+             'su_calculated': 0,
+             'phi_calculated': 0,
+             'clay_consistency_checks': 0,
+             'consistent_clays': 0,
+             'inconsistent_clays': 0,
+             'unit_conversions': [],
+             'processing_notes': []
+         }
+
+         for layer in layers:
+             # Count sample types
+             sample_type = layer.get('sample_type', '')
+             if sample_type == 'ST':
+                 summary['st_samples'] += 1
+             elif sample_type == 'SS':
+                 summary['ss_samples'] += 1
+
+             # Count soil types
+             soil_type = layer.get('soil_type', '')
+             if soil_type == 'clay':
+                 summary['clay_layers'] += 1
+             elif soil_type in ['sand', 'silt']:
+                 summary['sand_layers'] += 1
+
+             # Count calculated parameters
+             if 'su_source' in layer and 'Calculated' in layer['su_source']:
+                 summary['su_calculated'] += 1
+             if 'phi_source' in layer and 'Calculated' in layer['phi_source']:
+                 summary['phi_calculated'] += 1
+
+             # Count clay consistency checks
+             if 'clay_consistency_check' in layer:
+                 summary['clay_consistency_checks'] += 1
+                 consistency_result = layer['clay_consistency_check']
+                 if consistency_result.get('is_consistent', False):
+                     summary['consistent_clays'] += 1
+                 else:
+                     summary['inconsistent_clays'] += 1
+
+         return summary
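A hypothetical usage sketch (the `processor` and `layers` names are assumptions; `layers` would come from the extraction step upstream):

```python
# Sketch: summarizing processed layers and printing the headline counts.
summary = processor.get_processing_summary(layers)
print(f"{summary['total_layers']} layers: "
      f"{summary['clay_layers']} clay, {summary['sand_layers']} sand/silt, "
      f"{summary['st_samples']} ST / {summary['ss_samples']} SS samples")
```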
soil_visualizer.py ADDED
@@ -0,0 +1,285 @@
+ import matplotlib.pyplot as plt
+ import plotly.graph_objects as go
+ import plotly.express as px
+ import pandas as pd
+ import numpy as np
+ import streamlit as st
+ from config import SOIL_TYPES, STRENGTH_PARAMETERS
+
+ class SoilProfileVisualizer:
+     def __init__(self):
+         self.soil_colors = {
+             "soft clay": "#8B4513",
+             "medium clay": "#A0522D",
+             "stiff clay": "#D2691E",
+             "very stiff clay": "#CD853F",
+             "hard clay": "#DEB887",
+             "loose sand": "#F4A460",
+             "medium dense sand": "#DAA520",
+             "dense sand": "#B8860B",
+             "very dense sand": "#CD853F",
+             "soft silt": "#DDA0DD",
+             "medium silt": "#BA55D3",
+             "stiff silt": "#9370DB",
+             "loose gravel": "#696969",
+             "dense gravel": "#2F4F4F",
+             "weathered rock": "#708090",
+             "soft rock": "#2F4F4F",
+             "hard rock": "#36454F"
+         }
+
+     def create_soil_profile_plot(self, soil_data):
+         """Create interactive soil profile visualization"""
+         if not soil_data or "soil_layers" not in soil_data:
+             return None
+
+         layers = soil_data["soil_layers"]
+
+         fig = go.Figure()
+
+         # Add soil layers
+         for i, layer in enumerate(layers):
+             depth_from = layer.get("depth_from", 0)
+             depth_to = layer.get("depth_to", 0)
+             soil_type = layer.get("soil_type", "unknown")
+             description = layer.get("description", "")
+             strength_value = layer.get("strength_value", "N/A")
+             strength_param = layer.get("strength_parameter", "")
+
+             # Get color
+             color = self.soil_colors.get(soil_type.lower(), "#CCCCCC")
+
+             # Create layer rectangle
+             fig.add_shape(
+                 type="rect",
+                 x0=0, x1=1,
+                 y0=-depth_to, y1=-depth_from,
+                 fillcolor=color,
+                 line=dict(color="black", width=1),
+                 opacity=0.8
+             )
+
+             # Add layer text with enhanced parameters
+             mid_depth = -(depth_from + depth_to) / 2
+
+             # Build text with available parameters
+             text_lines = [f"{layer.get('consistency', '')} {soil_type}".strip()]
+
+             # Add strength parameters
+             if strength_param and strength_value is not None:
+                 text_lines.append(f"{strength_param}: {strength_value}")
+
+             # Add calculated Su if available
+             if layer.get("calculated_su"):
+                 text_lines.append(f"Su: {layer['calculated_su']:.0f} kPa*")
+
+             # Add friction angle if available
+             if layer.get("friction_angle"):
+                 text_lines.append(f"φ: {layer['friction_angle']:.1f}°*")
+
+             fig.add_annotation(
+                 x=0.5, y=mid_depth,
+                 text="<br>".join(text_lines),
+                 showarrow=False,
+                 font=dict(size=9, color="white"),
+                 bgcolor="rgba(0,0,0,0.6)",
+                 bordercolor="white",
+                 borderwidth=1
+             )
+
+         # Add depth markers
+         max_depth = max([layer.get("depth_to", 0) for layer in layers])
+         depth_ticks = list(range(0, int(max_depth) + 5, 5))
+
+         fig.update_layout(
+             title="Soil Profile",
+             xaxis=dict(
+                 range=[0, 1],
+                 showticklabels=False,
+                 showgrid=False,
+                 zeroline=False
+             ),
+             yaxis=dict(
+                 title="Depth (m)",
+                 range=[-max_depth - 2, 2],
+                 tickvals=[-d for d in depth_ticks],
+                 ticktext=[str(d) for d in depth_ticks],
+                 showgrid=True,
+                 gridcolor="lightgray"
+             ),
+             width=400,
+             height=600,
+             margin=dict(l=50, r=50, t=50, b=50)
+         )
+
+         # Add water table if present
+         if "water_table" in soil_data and soil_data["water_table"].get("depth"):
+             wt_depth = soil_data["water_table"]["depth"]
+             fig.add_hline(
+                 y=-wt_depth,
+                 line_dash="dash",
+                 line_color="blue",
+                 annotation_text="Water Table",
+                 annotation_position="right"
+             )
+
+         return fig
+
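A minimal usage sketch for rendering the figure in a Streamlit page (the `soil_data` variable is an assumption; it would come from the analysis workflow):

```python
import streamlit as st

viz = SoilProfileVisualizer()
fig = viz.create_soil_profile_plot(soil_data)
if fig is not None:                                # None means no "soil_layers" key
    st.plotly_chart(fig, use_container_width=True)
```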
+     def create_strength_profile_plot(self, soil_data):
+         """Create strength parameter vs depth plot"""
+         if not soil_data or "soil_layers" not in soil_data:
+             return None
+
+         layers = soil_data["soil_layers"]
+
+         depths = []
+         strengths = []
+         soil_types = []
+
+         for layer in layers:
+             depth_from = layer.get("depth_from", 0)
+             depth_to = layer.get("depth_to", 0)
+             strength_value = layer.get("strength_value")
+             soil_type = layer.get("soil_type", "")
+
+             if strength_value is not None:
+                 mid_depth = (depth_from + depth_to) / 2
+                 depths.append(mid_depth)
+                 strengths.append(strength_value)
+                 soil_types.append(soil_type)
+
+         if not depths:
+             return None
+
+         fig = go.Figure()
+
+         # Group by parameter type
+         clay_depths = []
+         clay_strengths = []
+         sand_depths = []
+         sand_strengths = []
+
+         for i, soil_type in enumerate(soil_types):
+             if "clay" in soil_type.lower():
+                 clay_depths.append(depths[i])
+                 clay_strengths.append(strengths[i])
+             else:
+                 sand_depths.append(depths[i])
+                 sand_strengths.append(strengths[i])
+
+         # Add traces
+         if clay_depths:
+             # Create custom hover text for Su values
+             clay_hover_text = [f"Depth: {d:.1f}m<br>Su: {s:.1f} kPa" for d, s in zip(clay_depths, clay_strengths)]
+
+             fig.add_trace(go.Scatter(
+                 x=clay_strengths,
+                 y=clay_depths,
+                 mode='markers+lines',
+                 name='Su (kPa)',
+                 marker=dict(color='brown', size=8),
+                 line=dict(color='brown'),
+                 hovertemplate='%{customdata}<extra></extra>',
+                 customdata=clay_hover_text
+             ))
+
+         if sand_depths:
+             # Create custom hover text for SPT-N values
+             sand_hover_text = [f"Depth: {d:.1f}m<br>SPT-N: {s:.0f} blows/30cm" for d, s in zip(sand_depths, sand_strengths)]
+
+             fig.add_trace(go.Scatter(
+                 x=sand_strengths,
+                 y=sand_depths,
+                 mode='markers+lines',
+                 name='SPT-N (blows/30cm)',
+                 marker=dict(color='gold', size=8),
+                 line=dict(color='gold'),
+                 hovertemplate='%{customdata}<extra></extra>',
+                 customdata=sand_hover_text
+             ))
+
+         # Determine primary axis title based on data
+         if clay_depths and sand_depths:
+             xaxis_title = "Strength Value (Su in kPa / SPT-N)"
+         elif clay_depths:
+             xaxis_title = "Undrained Shear Strength, Su (kPa)"
+         elif sand_depths:
+             xaxis_title = "SPT-N Value (blows/30cm)"
+         else:
+             xaxis_title = "Strength Value"
+
+         fig.update_layout(
+             title="Strength Parameters vs Depth",
+             xaxis_title=xaxis_title,
+             yaxis_title="Depth (m)",
+             yaxis=dict(autorange='reversed'),
+             width=500,
+             height=600,
+             showlegend=True,
+             legend=dict(
+                 yanchor="top",
+                 y=0.99,
+                 xanchor="left",
+                 x=0.01
+             )
+         )
+
+         return fig
+
+     def create_layer_summary_table(self, soil_data):
+         """Create summary table of soil layers"""
+         if not soil_data or "soil_layers" not in soil_data:
+             return None
+
+         layers = soil_data["soil_layers"]
+
+         df_data = []
+         for layer in layers:
+             # Build strength info with units
+             strength_info = ""
+             if layer.get("strength_parameter") and layer.get("strength_value") is not None:
+                 param = layer['strength_parameter']
+                 value = layer['strength_value']
+
+                 # Add units based on parameter type
+                 if param == "Su":
+                     strength_info = f"Su: {value:.1f} kPa"
+                 elif param == "SPT-N":
+                     strength_info = f"SPT-N: {value:.0f} blows/30cm"
+                 else:
+                     strength_info = f"{param}: {value}"
+
+             # Add calculated parameters
+             calc_params = []
+             if layer.get("calculated_su"):
+                 calc_params.append(f"Su: {layer['calculated_su']:.0f} kPa (calc)")
+             if layer.get("friction_angle"):
+                 calc_params.append(f"φ: {layer['friction_angle']:.1f}° (calc)")
+
+             if calc_params:
+                 strength_info += f" | {' | '.join(calc_params)}"
+
+             df_data.append({
+                 "Layer": layer.get("layer_id", ""),
+                 "Depth From (m)": layer.get("depth_from", ""),
+                 "Depth To (m)": layer.get("depth_to", ""),
+                 "Soil Type": f"{layer.get('consistency', '')} {layer.get('soil_type', '')}".strip(),
+                 "Description": layer.get("description", ""),
+                 "Strength Parameters": strength_info,
+                 "Color": layer.get("color", ""),
+                 "Moisture": layer.get("moisture", ""),
+                 "Notes": layer.get("su_source", "") or layer.get("friction_angle_source", "") or ""
+             })
+
+         return pd.DataFrame(df_data)
+
+     def export_profile_data(self, soil_data, format="csv"):
+         """Export soil profile data"""
+         df = self.create_layer_summary_table(soil_data)
+         # Guard against empty input: create_layer_summary_table returns None
+         # when soil_data has no "soil_layers" key
+         if df is None:
+             return None
+
+         if format == "csv":
+             return df.to_csv(index=False)
+         elif format == "json":
+             return df.to_json(orient="records", indent=2)
+         else:
+             return df.to_string(index=False)
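A hypothetical export-and-download sketch wiring this into Streamlit (the button label and file name are assumptions):

```python
# Sketch: export the summary table as CSV and offer it for download.
csv_text = viz.export_profile_data(soil_data, format="csv")
if csv_text is not None:
    st.download_button("Download CSV", csv_text, file_name="soil_profile.csv")
```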
unified_soil_workflow.py ADDED
@@ -0,0 +1,1287 @@
+ """
+ Unified Soil Analysis Workflow using LangGraph
+ Combines LLM classification and SS/ST processing into a single controlled workflow
+ """
+
+ import json
+ import re
+ from typing import Dict, List, Any, Optional, TypedDict, Annotated
+ import streamlit as st
+ from langgraph.graph import StateGraph, START, END
+ from langgraph.graph.message import add_messages
+ from langchain_core.messages import BaseMessage, HumanMessage, AIMessage
+ import openai
+ from soil_classification import SoilClassificationProcessor
+ from soil_calculations import SoilCalculations
+ from config import LLM_PROVIDERS, AVAILABLE_MODELS, get_default_provider_and_model, get_api_key
+
+
+ class SoilAnalysisState(TypedDict):
+     """State for the unified soil analysis workflow"""
+     # Input data
+     text_content: Optional[str]
+     image_base64: Optional[str]
+     model: str
+     api_key: str
+
+     # Processing flags
+     merge_similar: bool
+     split_thick: bool
+
+     # LLM Analysis results
+     raw_llm_response: Optional[str]
+     llm_extraction_success: bool
+     extraction_errors: List[str]
+     retry_count: int  # Retry counter for the extraction step
+
+     # Soil data (from LLM)
+     project_info: Dict[str, Any]
+     raw_soil_layers: List[Dict[str, Any]]
+     water_table: Dict[str, Any]
+     notes: str
+
+     # Processing results
+     processed_layers: List[Dict[str, Any]]
+     processing_summary: Dict[str, Any]
+     validation_stats: Dict[str, Any]
+     optimization_results: Dict[str, Any]
+
+     # Final output
+     final_soil_data: Dict[str, Any]
+     workflow_status: str
+     workflow_messages: Annotated[List[BaseMessage], add_messages]
+
+
+ class UnifiedSoilWorkflow:
+     """
+     Unified LangGraph workflow for soil analysis
+     Combines LLM extraction and SS/ST processing into one controlled flow
+     """
+
+     def __init__(self):
+         self.soil_processor = SoilClassificationProcessor()
+         self.soil_calculator = SoilCalculations()
+         self.workflow = self._build_workflow()
+
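A sketch of an initial state for this workflow. The key set is taken from `SoilAnalysisState` above; the model name and the `get_api_key` call pattern are assumptions (only the import of `get_api_key` is visible in this diff):

```python
# Hypothetical initial state passed into the compiled graph.
initial_state = {
    "text_content": boring_log_text,        # extracted PDF/OCR text, or None
    "image_base64": None,                   # or a base64-encoded PNG of the log
    "model": "anthropic/claude-sonnet-4",   # model id mentioned elsewhere in this file
    "api_key": get_api_key("openrouter"),   # signature assumed
    "merge_similar": True,
    "split_thick": True,
    "retry_count": 0,
    "workflow_messages": [],
}
```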
+     def _get_provider_from_model(self, model: str) -> str:
+         """Determine provider from model name"""
+         for model_id, model_info in AVAILABLE_MODELS.items():
+             if model_id == model:
+                 # Return the first provider that supports this model
+                 providers = model_info.get("providers", [])
+                 if providers:
+                     return providers[0]
+
+         # Default fallback logic based on model prefix
+         if model.startswith("anthropic/"):
+             return "anthropic"
+         elif model.startswith("google/"):
+             return "google"
+         else:
+             return "openrouter"  # Default to OpenRouter for other models
+
+     def _build_workflow(self) -> StateGraph:
+         """Build the unified LangGraph workflow"""
+
+         # Create workflow graph
+         workflow = StateGraph(SoilAnalysisState)
+
+         # Add nodes
+         workflow.add_node("validate_inputs", self._validate_inputs)
+         workflow.add_node("extract_with_llm", self._extract_with_llm)
+         workflow.add_node("validate_extraction", self._validate_extraction)
+         workflow.add_node("process_ss_st_classification", self._process_ss_st_classification)
+         workflow.add_node("apply_unit_conversions", self._apply_unit_conversions)
+         workflow.add_node("validate_soil_classification", self._validate_soil_classification)
+         workflow.add_node("calculate_parameters", self._calculate_parameters)
+         workflow.add_node("optimize_layers", self._optimize_layers)
+         workflow.add_node("finalize_results", self._finalize_results)
+         workflow.add_node("handle_errors", self._handle_errors)
+
+         # Define workflow edges
+         workflow.add_edge(START, "validate_inputs")
+
+         # Conditional routing based on validation
+         workflow.add_conditional_edges(
+             "validate_inputs",
+             self._should_continue_after_validation,
+             {
+                 "continue": "extract_with_llm",
+                 "error": "handle_errors"
+             }
+         )
+
+         workflow.add_edge("extract_with_llm", "validate_extraction")
+
+         # Simplified routing - no retry loop to prevent recursion
+         workflow.add_conditional_edges(
+             "validate_extraction",
+             self._should_continue_after_extraction,
+             {
+                 "continue": "process_ss_st_classification",
+                 "error": "handle_errors"
+             }
+         )
+
+         workflow.add_edge("process_ss_st_classification", "apply_unit_conversions")
+         workflow.add_edge("apply_unit_conversions", "validate_soil_classification")
+         workflow.add_edge("validate_soil_classification", "calculate_parameters")
+         workflow.add_edge("calculate_parameters", "optimize_layers")
+         workflow.add_edge("optimize_layers", "finalize_results")
+         workflow.add_edge("finalize_results", END)
+         workflow.add_edge("handle_errors", END)
+
+         return workflow.compile()
+
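A minimal sketch of running the compiled graph (LangGraph's compiled graphs expose `invoke()`; the `initial_state` dict is the one sketched above):

```python
workflow = UnifiedSoilWorkflow()
final_state = workflow.workflow.invoke(initial_state)   # runs all 9 steps in order
soil_data = final_state.get("final_soil_data", {})
print(final_state.get("workflow_status"))               # "completed" or "failed"
```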
+     def _validate_inputs(self, state: SoilAnalysisState) -> SoilAnalysisState:
+         """Validate input data and configuration"""
+         st.info("🔍 Step 1: Validating inputs...")
+
+         errors = []
+
+         # Validate API key
+         if not state.get("api_key"):
+             errors.append("No API key provided")
+
+         # Validate content
+         if not state.get("text_content") and not state.get("image_base64"):
+             errors.append("No text or image content provided")
+
+         # Validate model (allow custom models not in AVAILABLE_MODELS)
+         _, default_model = get_default_provider_and_model()
+         model = state.get("model", default_model)
+         if not model or not isinstance(model, str):
+             errors.append(f"Invalid model format: {model}")
+         elif model not in AVAILABLE_MODELS:
+             # Allow custom models - just log info
+             st.info(f"📋 Using custom model: {model} (not in pre-configured list)")
+
+         if errors:
+             state["extraction_errors"] = errors
+             state["workflow_status"] = "validation_failed"
+             state["workflow_messages"] = [HumanMessage(content=f"Validation errors: {', '.join(errors)}")]
+         else:
+             state["workflow_status"] = "validated"
+             state["workflow_messages"] = [HumanMessage(content="Input validation passed")]
+             st.success("✅ Input validation passed")
+
+         return state
+
+     def _extract_with_llm(self, state: SoilAnalysisState) -> SoilAnalysisState:
+         """Extract soil data using LLM with enhanced prompts"""
+         retry_count = state.get("retry_count", 0)
+         st.info(f"🤖 Step 2: Extracting soil data with LLM... (attempt {retry_count + 1})")
+
+         try:
+             # Determine provider and base URL from model
+             provider_id = self._get_provider_from_model(state["model"])
+             base_url = LLM_PROVIDERS[provider_id]["base_url"]
+
+             # Initialize OpenAI client with correct provider
+             client = openai.OpenAI(
+                 base_url=base_url,
+                 api_key=state["api_key"]
+             )
+
+             # Enhanced system prompt with all requirements - use safer version for Gemini
+             if "gemini" in state["model"].lower():
+                 system_prompt = self._get_gemini_safe_prompt()
+                 st.info("🔧 Using Gemini-optimized prompt to avoid content filtering")
+             else:
+                 system_prompt = self._get_unified_system_prompt()
+
+             # Build messages
+             messages = [{"role": "system", "content": system_prompt}]
+
+             # Add content
+             if state.get("text_content"):
+                 messages.append({
+                     "role": "user",
+                     "content": f"Please analyze this soil boring log text:\n\n{state['text_content']}"
+                 })
+
+             # Add image if supported and available
+             model_info = AVAILABLE_MODELS.get(state["model"], {})
+             if state["model"] not in AVAILABLE_MODELS:
+                 # For custom models, assume image support (user responsibility)
+                 supports_images = True
+             else:
+                 supports_images = model_info.get('supports_images', False)
+
+             if state.get("image_base64") and supports_images:
+                 messages.append({
+                     "role": "user",
+                     "content": [
+                         {"type": "text", "text": "Please analyze this soil boring log image:"},
+                         {
+                             "type": "image_url",
+                             "image_url": {"url": f"data:image/png;base64,{state['image_base64']}"}
+                         }
+                     ]
+                 })
+
+             # Call LLM with detailed error handling
+             st.info(f"🔗 Making API call to {state['model']}...")
+             st.info(f"📝 Message count: {len(messages)}, Max tokens: 3000")
+
+             try:
+                 response = client.chat.completions.create(
+                     model=state["model"],
+                     messages=messages,
+                     max_tokens=3000,
+                     temperature=0.1
+                 )
+
+                 # Debug response structure
+                 st.info(f"🔍 Response received - Choices count: {len(response.choices) if response and response.choices else 0}")
+
+                 # Check if response is valid
+                 if not response or not response.choices:
+                     raise Exception("No response received from LLM API")
+
+                 raw_response = response.choices[0].message.content
+
+                 # Debug response content
+                 if raw_response is None:
+                     raise Exception("Response content is None")
+                 elif not raw_response.strip():
+                     # Check if it's just whitespace/newlines
+                     if len(raw_response) > 0:
+                         whitespace_chars = [repr(c) for c in raw_response[:10]]
+                         raise Exception(f"Response contains only whitespace (length: {len(raw_response)}, chars: {whitespace_chars})")
+                     else:
+                         raise Exception("Completely empty response from LLM API")
+
+                 # Check for very short responses that might indicate filtering
+                 elif len(raw_response.strip()) < 10:
+                     st.warning(f"⚠️ Very short response ({len(raw_response)} chars): '{raw_response[:50]}'")
+                     st.info("💡 This might indicate content filtering. Try a simpler prompt or different model.")
+
+                 state["raw_llm_response"] = raw_response
+                 st.success(f"📥 Received response: {len(raw_response)} characters")
+
+             except Exception as api_error:
+                 # Enhanced API error handling
+                 error_msg = str(api_error)
+                 st.error(f"❌ API call failed: {error_msg}")
+
+                 # Check if it's a model-specific issue
+                 if "not a valid model ID" in error_msg:
+                     st.error(f"🚫 Model '{state['model']}' is not available on OpenRouter")
+                     st.info("💡 Try using a different model like 'anthropic/claude-sonnet-4'")
+                 elif "rate limit" in error_msg.lower():
+                     st.error("⏰ Rate limit exceeded. Please wait and try again.")
+                 elif "empty" in error_msg.lower() or "none" in error_msg.lower():
+                     st.error("📭 Model returned empty response. This might be due to:")
+                     st.info(" • Content filtering by the model")
+                     st.info(" • Model configuration issues")
+                     st.info(" • Input content triggering safety filters")
+                     st.info("💡 Try a different model or simpler input text")
+
+                 raise api_error
+
+             # Parse JSON response with enhanced error handling
+             soil_data = self._parse_llm_response(raw_response)
+
+             if "error" in soil_data:
+                 state["llm_extraction_success"] = False
+                 state["extraction_errors"] = [soil_data["error"]]
+                 state["workflow_status"] = "extraction_failed"
+                 st.error(f"❌ JSON parsing failed: {soil_data['error']}")
+             else:
+                 # Validate that we have basic required data
+                 layers = soil_data.get("soil_layers", [])
+                 if not layers:
+                     state["llm_extraction_success"] = False
+                     state["extraction_errors"] = ["No soil layers found in LLM response"]
+                     state["workflow_status"] = "extraction_failed"
+                     st.error("❌ No soil layers found in LLM response")
+                 else:
+                     state["llm_extraction_success"] = True
+                     state["project_info"] = soil_data.get("project_info", {})
+                     state["raw_soil_layers"] = layers
+                     state["water_table"] = soil_data.get("water_table", {})
+                     state["notes"] = soil_data.get("notes", "")
+                     state["workflow_status"] = "extracted"
+
+                     st.success(f"✅ LLM extraction completed: {len(layers)} layers found")
+
+         except Exception as e:
+             state["llm_extraction_success"] = False
+             state["extraction_errors"] = [str(e)]
+             state["workflow_status"] = "extraction_error"
+             st.error(f"❌ LLM extraction failed: {str(e)}")
+
+         state["workflow_messages"] = state.get("workflow_messages", []) + [
+             AIMessage(content=f"LLM extraction: {'success' if state['llm_extraction_success'] else 'failed'}")
+         ]
+
+         return state
+
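The method above calls `self._parse_llm_response()`, whose body does not appear in this part of the diff. Purely as a sketch of what such a step might do (an assumption, not the actual implementation), it would typically strip markdown fences and isolate the outermost JSON object:

```python
import json, re

def parse_llm_response(raw: str) -> dict:
    # Remove markdown code fences, then locate the outermost JSON object.
    cleaned = re.sub(r"```(?:json)?", "", raw).strip()
    start, end = cleaned.find("{"), cleaned.rfind("}")
    if start == -1 or end == -1:
        return {"error": "No JSON object found in LLM response"}
    try:
        return json.loads(cleaned[start:end + 1])
    except json.JSONDecodeError as e:
        return {"error": f"JSON parsing failed: {e}"}
```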
+     def _validate_extraction(self, state: SoilAnalysisState) -> SoilAnalysisState:
+         """Validate LLM extraction results"""
+         st.info("🔍 Step 3: Validating extraction results...")
+
+         if not state["llm_extraction_success"]:
+             return state
+
+         validation_errors = []
+
+         # Check for required data
+         if not state["raw_soil_layers"]:
+             validation_errors.append("No soil layers extracted")
+
+         # Validate layer structure
+         for i, layer in enumerate(state["raw_soil_layers"]):
+             if "depth_from" not in layer or "depth_to" not in layer:
+                 validation_errors.append(f"Layer {i+1}: Missing depth information")
+             if "soil_type" not in layer:
+                 validation_errors.append(f"Layer {i+1}: Missing soil type")
+
+         if validation_errors:
+             state["extraction_errors"] = validation_errors
+             state["workflow_status"] = "extraction_failed"  # Use consistent status name
+             st.warning(f"⚠️ Validation issues found: {len(validation_errors)} errors")
+         else:
+             state["workflow_status"] = "extraction_validated"
+             st.success("✅ Extraction validation passed")
+
+         return state
+
+     def _process_ss_st_classification(self, state: SoilAnalysisState) -> SoilAnalysisState:
+         """Process SS/ST sample classification"""
+         st.info("🧪 Step 4: Processing SS/ST sample classification...")
+
+         try:
+             processed_layers = self.soil_processor.process_soil_layers(state["raw_soil_layers"])
+             state["processed_layers"] = processed_layers
+             state["workflow_status"] = "ss_st_processed"
+
+             st.success(f"✅ SS/ST processing completed: {len(processed_layers)} layers processed")
+
+         except Exception as e:
+             state["extraction_errors"] = state.get("extraction_errors", []) + [f"SS/ST processing error: {str(e)}"]
+             state["workflow_status"] = "ss_st_error"
+             st.error(f"❌ SS/ST processing failed: {str(e)}")
+
+         return state
+
+     def _apply_unit_conversions(self, state: SoilAnalysisState) -> SoilAnalysisState:
+         """Apply unit conversions to all measurements"""
+         st.info("🔧 Step 5: Applying unit conversions...")
+
+         try:
+             converted_layers = [
+                 self.soil_processor._convert_to_si_units(layer)
+                 for layer in state["processed_layers"]
+             ]
+
+             state["processed_layers"] = converted_layers
+             state["workflow_status"] = "units_converted"
+
+             # Track different types of validation issues
+             unit_errors = []
+             recheck_needed = []
+             critical_errors = []
+
+             for layer in converted_layers:
+                 validation_warning = layer.get('unit_validation_warning', '')
+                 if validation_warning:
+                     layer_id = layer.get('layer_id', '?')
+
+                     # Check if this layer needs image recheck
+                     if hasattr(self.soil_processor, '_validate_su_with_water_content'):
+                         detailed_validation = self.soil_processor._validate_su_with_water_content(layer)
+
+                         if detailed_validation.get('critical_unit_error'):
+                             critical_errors.append(f"Layer {layer_id}: {detailed_validation.get('suggested_conversion', 'Unit error')}")
+
+                         if detailed_validation.get('recheck_image'):
+                             recheck_needed.append(f"Layer {layer_id}: {validation_warning}")
+                         else:
+                             unit_errors.append(f"Layer {layer_id}: {validation_warning}")
+                     else:
+                         # Processor lacks detailed validation - still surface the warning
+                         unit_errors.append(f"Layer {layer_id}: {validation_warning}")
+
+             # Display different types of issues with appropriate severity
+             if critical_errors:
+                 st.error("🚨 CRITICAL UNIT CONVERSION ERRORS DETECTED:")
+                 for error in critical_errors:
+                     st.error(f" • {error}")
+                 st.error("⚠️ These values appear to be in wrong units - conversion may be needed!")
+
+             if recheck_needed:
+                 st.warning("📷 IMAGE RECHECK RECOMMENDED:")
+                 for recheck in recheck_needed:
+                     st.warning(f" • {recheck}")
+                 st.info("💡 Su-water content values seem inconsistent - consider reloading the image")
+
+             if unit_errors:
+                 st.warning("⚠️ Su-water content validation issues:")
+                 for error in unit_errors:
+                     st.info(f" • {error}")
+
+             # Store all warnings for later reference
+             all_warnings = critical_errors + recheck_needed + unit_errors
+             if all_warnings:
+                 state["unit_validation_warnings"] = all_warnings
+                 state["needs_image_recheck"] = len(recheck_needed) > 0
+                 state["has_critical_unit_errors"] = len(critical_errors) > 0
+
+                 # Add to final results for user action
+                 state["validation_recommendations"] = {
+                     "critical_unit_errors": critical_errors,
+                     "recheck_image": recheck_needed,
+                     "general_warnings": unit_errors
+                 }
+             else:
+                 st.success("✅ Unit conversions applied - all Su-water content correlations look reasonable")
+
+         except Exception as e:
+             state["extraction_errors"] = state.get("extraction_errors", []) + [f"Unit conversion error: {str(e)}"]
+             state["workflow_status"] = "conversion_error"
+             st.error(f"❌ Unit conversion failed: {str(e)}")
+
+         return state
+
+     def _validate_soil_classification(self, state: SoilAnalysisState) -> SoilAnalysisState:
+         """Validate soil classification with sieve analysis requirements"""
+         st.info("🎯 Step 6: Validating soil classification...")
+
+         try:
+             validated_layers = []
+             classification_warnings = []
+
+             for layer in state["processed_layers"]:
+                 # Apply enhanced soil classification validation
+                 validated_layer = layer.copy()
+
+                 # Re-classify with strict sieve analysis requirements
+                 soil_type = self.soil_processor._classify_soil_type(validated_layer)
+                 validated_layer["soil_type"] = soil_type
+
+                 # Track classification changes
+                 if layer.get("soil_type") != soil_type:
+                     classification_warnings.append(
+                         f"Layer {layer.get('layer_id', '?')}: Changed from '{layer.get('soil_type')}' to '{soil_type}'"
+                     )
+
+                 validated_layers.append(validated_layer)
+
+             state["processed_layers"] = validated_layers
+             state["workflow_status"] = "classification_validated"
+
+             if classification_warnings:
+                 st.warning(f"⚠️ Classification changes: {len(classification_warnings)} layers updated")
+                 for warning in classification_warnings:
+                     st.info(f" • {warning}")
+             else:
+                 st.success("✅ Soil classification validation passed")
+
+         except Exception as e:
+             state["extraction_errors"] = state.get("extraction_errors", []) + [f"Classification validation error: {str(e)}"]
+             state["workflow_status"] = "classification_error"
+             st.error(f"❌ Classification validation failed: {str(e)}")
+
+         return state
+
+     def _calculate_parameters(self, state: SoilAnalysisState) -> SoilAnalysisState:
+         """Calculate engineering parameters (Su, φ, etc.)"""
+         st.info("📊 Step 7: Calculating engineering parameters...")
+
+         try:
+             enhanced_layers = self.soil_calculator.enhance_soil_layers(state["processed_layers"])
+
+             # Enhanced post-processing for multiple Su values
+             enhanced_layers = self._process_multiple_su_values(enhanced_layers)
+
+             state["processed_layers"] = enhanced_layers
+             state["workflow_status"] = "parameters_calculated"
+
+             st.success("✅ Engineering parameters calculated")
+
+         except Exception as e:
+             state["extraction_errors"] = state.get("extraction_errors", []) + [f"Parameter calculation error: {str(e)}"]
+             state["workflow_status"] = "calculation_error"
+             st.error(f"❌ Parameter calculation failed: {str(e)}")
+
+         return state
+
+     def _process_multiple_su_values(self, layers: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
+         """Process layers that may have multiple Su values and decide on subdivision"""
+         enhanced_layers = []
+
+         for layer in layers:
+             # Check if layer description mentions multiple Su values
+             description = layer.get('description', '').lower()
+
+             # Look for patterns indicating multiple Su values (re is imported at module level)
+
+             # Pattern to find multiple Su values in description
+             su_pattern = r'su[=\s]*(\d+(?:\.\d+)?)\s*(?:kpa|kPa|t/m²|ksc|psi)'
+             su_values = re.findall(su_pattern, description)
+
+             # Pattern to find Su ranges
+             range_pattern = r'su\s*(?:ranges?|from)\s*(\d+(?:\.\d+)?)\s*(?:-|to)\s*(\d+(?:\.\d+)?)\s*(?:kpa|kPa)'
+             range_match = re.search(range_pattern, description)
+
+             # Pattern to find averaged Su values
+             avg_pattern = r'su\s*(?:averaged|average|mean)\s*(?:from)?\s*(?:\d+\s*measurements?)?\s*[:\s]*(\d+(?:\.\d+)?)'
+             avg_match = re.search(avg_pattern, description)
+
+             if len(su_values) > 1:
+                 # Multiple Su values found - decide on subdivision or averaging
+                 su_nums = [float(val) for val in su_values]
+
+                 # Check variation
+                 min_su = min(su_nums)
+                 max_su = max(su_nums)
+                 avg_su = sum(su_nums) / len(su_nums)
+                 variation = (max_su - min_su) / avg_su if avg_su > 0 else 0
+
+                 if variation > 0.5 or max_su / min_su > 2.0:
+                     # High variation - suggest layer subdivision
+                     layer['subdivision_suggested'] = True
+                     layer['su_variation_high'] = True
+                     layer['su_values_found'] = su_nums
+                     layer['su_variation_ratio'] = max_su / min_su if min_su > 0 else 0
+                     layer['subdivision_reason'] = f"High Su variation: {min_su:.1f}-{max_su:.1f} kPa (ratio: {max_su/min_su:.1f}x)"
+
+                     # Update description to highlight the issue
+                     layer['description'] += f" [SUBDIVISION RECOMMENDED: Su varies {min_su:.1f}-{max_su:.1f} kPa]"
+
+                     st.warning(f"🔄 Layer {layer.get('layer_id', '?')}: High Su variation detected - subdivision recommended")
+
+                 else:
+                     # Low variation - use average
+                     layer['su_averaged'] = True
+                     layer['su_values_found'] = su_nums
+                     layer['su_average_used'] = avg_su
+                     layer['strength_value'] = avg_su
+                     layer['description'] += f" [Su averaged from {len(su_nums)} values: {', '.join([f'{v:.1f}' for v in su_nums])} kPa → {avg_su:.1f} kPa]"
+
+                     st.info(f"📊 Layer {layer.get('layer_id', '?')}: Averaged {len(su_nums)} Su values: {avg_su:.1f} kPa")
+
+             elif range_match:
+                 # Su range found
+                 min_su = float(range_match.group(1))
+                 max_su = float(range_match.group(2))
+                 avg_su = (min_su + max_su) / 2
+
+                 layer['su_range_found'] = True
+                 layer['su_range'] = [min_su, max_su]
+                 layer['su_range_average'] = avg_su
+                 layer['strength_value'] = avg_su
+                 layer['description'] += f" [Su range {min_su:.1f}-{max_su:.1f} kPa, using average {avg_su:.1f} kPa]"
+
+                 st.info(f"📊 Layer {layer.get('layer_id', '?')}: Su range processed, using average {avg_su:.1f} kPa")
+
+             elif avg_match:
+                 # Averaged Su value already mentioned
+                 avg_su = float(avg_match.group(1))
+                 layer['su_pre_averaged'] = True
+                 layer['su_average_value'] = avg_su
+                 layer['strength_value'] = avg_su
+
+             # Add metadata for tracking
+             layer['su_processing_applied'] = True
+
+             enhanced_layers.append(layer)
+
+         return enhanced_layers
+
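Tracing the subdivision heuristic above with the sample values the prompt itself uses:

```python
# Worked example of the variation check:
su_nums = [25.0, 45.0, 80.0]                      # kPa, found within one layer
avg = sum(su_nums) / len(su_nums)                 # 50.0
variation = (max(su_nums) - min(su_nums)) / avg   # 1.1  > 0.5
ratio = max(su_nums) / min(su_nums)               # 3.2x > 2.0
# -> subdivision_suggested = True; close values like [35, 40, 38] would be averaged instead
```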
+     def _optimize_layers(self, state: SoilAnalysisState) -> SoilAnalysisState:
+         """Optimize layer division and grouping"""
+         st.info("⚙️ Step 8: Optimizing layer division...")
+
+         try:
+             from soil_analyzer import SoilLayerAnalyzer
+             analyzer = SoilLayerAnalyzer()
+
+             # Validate layer continuity
+             validated_layers = analyzer.validate_layer_continuity(state["processed_layers"])
+
+             # Calculate statistics
+             stats = analyzer.calculate_layer_statistics(validated_layers)
+             state["validation_stats"] = stats
+
+             # Optimize layer division
+             optimization = analyzer.optimize_layer_division(
+                 validated_layers,
+                 merge_similar=state.get("merge_similar", True),
+                 split_thick=state.get("split_thick", True)
+             )
+             state["optimization_results"] = optimization
+
+             # Use optimized layers
+             state["processed_layers"] = optimization.get("optimized_layers", validated_layers)
+             state["workflow_status"] = "optimized"
+
+             st.success("✅ Layer optimization completed")
+
+         except Exception as e:
+             state["extraction_errors"] = state.get("extraction_errors", []) + [f"Optimization error: {str(e)}"]
+             state["workflow_status"] = "optimization_error"
+             st.error(f"❌ Layer optimization failed: {str(e)}")
+
+         return state
+
+     def _finalize_results(self, state: SoilAnalysisState) -> SoilAnalysisState:
+         """Finalize and package results"""
+         st.info("📦 Step 9: Finalizing results...")
+
+         try:
+             # Generate processing summary
+             processing_summary = self.soil_processor.get_processing_summary(state["processed_layers"])
+             state["processing_summary"] = processing_summary
+
+             # Package final results
+             final_soil_data = {
+                 "project_info": state["project_info"],
+                 "soil_layers": state["processed_layers"],
+                 "water_table": state["water_table"],
+                 "notes": state["notes"],
+                 "processing_summary": processing_summary,
+                 "validation_stats": state.get("validation_stats", {}),
+                 "optimization_results": state.get("optimization_results", {}),
+                 "workflow_metadata": {
+                     "model_used": state["model"],
+                     "processing_steps": 9,
+                     "total_layers": len(state["processed_layers"]),
+                     "ss_samples": processing_summary.get("ss_samples", 0),
+                     "st_samples": processing_summary.get("st_samples", 0)
+                 }
+             }
+
+             state["final_soil_data"] = final_soil_data
+             state["workflow_status"] = "completed"
+
+             st.success("🎉 Unified soil analysis workflow completed successfully!")
+
+         except Exception as e:
+             state["extraction_errors"] = state.get("extraction_errors", []) + [f"Finalization error: {str(e)}"]
+             state["workflow_status"] = "finalization_error"
+             st.error(f"❌ Result finalization failed: {str(e)}")
+
+         return state
+
+     def _handle_errors(self, state: SoilAnalysisState) -> SoilAnalysisState:
+         """Handle workflow errors"""
+         st.error("❌ Workflow encountered errors")
+
+         errors = state.get("extraction_errors", [])
+         for error in errors:
+             st.error(f" • {error}")
+
+         state["workflow_status"] = "failed"
+         state["final_soil_data"] = {
+             "error": "Workflow failed",
+             "errors": errors,
+             "raw_response": state.get("raw_llm_response", "")
+         }
+
+         return state
+
+     # Conditional routing functions
+     def _should_continue_after_validation(self, state: SoilAnalysisState) -> str:
+         """Determine next step after input validation"""
+         if state["workflow_status"] == "validated":
+             return "continue"
+         else:
+             return "error"
+
+     def _should_continue_after_extraction(self, state: SoilAnalysisState) -> str:
+         """Determine next step after LLM extraction - simplified without retry loops"""
+         workflow_status = state.get("workflow_status", "unknown")
+
+         if workflow_status == "extraction_validated":
+             st.info("✅ Proceeding to SS/ST classification...")
+             return "continue"
+         else:
+             st.error(f"❌ Extraction validation failed with status: {workflow_status}")
+             return "error"
+
+     def _get_gemini_safe_prompt(self) -> str:
+         """Get a simplified, safer prompt for Gemini models to avoid content filtering"""
+         return """You are a geotechnical engineer analyzing soil data.
+
+ Extract information from soil boring logs and return ONLY valid JSON.
+
+ Required JSON format:
+ {
+   "project_info": {
+     "project_name": "string",
+     "boring_id": "string",
+     "location": "string",
+     "date": "string",
+     "depth_total": 10.0
+   },
+   "soil_layers": [
+     {
+       "layer_id": 1,
+       "depth_from": 0.0,
+       "depth_to": 2.0,
+       "soil_type": "clay",
+       "description": "description text",
+       "sample_type": "SS",
+       "strength_parameter": "SPT-N",
+       "strength_value": 15,
+       "water_content": 25,
+       "color": "brown",
+       "consistency": "soft"
+     }
+   ],
+   "water_table": {"depth": 3.0, "date_encountered": "2024-01-01"},
+   "notes": "Additional notes"
+ }
+
+ Key rules:
+ 1. Look for SS-* or ST-* sample identifiers in first column
+ 2. SS samples use SPT-N values, ST samples use Su values
+ 3. **CRITICAL - READ COLUMN HEADERS FOR UNITS**:
+    Look at table headers to identify Su units:
+    - If header shows "Su t/m²" or "Su (t/m²)" → Units are t/m²
+    - If header shows "Su kPa" or "Su (kPa)" → Units are kPa
+    - If header shows "Su ksc" or "Su (ksc)" → Units are ksc
+ 4. **CAREFULLY convert Su units to kPa BASED ON HEADER**:
+    - t/m² → kPa: multiply by 9.81 (CRITICAL - MOST COMMON ERROR)
+    - ksc/kg/cm² → kPa: multiply by 98.0
+    - psi → kPa: multiply by 6.895
+    - MPa → kPa: multiply by 1000
+    - kPa → kPa: no conversion (use directly)
+ 5. Extract water content when available
+ 6. Check Su-water content correlation (soft clay: Su<50kPa, w%>30%)
+ 7. Group similar layers (maximum 7 layers total)
+ 8. Return ONLY the JSON object, no explanatory text
+ 9. Start response with { and end with }"""
+
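The conversion table in the prompt maps directly to a lookup. A small helper mirroring it (a sketch only; in this app the actual conversions are performed by `SoilClassificationProcessor._convert_to_si_units`, not by this function):

```python
# Su unit -> kPa multipliers, matching the prompt's conversion rules.
SU_TO_KPA = {"kPa": 1.0, "t/m²": 9.81, "ksc": 98.0, "psi": 6.895, "MPa": 1000.0}

def su_to_kpa(value: float, unit: str) -> float:
    return value * SU_TO_KPA[unit]

print(su_to_kpa(3.5, "t/m²"))   # 34.335 - the most commonly missed conversion
```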
759
+ def _get_unified_system_prompt(self) -> str:
760
+ """Get the comprehensive system prompt for unified processing"""
761
+ return """You are an expert geotechnical engineer specializing in soil boring log interpretation.
762
+
763
+ IMPORTANT: You must respond with ONLY valid JSON data. Do not include any text before or after the JSON.
764
+
765
+ SAMPLE TYPE IDENTIFICATION (CRITICAL - FOLLOW EXACT ORDER):
766
+
767
+ **STEP 1 - FIRST COLUMN STRATIFICATION SYMBOLS (ABSOLUTE HIGHEST PRIORITY):**
768
+ ALWAYS look at the FIRST COLUMN of each layer for stratification symbols:
769
+
770
+ - **SS-1, SS-2, SS-18, SS18, SS-5** β†’ SS (Split Spoon) sample
771
+ - **ST-1, ST-2, ST-5, ST5, ST-12** β†’ ST (Shelby Tube) sample
772
+ - **SS1, SS2, SS3** (without dash) β†’ SS sample
773
+ - **ST1, ST2, ST3** (without dash) β†’ ST sample
774
+ - **Look for pattern: [SS|ST][-]?[0-9]+** in first column
775
+
776
+ **EXAMPLES of First Column Recognition:**
777
+ ```
778
+ SS-18 | Brown clay, N=8 β†’ sample_type="SS" (SS-18 in first column)
779
+ ST-5 | Gray clay, Su=45 kPa β†’ sample_type="ST" (ST-5 in first column)
780
+ SS12 | Sandy clay, SPT test β†’ sample_type="SS" (SS12 in first column)
781
+ ST3 | Soft clay, unconfined β†’ sample_type="ST" (ST3 in first column)
782
+ ```
783
+
784
+ **STEP 2 - If NO first column symbols, then check description keywords:**
785
+ - SS indicators: "split spoon", "SPT", "standard penetration", "disturbed"
786
+ - ST indicators: "shelby", "tube", "undisturbed", "UT", "unconfined compression"
787
+
788
+ **STEP 3 - If still unclear, use strength parameter type:**
789
+ - SPT-N values present β†’ likely SS sample
790
+ - Su values from unconfined test β†’ likely ST sample
791
+
792
+ CRITICAL SOIL CLASSIFICATION RULES (MANDATORY):
793
+
794
+ **SAND LAYER CLASSIFICATION REQUIREMENTS:**
795
+ 1. **Sand layers MUST have sieve analysis evidence** - Look for:
796
+ - "Sieve #200: X% passing" or "#200 passing: X%"
797
+ - "Fines content: X%" (same as sieve #200)
798
+ - "Particle size analysis" or "gradation test"
799
+ - "% passing 0.075mm" (equivalent to #200 sieve)
800
+
801
+ 2. **Classification Rules**:
802
+ - Sieve #200 >50% passing β†’ CLAY (fine-grained)
803
+ - Sieve #200 <50% passing β†’ SAND/GRAVEL (coarse-grained)
804
+
805
+ 3. **NO SIEVE ANALYSIS = ASSUME CLAY (MANDATORY)**:
806
+ - If no sieve analysis data found β†’ ALWAYS classify as CLAY
807
+ - Include note: "Assumed clay - no sieve analysis data available"
808
+ - Set sieve_200_passing: null (not a number)
809
+
810
+ **CRITICAL**: Never classify as sand/silt without explicit sieve analysis evidence
811
+ **CRITICAL**: Always look for sieve #200 data before classifying as sand
812
+
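
The classification rules above reduce to a single branch on the fines content. A sketch, assuming `sieve_200_passing` is `None` whenever no gradation data was found:

```
from typing import Optional, Tuple

def classify_from_sieve(sieve_200_passing: Optional[float]) -> Tuple[str, str]:
    """Apply the sieve #200 rules; returns (soil_type, note)."""
    if sieve_200_passing is None:
        # Mandatory default: no gradation data means the layer is reported as clay
        return "clay", "Assumed clay - no sieve analysis data available"
    if sieve_200_passing > 50:
        return "clay", f"Fine-grained: {sieve_200_passing}% passing #200"
    return "sand", f"Coarse-grained: {sieve_200_passing}% passing #200"
```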
813
+ CRITICAL SS/ST SAMPLE RULES (MUST FOLLOW):
814
+
815
+ FOR SS (Split Spoon) SAMPLES:
816
+ 1. ALWAYS use RAW N-VALUE (not N-corrected, N-correction, or adjusted N)
817
+ 2. Look for: "N = 15", "SPT-N = 8", "raw N = 20", "field N = 12"
818
+ 3. IGNORE: "N-corrected = 25", "N-correction = 18", "adjusted N = 30"
819
+ 4. For clay: Use SPT-N parameter (will be converted to Su using Su=5*N)
820
+ 5. For sand/silt: Use SPT-N parameter (will be converted to friction angle)
821
+ 6. NEVER use unconfined compression Su values for SS samples - ONLY use N values
822
+
823
+ FOR ST (Shelby Tube) SAMPLES:
824
+ 1. ALWAYS USE DIRECT Su values from unconfined compression test
825
+ 2. If ST sample has Su value (e.g., "Su = 25 kPa"), use that EXACT value
826
+ 3. NEVER convert SPT-N to Su for ST samples when direct Su is available
827
+ 4. Priority: Direct Su measurement > any other value
828
+
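
The SS/ST rules above boil down to choosing which column feeds `strength_parameter`. A hedged sketch (the field names follow the JSON schema used in this prompt; the Su=5*N conversion for SS clay happens later in the workflow, not here):

```
def select_strength(sample_type: str, raw_n=None, su_kpa=None) -> dict:
    """Pick the strength parameter per the SS/ST rules above (sketch only)."""
    if sample_type == "SS":
        # SS: always the raw field N-value; any Su printed on the row is ignored
        return {"strength_parameter": "SPT-N", "strength_value": raw_n}
    if sample_type == "ST" and su_kpa is not None:
        # ST: the direct unconfined-compression Su takes priority over everything
        return {"strength_parameter": "Su", "strength_value": su_kpa,
                "su_source": "Unconfined Compression Test"}
    # ST sample without a direct Su measurement: fall back to the N-value
    return {"strength_parameter": "SPT-N", "strength_value": raw_n}
```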
829
+ CRITICAL SU VALUE EXTRACTION - MULTIPLE VALUES PER LAYER:
830
+
831
+ **EXTRACT ALL SU VALUES IN COLUMN (CRITICAL ENHANCEMENT):**
832
+
833
+ **STEP 1 - SCAN ENTIRE SU COLUMN FOR EACH LAYER:**
834
+ 1. Look for ALL Su values that fall within each layer's depth range
835
+ 2. Extract EVERY Su value found in the Su column for that depth interval
836
+ 3. Record ALL values with their exact depths if specified
837
+ 4. Note: A single layer may have multiple Su measurements at different depths
838
+
839
+ **STEP 2 - HANDLE MULTIPLE SU VALUES PER LAYER:**
840
+ For layers with multiple Su values, you have several options:
841
+
842
+ Option A - **LAYER SUBDIVISION (PREFERRED for significant variation):**
843
+ - If Su values vary by >50% or have >2x ratio β†’ Split into sublayers
844
+ - Example: Layer 2.0-6.0m has Su values [25, 45, 80] kPa
845
+ - Split into: Layer 2.0-3.5m (Su=25kPa), Layer 3.5-5.0m (Su=45kPa), Layer 5.0-6.0m (Su=80kPa)
846
+
847
+ Option B - **AVERAGE SU VALUES (for similar values):**
848
+ - If Su values are within Β±30% of mean β†’ Use average
849
+ - Example: Layer 1.0-3.0m has Su values [35, 40, 38] kPa β†’ Use Su=37.7kPa
850
+ - Include note: "Su averaged from 3 measurements: 35, 40, 38 kPa"
851
+
852
+ Option C - **REPRESENTATIVE VALUE (for clusters):**
853
+ - If multiple similar values with one outlier β†’ Use cluster average
854
+ - Example: Su values [25, 28, 26, 45] β†’ Use 26.3kPa (ignore outlier 45)
855
+
856
+ **STEP 3 - DOCUMENT ALL VALUES FOUND:**
857
+ Always include in description:
858
+ - "Su values found: 25, 35, 42 kPa (averaged to 34 kPa)"
859
+ - "Multiple Su measurements: 30, 28, 32 kPa at depths 2.1, 2.5, 2.8m"
860
+ - "Su ranges from 40-60 kPa, used average 50 kPa"
861
+
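
Options A and B can be approximated with the stated thresholds (>2x ratio or outside Β±30% of the mean β†’ subdivide; otherwise average). A sketch; the exact aggregation logic in the module may differ:

```
from statistics import mean

def resolve_multiple_su(values: list) -> dict:
    """Choose Option A or B for a layer with several Su readings (all in kPa)."""
    if len(values) == 1:
        return {"strength_value": values[0], "subdivision_suggested": False}
    avg = mean(values)
    spread = max(values) / min(values)            # e.g. [25, 45, 80] -> 3.2
    within_30pct = all(abs(v - avg) / avg <= 0.30 for v in values)
    if spread > 2 or not within_30pct:
        # Option A: high variation -> flag for subdivision, report the middle value
        mid = sorted(values)[len(values) // 2]
        return {"strength_value": mid, "subdivision_suggested": True}
    # Option B: similar values -> use the average
    return {"strength_value": round(avg, 1), "subdivision_suggested": False}

# [35, 40, 38] -> 37.7 averaged; [25, 45, 80] -> 45 with subdivision flag
```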
862
+ CRITICAL UNIT CONVERSION REQUIREMENTS (MUST APPLY):
863
+
864
+ **MANDATORY SU UNIT CONVERSION - READ COLUMN HEADERS FIRST:**
865
+
866
+ **STEP 1 - IDENTIFY UNITS FROM TABLE HEADERS (CRITICAL):**
867
+ ALWAYS look at the column headers to identify Su units:
868
+ - "Su t/mΒ²" or "Su (t/mΒ²)" in header β†’ Values are in t/mΒ²
869
+ - "Su kPa" or "Su (kPa)" in header β†’ Values are in kPa
870
+ - "Su ksc" or "Su (ksc)" in header β†’ Values are in ksc
871
+ - "Su psi" or "Su (psi)" in header β†’ Values are in psi
872
+ - Just "Su" with units below β†’ Look at unit row (e.g., "t/mΒ²")
873
+
874
+ **STEP 2 - CONVERT TO kPa BASED ON IDENTIFIED UNITS:**
875
+ When extracting Su values from images or text, you MUST convert to kPa BEFORE using the value:
876
+
877
+ 1. **ksc or kg/cmΒ²**: Su_kPa = Su_ksc Γ— 98.0
878
+ Example: "Su = 2.5 ksc" β†’ strength_value: 245 (not 2.5)
879
+
880
+ 2. **t/mΒ² (tonnes/mΒ²)**: Su_kPa = Su_tonnes Γ— 9.81
881
+ Example: "Su = 3.0 t/mΒ²" β†’ strength_value: 29.43 (not 3.0)
882
+ **CRITICAL**: This is the MOST COMMON unit in boring logs!
883
+
884
+ 3. **psi**: Su_kPa = Su_psi Γ— 6.895
885
+ Example: "Su = 50 psi" β†’ strength_value: 344.75 (not 50)
886
+
887
+ 4. **psf**: Su_kPa = Su_psf Γ— 0.048
888
+ Example: "Su = 1000 psf" β†’ strength_value: 48 (not 1000)
889
+
890
+ 5. **kPa**: Use directly (no conversion needed)
891
+ Example: "Su = 75 kPa" β†’ strength_value: 75
892
+
893
+ 6. **MPa**: Su_kPa = Su_MPa Γ— 1000
894
+ Example: "Su = 0.1 MPa" β†’ strength_value: 100 (not 0.1)
895
+
896
+ **CRITICAL EXAMPLES FROM BORING LOGS:**
897
+ - Table header shows "Su t/mΒ²", value 1.41 β†’ strength_value: 13.83 (1.41 Γ— 9.81)
898
+ - Table header shows "Su t/mΒ²", value 2.41 β†’ strength_value: 23.64 (2.41 Γ— 9.81)
899
+ - Table header shows "Su kPa", value 75 β†’ strength_value: 75 (no conversion)
900
+
901
+ **IMPORTANT**: Always include original unit in description for verification
902
+ **SPT-N values**: Keep as-is (no unit conversion needed)
903
+
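
The header-driven conversion is a table lookup. A minimal helper reproducing the factors above (the normalization of the header string is an assumption):

```
# Conversion factors to kPa, keyed by the unit string read from the column header
SU_TO_KPA = {
    "kpa": 1.0,
    "t/m2": 9.81,      # tonnes/m2 - the most common unit in these logs
    "ksc": 98.0,       # kg/cm2
    "kg/cm2": 98.0,
    "psi": 6.895,
    "psf": 0.048,
    "mpa": 1000.0,
}

def su_to_kpa(value: float, header_unit: str) -> float:
    """Convert a raw Su reading to kPa using the unit taken from the table header."""
    key = header_unit.lower().replace("Β²", "2").strip()
    return round(value * SU_TO_KPA[key], 2)

assert su_to_kpa(1.41, "t/mΒ²") == 13.83   # header "Su t/mΒ²"
assert su_to_kpa(2.5, "ksc") == 245.0
assert su_to_kpa(75, "kPa") == 75.0
```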
904
+ CRITICAL SU-WATER CONTENT VALIDATION (MANDATORY):
905
+
906
+ **EXTRACT WATER CONTENT WHEN AVAILABLE:**
907
+ Always extract water content (w%) when mentioned in the description:
908
+ - \"water content = 25%\" β†’ water_content: 25
909
+ - \"w = 30%\" β†’ water_content: 30
910
+ - \"moisture content 35%\" β†’ water_content: 35
911
+
912
+ **VALIDATE SU-WATER CONTENT CORRELATION:**
913
+ For clay layers, Su and water content should correlate reasonably:
914
+ - Very soft clay: Su < 25 kPa, w% > 40%
915
+ - Soft clay: Su 25-50 kPa, w% 30-40%
916
+ - Medium clay: Su 50-100 kPa, w% 20-30%
917
+ - Stiff clay: Su 100-200 kPa, w% 15-25%
918
+ - Very stiff clay: Su 200-400 kPa, w% 10-20%
919
+ - Hard clay: Su > 400 kPa, w% < 15%
920
+
921
+ **CRITICAL UNIT CHECK SCENARIOS:**
922
+ - If Su > 1000 kPa with w% > 20%: CHECK if Su is in wrong units (psi, psf?)
923
+ - If Su < 5 kPa with w% < 15%: CHECK if Su is in wrong units (MPa, bar?)
924
+ - If correlation seems very off: VERIFY unit conversion was applied correctly
925
+
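
The correlation table above can be encoded as plausibility bands; the Β±10% tolerance below is an illustrative choice, not taken from the prompt:

```
# Expected (Su kPa range, w% range) bands for clay, per the correlation table above
CLAY_BANDS = [
    ((0, 25),       (40, 100)),  # very soft
    ((25, 50),      (30, 40)),   # soft
    ((50, 100),     (20, 30)),   # medium
    ((100, 200),    (15, 25)),   # stiff
    ((200, 400),    (10, 20)),   # very stiff
    ((400, 10_000), (0, 15)),    # hard
]

def su_water_content_plausible(su_kpa: float, w_pct: float, tol: float = 10.0) -> bool:
    """Rough plausibility check; a large mismatch usually means a unit error."""
    for (su_lo, su_hi), (w_lo, w_hi) in CLAY_BANDS:
        if su_lo <= su_kpa < su_hi:
            return (w_lo - tol) <= w_pct <= (w_hi + tol)
    return True  # outside the tabulated range: do not flag

# Su = 1500 kPa with w = 35% fails the check -> suspect psi/psf left unconverted
```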
926
+ CRITICAL OUTPUT FORMAT (MANDATORY):
927
+
928
+ You MUST respond with ONLY a valid JSON object. Do not include:
929
+ - Explanatory text before or after the JSON
930
+ - Markdown formatting (```json ```)
931
+ - Comments or notes
932
+ - Multiple JSON objects
933
+
934
+ Start your response directly with { and end with }
935
+
936
+ EXAMPLE CORRECT RESPONSE FORMAT:
937
+ {
938
+ "project_info": {
939
+ "project_name": "Sample Project",
940
+ "boring_id": "BH-01",
941
+ "location": "Sample Location",
942
+ "date": "2024-06-25",
943
+ "depth_total": 10.0
944
+ },
945
+ "soil_layers": [
946
+ {
947
+ "layer_id": 1,
948
+ "depth_from": 0.0,
949
+ "depth_to": 2.0,
950
+ "soil_type": "clay",
951
+ "description": "Brown clay, soft, SS-1 sample",
952
+ "sample_type": "SS",
953
+ "strength_parameter": "SPT-N",
954
+ "strength_value": 4,
955
+ "water_content": 35,
956
+ "color": "brown",
957
+ "consistency": "soft"
958
+ }
959
+ ],
960
+ "water_table": {"depth": 3.0, "date_encountered": "2024-06-25"},
961
+ "notes": "Standard soil boring analysis"
962
+ }
963
+
964
+ LAYER GROUPING REQUIREMENTS:
965
+ 1. MAXIMUM 7 LAYERS TOTAL - Group similar adjacent layers to achieve this limit
966
+ 2. CLAY AND SAND MUST BE SEPARATE - Never combine clay layers with sand layers
967
+ 3. Group adjacent layers with similar properties (same soil type and similar consistency)
968
+ 4. Prioritize engineering significance over minor variations
969
+
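
Requirements 1-3 suggest a merge pass over adjacent layers. A sketch that never crosses a soil-type boundary (so clay and sand stay separate) and only warns when the 7-layer cap cannot be met; it does not re-average strength values across merges:

```
def group_layers(layers: list, max_layers: int = 7) -> list:
    """Merge adjacent layers that share soil type and consistency (sketch only)."""
    grouped = []
    for layer in layers:
        prev = grouped[-1] if grouped else None
        if (prev is not None
                and prev["soil_type"] == layer["soil_type"]
                and prev.get("consistency") == layer.get("consistency")):
            prev["depth_to"] = layer["depth_to"]   # extend the previous layer
        else:
            grouped.append(dict(layer))
    if len(grouped) > max_layers:
        print(f"Warning: {len(grouped)} layers remain after grouping (cap {max_layers})")
    return grouped
```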
970
+ Analyze the provided soil boring log and extract the following information in this exact JSON format:
971
+
972
+ {
973
+ "project_info": {
974
+ "project_name": "string",
975
+ "boring_id": "string",
976
+ "location": "string",
977
+ "date": "string",
978
+ "depth_total": 10.0
979
+ },
980
+ "soil_layers": [
981
+ {
982
+ "layer_id": 1,
983
+ "depth_from": 0.0,
984
+ "depth_to": 2.5,
985
+ "soil_type": "clay",
986
+ "description": "Brown silty clay, ST sample, Su = 25 kPa",
987
+ "sample_type": "ST",
988
+ "strength_parameter": "Su",
989
+ "strength_value": 25,
990
+ "sieve_200_passing": 65,
991
+ "water_content": 35.5,
992
+ "color": "brown",
993
+ "moisture": "moist",
994
+ "consistency": "soft",
995
+ "su_source": "Unconfined Compression Test"
996
+ }
997
+ ],
998
+ "water_table": {
999
+ "depth": 3.0,
1000
+ "date_encountered": "2024-01-01"
1001
+ },
1002
+ "notes": "Additional observations"
1003
+ }
1004
+
1005
+ **CRITICAL EXAMPLES - MULTIPLE SU VALUES PER LAYER:**
1006
+
1007
+ **EXAMPLE 1 - Multiple Su Values (SUBDIVISION CASE):**
1008
+ Layer depth 2.0-6.0m with Su column showing:
1009
+ - "Su at 2.5m = 25 kPa"
1010
+ - "Su at 4.0m = 45 kPa"
1011
+ - "Su at 5.5m = 80 kPa"
1012
+
1013
+ PROCESSING: High variation (25-80 kPa, ratio 3.2x) β†’ SUBDIVISION RECOMMENDED
1014
+ β†’ Include ALL values in description: "Multiple Su values: 25, 45, 80 kPa [SUBDIVISION RECOMMENDED: High variation]"
1015
+ β†’ Use representative value (middle): strength_value=45
1016
+ β†’ Add metadata: subdivision_suggested=true, su_variation_high=true
1017
+
1018
+ **EXAMPLE 2 - Multiple Similar Su Values (AVERAGING CASE):**
1019
+ Layer depth 1.0-3.0m with Su column showing:
1020
+ - "Su = 35 kPa"
1021
+ - "Su = 40 kPa"
1022
+ - "Su = 38 kPa"
1023
+
1024
+ PROCESSING: Low variation (Β±7% from mean) β†’ USE AVERAGE
1025
+ β†’ Description: "Su averaged from 3 measurements: 35, 40, 38 kPa β†’ 37.7 kPa"
1026
+ β†’ Use: strength_value=37.7
1027
+
1028
+ **EXAMPLE 3 - Su Range Detection:**
1029
+ Layer with Su column: "Su ranges 40-60 kPa"
1030
+ β†’ Description: "Su range 40-60 kPa, using average 50 kPa"
1031
+ β†’ Use: strength_value=50
1032
+
1033
+ EXAMPLES OF CORRECT FIRST COLUMN SYMBOL RECOGNITION:
1034
+
1035
+ **SS SAMPLE EXAMPLES (First Column Priority):**
1036
+ 1. "SS-18 | Clay layer, N = 8, Su = 45 kPa from unconfined test"
1037
+ β†’ First column: SS-18 β†’ sample_type="SS" (HIGHEST PRIORITY)
1038
+ β†’ Use: strength_parameter="SPT-N", strength_value=8
1039
+ β†’ IGNORE the Su=45 kPa value for SS samples
1040
+
1041
+ 2. "SS18 | Soft clay, field N = 6, N-corrected = 10"
1042
+ β†’ First column: SS18 β†’ sample_type="SS" (HIGHEST PRIORITY)
1043
+ β†’ Use: strength_parameter="SPT-N", strength_value=6 (raw N)
1044
+ β†’ IGNORE N-corrected value
1045
+
1046
+ 3. "SS-5 | Brown clay, split spoon test, N=12"
1047
+ β†’ First column: SS-5 β†’ sample_type="SS" (HIGHEST PRIORITY)
1048
+ β†’ Use: strength_parameter="SPT-N", strength_value=12
1049
+
1050
+ **ST SAMPLE EXAMPLES (First Column Priority):**
1051
+ 1. "ST-5 | Stiff clay, Su = 85 kPa from unconfined compression"
1052
+ β†’ First column: ST-5 β†’ sample_type="ST" (HIGHEST PRIORITY)
1053
+ β†’ Use: strength_parameter="Su", strength_value=85
1054
+
1055
+ 2. "ST-12 | Medium clay, Su = 2.5 ksc from unconfined test"
1056
+ β†’ First column: ST-12 β†’ sample_type="ST" (HIGHEST PRIORITY)
1057
+ β†’ Convert: 2.5 Γ— 98 = 245 kPa
1058
+ β†’ Use: strength_parameter="Su", strength_value=245
1059
+
1060
+ 3. "ST3 | Clay, unconfined strength = 3.0 t/mΒ²"
1061
+ β†’ First column: ST3 β†’ sample_type="ST" (HIGHEST PRIORITY)
1062
+ β†’ Convert: 3.0 Γ— 9.81 = 29.43 kPa
1063
+ β†’ Use: strength_parameter="Su", strength_value=29.43
1064
+
1065
+ 4. "ST-8 | Gray clay, shelby tube, Su = 120 kPa"
1066
+ β†’ First column: ST-8 β†’ sample_type="ST" (HIGHEST PRIORITY)
1067
+ β†’ Use: strength_parameter="Su", strength_value=120
1068
+
1069
+ 5. "ST-10 | Gray clay, depth 3.0-6.0m, Su values: 35, 42, 39 kPa"
1070
+ β†’ First column: ST-10 β†’ sample_type="ST" (HIGHEST PRIORITY)
1071
+ β†’ Multiple values detected: variation <30% β†’ Use average
1072
+ β†’ Use: strength_parameter="Su", strength_value=38.7
1073
+ β†’ Description: "Gray clay, shelby tube, Su averaged from 3 measurements: 35, 42, 39 kPa β†’ 38.7 kPa"
1074
+
1075
+ 6. "ST-15 | Stiff clay, Su measurements: 45, 85, 120 kPa at different depths"
1076
+ β†’ First column: ST-15 β†’ sample_type="ST" (HIGHEST PRIORITY)
1077
+ β†’ High variation detected: ratio 2.7x β†’ SUBDIVISION RECOMMENDED
1078
+ β†’ Use: strength_parameter="Su", strength_value=85 (middle value)
1079
+ β†’ Description: "Stiff clay, multiple Su values: 45, 85, 120 kPa [SUBDIVISION RECOMMENDED: High variation]"
1080
+
1081
+ **SOIL CLASSIFICATION EXAMPLES:**
1082
+ 1. "Brown silty clay, no sieve analysis data"
1083
+ β†’ soil_type="clay", sieve_200_passing=null
1084
+ β†’ Note: "Assumed clay - no sieve analysis data available"
1085
+
1086
+ 2. "Sandy clay, sieve #200: 75% passing"
1087
+ β†’ soil_type="clay", sieve_200_passing=75
1088
+ β†’ Classification: Clay (>50% passing)
1089
+
1090
+ 3. "Medium sand, gradation test shows 25% passing #200"
1091
+ β†’ soil_type="sand", sieve_200_passing=25
1092
+ β†’ Classification: Sand (<50% passing)
1093
+
1094
+ 4. "Dense sand layer" (NO sieve data mentioned)
1095
+ β†’ soil_type="clay", sieve_200_passing=null
1096
+ β†’ Note: "Assumed clay - no sieve analysis data available"
1097
+ β†’ NEVER classify as sand without sieve data
1098
+
1099
+ TECHNICAL RULES:
1100
+ 1. All numeric values must be numbers, not strings
1101
+ 2. For soil_type, use basic terms: "clay", "sand", "silt", "gravel" - do NOT include consistency
1102
+ 3. Include sample_type field: "SS" (Split Spoon) or "ST" (Shelby Tube)
1103
+ 4. Include sieve_200_passing field when available (percentage passing sieve #200)
1104
+ 5. Include water_content field when available (percentage water content for clay consistency checks)
1105
+ 6. Include su_source field: "Unconfined Compression Test" for direct measurements, or "Calculated from SPT-N" for conversions
1106
+ 7. Strength parameters:
1107
+ - SS samples: ALWAYS use "SPT-N" with RAW N-value (will be converted based on soil type)
1108
+ - ST samples with clay: Use "Su" with DIRECT value in kPa from unconfined compression test
1109
+ - For sand/gravel: Always use "SPT-N" with N-value
1110
+ - NEVER use Su for SS samples, NEVER calculate Su from SPT-N for ST samples that have direct Su
1111
+ 8. Put consistency separately in "consistency" field: "soft", "medium", "stiff", "loose", "dense", etc.
1112
+ 9. Ensure continuous depths (no gaps or overlaps)
1113
+ 10. All depths in meters, strength values as numbers
1114
+ 11. Return ONLY the JSON object, no additional text"""
1115
+
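
Rule 9 (continuous depths) is easy to verify mechanically. A small checker; the 1 cm tolerance is an illustrative assumption:

```
def check_depth_continuity(layers: list, tol: float = 0.01) -> list:
    """Return gap/overlap messages for rule 9 (continuous depths, in meters)."""
    problems = []
    for upper, lower in zip(layers, layers[1:]):
        step = lower["depth_from"] - upper["depth_to"]
        if step > tol:
            problems.append(f"Gap of {step:.2f} m below {upper['depth_to']} m")
        elif step < -tol:
            problems.append(f"Overlap of {-step:.2f} m below {lower['depth_from']} m")
    return problems
```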
1116
+ def _parse_llm_response(self, response: str) -> Dict[str, Any]:
1117
+ """Parse LLM JSON response with enhanced error handling"""
1118
+
1119
+ # First check if response is empty or None
1120
+ if not response or not response.strip():
1121
+ return {"error": "Empty response from LLM", "raw_response": response or ""}
1122
+
1123
+ try:
1124
+ # Clean response
1125
+ json_str = response.strip()
1126
+
1127
+ # Log raw response for debugging (first 500 chars)
1128
+ st.info(f"πŸ“ Raw LLM response preview: {json_str[:500]}{'...' if len(json_str) > 500 else ''}")
1129
+
1130
+ # Remove markdown code blocks if present
1131
+ if "```json" in json_str:
1132
+ json_start = json_str.find("```json") + 7
1133
+ json_end = json_str.find("```", json_start)
1134
+ if json_end == -1:
1135
+ json_end = len(json_str)
1136
+ json_str = json_str[json_start:json_end].strip()
1137
+ st.info("πŸ”§ Extracted JSON from markdown code block")
1138
+ elif "```" in json_str:
1139
+ json_start = json_str.find("```") + 3
1140
+ json_end = json_str.rfind("```")
1141
+ if json_end > json_start:
1142
+ json_str = json_str[json_start:json_end].strip()
1143
+ st.info("πŸ”§ Extracted content from code block")
1144
+
1145
+ # Handle cases where LLM includes explanatory text before/after JSON
1146
+ # Look for JSON object boundaries more aggressively
1147
+ brace_start = json_str.find("{")
1148
+ brace_end = json_str.rfind("}")
1149
+
1150
+ if brace_start != -1 and brace_end != -1 and brace_end > brace_start:
1151
+ json_str = json_str[brace_start:brace_end + 1]
1152
+ st.info(f"πŸ”§ Extracted JSON object: {len(json_str)} characters")
1153
+ elif not json_str.startswith("{"):
1154
+ # No JSON found
1155
+ return {
1156
+ "error": f"No JSON object found in response. Response appears to be: {json_str[:200]}",
1157
+ "raw_response": response
1158
+ }
1159
+
1160
+ # Try to parse JSON
1161
+ result = json.loads(json_str)
1162
+
1163
+ # Validate structure
1164
+ if not isinstance(result, dict):
1165
+ return {"error": f"Expected JSON object, got {type(result)}", "raw_response": response}
1166
+
1167
+ if "soil_layers" not in result:
1168
+ result["soil_layers"] = []
1169
+ st.warning("⚠️ No 'soil_layers' found in response, using empty list")
1170
+
1171
+ if "project_info" not in result:
1172
+ result["project_info"] = {}
1173
+ st.warning("⚠️ No 'project_info' found in response, using empty dict")
1174
+
1175
+ st.success(f"βœ… JSON parsed successfully: {len(result.get('soil_layers', []))} layers found")
1176
+ return result
1177
+
1178
+ except json.JSONDecodeError as e:
1179
+ error_msg = f"JSON parsing failed: {str(e)}"
1180
+ st.error(f"❌ {error_msg}")
1181
+ st.error(f"πŸ“ Problematic content: {json_str[:300] if 'json_str' in locals() else 'N/A'}")
1182
+ return {"error": error_msg, "raw_response": response}
1183
+ except Exception as e:
1184
+ error_msg = f"Response parsing failed: {str(e)}"
1185
+ st.error(f"❌ {error_msg}")
1186
+ return {"error": error_msg, "raw_response": response}
1187
+
1188
+ def get_workflow_visualization(self) -> str:
1189
+ """Get a visual representation of the workflow steps"""
1190
+ return """
1191
+ πŸš€ **Unified Soil Analysis Workflow** πŸš€
1192
+
1193
+ **Step 1** πŸ” **Validate Inputs** β†’ Check API key, content, model
1194
+ **Step 2** πŸ€– **Extract with LLM** β†’ Use enhanced prompts for SS/ST classification
1195
+ **Step 3** βœ… **Validate Extraction** β†’ Check layer structure and data quality
1196
+ **Step 4** πŸ§ͺ **Process SS/ST Classification** β†’ Apply sample-specific processing
1197
+ **Step 5** πŸ”§ **Apply Unit Conversions** β†’ Convert all values to SI units (kPa)
1198
+ **Step 6** 🎯 **Validate Soil Classification** β†’ Enforce sieve analysis requirements
1199
+ **Step 7** πŸ“Š **Calculate Parameters** β†’ Compute Su, Ο†, and other properties
1200
+ **Step 8** βš™οΈ **Optimize Layers** β†’ Group and validate layer continuity
1201
+ **Step 9** πŸ“¦ **Finalize Results** β†’ Package complete analysis results
1202
+
1203
+ **Key Features:**
1204
+ β€’ **Unified Processing**: Single workflow handles all steps
1205
+ β€’ **SS/ST Classification**: Automatic sample type identification
1206
+ β€’ **Unit Conversion**: Su values extracted from images or text are converted to kPa
1207
+ β€’ **Sieve Analysis Enforcement**: Sand layers require #200 sieve data
1208
+ β€’ **Error Handling**: Comprehensive validation and recovery
1209
+ β€’ **State Management**: Complete workflow state tracking
1210
+ """
1211
+
1212
+ def analyze_soil_boring_log(self,
1213
+ text_content: Optional[str] = None,
1214
+ image_base64: Optional[str] = None,
1215
+ model: str = None,
1216
+ api_key: str = None,
1217
+ merge_similar: bool = True,
1218
+ split_thick: bool = True) -> Dict[str, Any]:
1219
+ """
1220
+ Run the unified soil analysis workflow
1221
+
1222
+ Args:
1223
+ text_content: Extracted text from document
1224
+ image_base64: Base64 encoded image
1225
+ model: LLM model to use
1226
+ api_key: API key for the selected LLM provider
1227
+ merge_similar: Whether to merge similar layers
1228
+ split_thick: Whether to split thick layers
1229
+
1230
+ Returns:
1231
+ Complete soil analysis results
1232
+ """
1233
+
1234
+ # Initialize state
1235
+ initial_state = SoilAnalysisState(
1236
+ text_content=text_content,
1237
+ image_base64=image_base64,
1238
+ model=model or get_default_provider_and_model()[1],
1239
+ api_key=api_key or "",
1240
+ merge_similar=merge_similar,
1241
+ split_thick=split_thick,
1242
+ llm_extraction_success=False,
1243
+ extraction_errors=[],
1244
+ retry_count=0, # Initialize retry counter
1245
+ project_info={},
1246
+ raw_soil_layers=[],
1247
+ processed_layers=[],
1248
+ water_table={},
1249
+ notes="",
1250
+ processing_summary={},
1251
+ validation_stats={},
1252
+ optimization_results={},
1253
+ final_soil_data={},
1254
+ workflow_status="initializing",
1255
+ workflow_messages=[]
1256
+ )
1257
+
1258
+ # Run workflow
1259
+ st.info("πŸš€ Starting unified soil analysis workflow...")
1260
+
1261
+ try:
1262
+ # Execute the workflow with recursion limit protection
1263
+ final_state = self.workflow.invoke(
1264
+ initial_state,
1265
+ config={"recursion_limit": 50} # Set explicit recursion limit
1266
+ )
1267
+
1268
+ # Return results
1269
+ if final_state["workflow_status"] == "completed":
1270
+ st.success("πŸŽ‰ Unified workflow completed successfully!")
1271
+ return final_state["final_soil_data"]
1272
+ else:
1273
+ st.error(f"❌ Workflow failed with status: {final_state['workflow_status']}")
1274
+ return final_state["final_soil_data"]
1275
+
1276
+ except Exception as e:
1277
+ error_msg = str(e)
1278
+ if "recursion limit" in error_msg.lower():
1279
+ st.error("❌ Workflow execution failed: Recursion limit reached. This may indicate a configuration issue with the model or workflow logic.")
1280
+ st.info("πŸ’‘ Try using a different model or check your input data format.")
1281
+ else:
1282
+ st.error(f"❌ Workflow execution failed: {error_msg}")
1283
+
1284
+ return {
1285
+ "error": f"Workflow execution failed: {error_msg}",
1286
+ "workflow_status": "execution_failed"
1287
+ }
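
For reference, a hypothetical end-to-end call of this method; the class name, import path, and model id below are placeholders, not names from this commit:

```
# Illustrative driver, assuming this method lives on a workflow class defined
# earlier in the module (class name and import path are placeholders).
from unified_workflow import UnifiedSoilAnalysisWorkflow  # hypothetical import

workflow = UnifiedSoilAnalysisWorkflow()
result = workflow.analyze_soil_boring_log(
    text_content=open("boring_log.txt").read(),
    model="anthropic/claude-3.5-sonnet",   # any supported provider model id
    api_key="sk-...",                      # key for the selected provider
    merge_similar=True,
    split_thick=True,
)

if "error" in result:
    print("Analysis failed:", result["error"])
else:
    for layer in result.get("soil_layers", []):
        print(layer["depth_from"], "-", layer["depth_to"], "m:", layer["soil_type"])
```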