---
title: EDAONSTERIOD
emoji: π
colorFrom: green
colorTo: indigo
sdk: gradio
sdk_version: 5.34.1
app_file: app.py
pinned: false
short_description: Analytic
---

# Odyssey: The AI Data Science Workspace

## CognitiveEDA: The Adaptive Intelligence Engine

CognitiveEDA is a data discovery platform that adapts its analysis to your data instead of producing a one-size-fits-all profile.

It goes beyond static profiling by automatically detecting the nature of your dataset (e.g., time-series, text-heavy) and unlocking specialized analysis modules on the fly. Powered by Google's Gemini LLM, it turns raw data into a context-aware narrative with actionable recommendations.

(A GIF showcasing the adaptive UI revealing specialized tabs after data upload)

## Key Features: The "Wow" Factor

CognitiveEDA is designed to give data professionals intelligent, context-aware analysis with minimal setup.

- **Adaptive Analysis Modules:** The UI isn't static. It detects your data's characteristics and dynamically reveals specialized tabs:
  - **Time-Series Analysis:** Appears automatically if date/time columns are found. Perform decomposition, check for stationarity (ADF test), and visualize trends.
  - **Text Analysis:** Unlocks if long-form text columns are present. Generate word clouds to visualize high-frequency terms.
  - **Clustering (K-Means):** Becomes available for datasets with strong numeric features, allowing you to discover latent groups such as customer segments.
- **Hyper-Contextual AI Narrative:** The integrated Gemini AI doesn't give a generic report. It receives context about the type of data it is analyzing, which leads to more specific and valuable insights (e.g., suggesting ARIMA for time-series or sentiment analysis for text).
- **Universal Data Ingestion:** CognitiveEDA is not limited to CSV; it handles both CSV and Excel files.
- **Performance-Aware:** For massive datasets, the tool automatically samples the data for UI interactions to keep the experience fast and responsive, while still using the full dataset for backend calculations where feasible.
- **Comprehensive Core EDA:** All the essentials:
  - Detailed data profiling (missing values, numeric stats, categorical stats).
  - At-a-glance overview visuals (data types, missing data heatmap, correlation matrix).
  - Interactive deep-dive tools for exploring individual features.
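
The adaptive behavior described above can be pictured as a small detection step that inspects column types before the tabs are built. The following is a minimal sketch with pandas, not the application's actual logic: the function name `detect_modules`, both thresholds, and the returned keys are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical thresholds; the real application may use different rules.
MIN_AVG_TEXT_LENGTH = 50  # mean characters for a column to count as long-form text
MIN_NUMERIC_COLUMNS = 2   # numeric columns needed before clustering is offered


def detect_modules(df: pd.DataFrame) -> dict:
    """Map each specialized tab to the columns that would unlock it."""
    # Only columns already typed as datetimes are detected here; date strings
    # would first need to be parsed (e.g. with pd.to_datetime).
    datetime_cols = list(df.select_dtypes(include=["datetime", "datetimetz"]).columns)

    # "number" covers integer and float columns but excludes booleans.
    numeric_cols = list(df.select_dtypes(include="number").columns)

    # A string column counts as long-form text when its average length is large.
    text_cols = []
    for col in df.columns:
        series = df[col]
        is_stringy = (pd.api.types.is_object_dtype(series)
                      or pd.api.types.is_string_dtype(series))
        if not is_stringy:
            continue
        values = series.dropna().astype(str)
        if not values.empty and values.str.len().mean() >= MIN_AVG_TEXT_LENGTH:
            text_cols.append(col)

    return {
        "time_series": datetime_cols,
        "text": text_cols,
        # Clustering is only offered when enough numeric features exist.
        "clustering": numeric_cols if len(numeric_cols) >= MIN_NUMERIC_COLUMNS else [],
    }
```

An empty list for a key would mean the corresponding tab stays hidden. Returning the triggering columns rather than plain booleans lets the UI pre-populate dropdowns in each specialized tab.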

## Tech Stack

This project uses the following stack for data science and web applications:

- **Backend & Data Analysis:** Python, Pandas, NumPy, scikit-learn, statsmodels
- **Web Framework & UI:** Gradio
- **AI Integration:** Google Generative AI (Gemini)
- **Visualization:** Plotly, Matplotlib, WordCloud
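
To illustrate how the AI integration might receive context about the type of data it is analyzing, here is a minimal sketch that assembles a context-aware prompt in plain Python. The function name `build_narrative_prompt`, the `ANALYSIS_HINTS` table, and all prompt wording are assumptions for illustration; the sketch stops before any call to the Gemini API.

```python
# Hypothetical hint text per detected data kind; the wording is illustrative only.
ANALYSIS_HINTS = {
    "time_series": ("Date/time columns are present: discuss trend and seasonality, "
                    "and consider forecasting models such as ARIMA."),
    "text": ("Long-form text columns are present: consider sentiment analysis "
             "and frequent-term summaries."),
    "clustering": ("Several numeric features are present: consider segmenting "
                   "the rows into latent groups."),
}


def build_narrative_prompt(n_rows: int, n_cols: int, data_kinds: list[str]) -> str:
    """Compose a prompt that tells the model what kind of data it is analyzing."""
    lines = [
        "You are a data analyst writing an exploratory data analysis report.",
        f"The dataset has {n_rows} rows and {n_cols} columns.",
    ]
    # Add one hint per recognized data kind; unknown kinds are ignored.
    hints = [ANALYSIS_HINTS[kind] for kind in data_kinds if kind in ANALYSIS_HINTS]
    if hints:
        lines.append("Context about this dataset:")
        lines.extend(f"- {hint}" for hint in hints)
    else:
        lines.append("No specialized data characteristics were detected; "
                     "give a general overview.")
    lines.append("Summarize the key findings and give actionable recommendations.")
    return "\n".join(lines)
```

The resulting string would then be passed to whichever Gemini client the application uses. Keeping prompt construction separate from the API call makes the context logic easy to test without a key.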

## Getting Started

You can get your own instance of CognitiveEDA running in two steps.

### 1. Prerequisites

- Python 3.9 or higher.
- A Google Gemini API key. You can get a free key from Google AI Studio.

### 2. Installation & Launch

First, clone the repository to your local machine:

```bash
git clone https://github.com/your-repo/CognitiveEDA.git
cd CognitiveEDA
```

Next, install the required dependencies from `requirements.txt`. It's highly recommended to do this within a Python virtual environment.

```bash
# Create and activate a virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

# Install all dependencies
pip install -r requirements.txt
```

Finally, run the application:

```bash
python app.py
```

The application will start and print a local URL (e.g., http://127.0.0.1:7860) that you can open in your web browser.

## How to Use

1. Launch the application and open the URL in your browser.
2. Upload your data file using the "Upload Data File" component. Supported formats are .csv, .xlsx, and .xls.
3. Enter your Google Gemini API key in the provided text field.
4. Click "Build My Dashboard".
5. Explore! The application processes your data and builds a custom dashboard. The standard tabs (AI Narrative, Profile, Overview) are populated, and any relevant specialized tabs (Time-Series, Text, Clustering) appear automatically.
6. Interact with the dropdowns and sliders in each tab to perform deep-dive analyses.
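
For readers curious how the upload step could handle the three supported formats and keep the UI responsive on large files, the following is a minimal sketch using pandas. The names `load_table` and `sample_for_ui`, the `UI_SAMPLE_ROWS` cap, and the fixed seed are assumptions for illustration and may not match what `app.py` does.

```python
from pathlib import Path

import pandas as pd

UI_SAMPLE_ROWS = 10_000  # hypothetical cap on rows used for interactive views


def load_table(path: str) -> pd.DataFrame:
    """Read a CSV or Excel file, dispatching on the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in {".xlsx", ".xls"}:
        # read_excel needs an Excel engine installed (e.g. openpyxl for .xlsx).
        return pd.read_excel(path)
    raise ValueError(f"Unsupported file type: {suffix!r}")


def sample_for_ui(df: pd.DataFrame, max_rows: int = UI_SAMPLE_ROWS,
                  seed: int = 42) -> pd.DataFrame:
    """Return the frame unchanged if small, else a reproducible random sample."""
    if len(df) <= max_rows:
        return df
    return df.sample(n=max_rows, random_state=seed)
```

In this sketch the full DataFrame stays available for backend calculations, while only the sampled copy would feed plots and interactive widgets. The fixed seed keeps the sample stable across reruns of the same file.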

## Future Roadmap & Contributions

CognitiveEDA is an evolving platform. We welcome contributions from the community!

### Potential Future Enhancements

- **Geospatial Analysis Module:** Automatically detect latitude/longitude or location names and generate map-based visualizations.
- **Interactive HTML Report Export:** Export a single, fully interactive HTML file with embedded Plotly charts.
- **Database Connectors:** Allow users to connect directly to PostgreSQL, MySQL, or BigQuery.
- **Background Job Processing:** For extremely large datasets, allow full analysis to run as a background task with progress updates.
- **Advanced Caching:** Implement more sophisticated caching to speed up re-analysis of the same data.

### How to Contribute

1. Fork the repository.
2. Create a new branch for your feature (`git checkout -b feature/AmazingNewFeature`).
3. Commit your changes (`git commit -m 'Add some AmazingNewFeature'`).
4. Push to the branch (`git push origin feature/AmazingNewFeature`).
5. Open a Pull Request.

## License

This project is licensed under the MIT License. See the LICENSE file for details.