Update README.md
README.md
CHANGED
@@ -4,113 +4,89 @@ emoji: 👀
colorFrom: green
colorTo: indigo
sdk: gradio
-sdk_version: 5.34.
app_file: app.py
pinned: false
short_description: Analytic
---
# 🔥 Odyssey: The AI Data Science Workspace

CognitiveEDA is not just another EDA tool; it is a data discovery platform that adapts to your data.

It goes beyond static profiling by automatically detecting the nature of your dataset (e.g., time-series, text-heavy) and unlocking specialized analysis modules on the fly. Powered by Google's Gemini LLM, it provides a context-aware experience that turns raw data into a clear narrative with actionable recommendations.

(A GIF showcasing the adaptive UI revealing specialized tabs after data upload)

## ✨ Key Features: The "Wow" Factor

CognitiveEDA aims to give data professionals intelligent, context-aware analysis.

- 🧠 **Adaptive Analysis Modules:** The UI isn't static. It detects your data's characteristics and dynamically reveals specialized tabs:
  - ⌛ Time-Series Analysis: Appears automatically if date/time columns are found. Perform decomposition, check for stationarity (ADF test), and visualize trends.
  - 📝 Text Analysis: Unlocks if long-form text columns are present. Generate word clouds to visualize high-frequency terms.
  - 🧩 Clustering (K-Means): Becomes available for datasets with strong numeric features, allowing you to discover latent groups and customer segments.
- 🤖 **Hyper-Contextual AI Narrative:** The integrated Gemini model doesn't give a generic report. It receives context about the type of data it is analyzing, so its insights can be more specific (e.g., suggesting ARIMA for time-series data or sentiment analysis for text).
- **Universal Data Ingestion:** You are not limited to CSV. CognitiveEDA handles both CSV and Excel files.
- ⚡ **Performance-Aware:** For very large datasets, the tool automatically samples the data for UI interactions to keep the interface responsive, while still using the full dataset for backend calculations where feasible.
- 📊 **Comprehensive Core EDA:** All the essentials:
  - Detailed data profiling (missing values, numeric stats, categorical stats).
  - At-a-glance overview visuals (data types, missing data heatmap, correlation matrix).
  - Interactive deep-dive tools for exploring individual features.
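
The adaptive tab selection described above could be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the code in `app.py`: the function names, the thresholds (`MIN_AVG_TEXT_LEN`, `MIN_NUMERIC_COLS`), and the assumption that each column arrives as a list of strings are all hypothetical.

```python
from datetime import datetime

# Hypothetical thresholds; the real cut-offs used by the app are not documented here.
MIN_AVG_TEXT_LEN = 50   # average characters per cell to count as long-form text
MIN_NUMERIC_COLS = 2    # numeric columns needed before clustering is offered


def _all_parse(values, parser):
    """Return True if at least one non-empty value exists and all of them parse."""
    seen = False
    for value in values:
        if value is None or value == "":
            continue  # treat empty cells as missing, not as failures
        try:
            parser(value)
        except (TypeError, ValueError):
            return False
        seen = True
    return seen


def classify_column(values):
    """Label a column of strings as 'numeric', 'datetime', 'text', or 'categorical'."""
    if _all_parse(values, float):
        return "numeric"
    if _all_parse(values, datetime.fromisoformat):
        return "datetime"
    filled = [v for v in values if v]
    if filled and sum(len(v) for v in filled) / len(filled) >= MIN_AVG_TEXT_LEN:
        return "text"
    return "categorical"


def detect_modules(columns):
    """Map {column name: list of string values} to the specialized tabs to reveal."""
    kinds = [classify_column(vals) for vals in columns.values()]
    return {
        "time_series": kinds.count("datetime") > 0,
        "text": kinds.count("text") > 0,
        "clustering": kinds.count("numeric") >= MIN_NUMERIC_COLS,
    }
```

Calling `detect_modules` on a table with one ISO-formatted date column and two numeric columns would enable the time-series and clustering tabs but not the text tab. A real implementation would more likely inspect pandas dtypes instead of parsing strings.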

## 🛠️ Tech Stack

This project leverages a modern stack for data science and web applications:

- Backend & Data Analysis: Python, Pandas, NumPy, scikit-learn, statsmodels
- Web Framework & UI: Gradio
- AI Integration: Google Generative AI (Gemini)
- Visualization: Plotly, Matplotlib, WordCloud

## 🚀 Getting Started

You can get your own instance of CognitiveEDA running in just two steps.

### 1. Prerequisites

- Python 3.9 or higher.
- A Google Gemini API key. You can get a free key from Google AI Studio.

### 2. Installation & Launch

First, clone the repository to your local machine:

    git clone https://github.com/your-repo/CognitiveEDA.git
    cd CognitiveEDA

Next, install all the required dependencies using the requirements.txt file. It's highly recommended to do this within a Python virtual environment.

    # Create and activate a virtual environment (optional but recommended)
    python -m venv venv
    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

    # Install all dependencies
    pip install -r requirements.txt

Finally, run the application:

    python app.py

The application will start and provide a local URL (e.g., http://127.0.0.1:7860) that you can open in your web browser.

## 📖 How to Use

1. Launch the application and open the URL in your browser.
2. Upload your data file using the "Upload Data File" component. Supported formats are .csv, .xlsx, and .xls.
3. Enter your Google Gemini API key in the provided text field.
4. Click "Build My Dashboard".
5. Explore! The application will process your data and build a custom dashboard. The standard tabs (AI Narrative, Profile, Overview) will be populated, and any relevant specialized tabs (Time-Series, Text, Clustering) will automatically appear.
6. Interact with the dropdowns and sliders in each tab to perform deep-dive analyses.
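
As an illustration of the upload step, the snippet below sketches one way to choose a reader from the file extension for the three supported formats. It is an assumption about how such dispatch might look, not the actual logic in `app.py`; `reader_name` and `load_table` are hypothetical helper names, while `read_csv` and `read_excel` are the standard pandas functions.

```python
from pathlib import Path

# Supported upload formats from the steps above, mapped to pandas reader functions.
SUPPORTED_READERS = {
    ".csv": "read_csv",
    ".xlsx": "read_excel",
    ".xls": "read_excel",
}


def reader_name(path):
    """Return the name of the pandas reader for `path`, based on its extension."""
    suffix = Path(path).suffix.lower()  # case-insensitive: "DATA.XLSX" also works
    if suffix not in SUPPORTED_READERS:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    return SUPPORTED_READERS[suffix]


def load_table(path):
    """Load a CSV or Excel file into a DataFrame (hypothetical helper)."""
    import pandas as pd  # imported lazily so reader_name() works without pandas

    return getattr(pd, reader_name(path))(path)
```

Rejecting unknown extensions up front gives the user a clear error before any parsing is attempted. Note that reading Excel files with pandas requires an additional engine package to be installed.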

## 💡 Future Roadmap & Contributions

CognitiveEDA is an evolving platform. We welcome contributions from the community!

Potential Future Enhancements:

- Geospatial Analysis Module: Automatically detect latitude/longitude or location names and generate map-based visualizations.
- Interactive HTML Report Export: Export a single, fully interactive HTML file with embedded Plotly charts.
- Database Connectors: Allow users to connect directly to PostgreSQL, MySQL, or BigQuery.
- Background Job Processing: For extremely large datasets, allow full analysis to run as a background task with progress updates.
- Advanced Caching: Implement more sophisticated caching to speed up re-analysis of the same data.

### How to Contribute

1. Fork the repository.
2. Create a new branch for your feature (`git checkout -b feature/AmazingNewFeature`).
3. Commit your changes (`git commit -m 'Add some AmazingNewFeature'`).
4. Push to the branch (`git push origin feature/AmazingNewFeature`).
5. Open a Pull Request.

## 📄 License

This project is licensed under the MIT License. See the LICENSE file for details.