Commit 68a2453 (verified), committed by mgbam · 1 Parent(s): 0d6622c

Update README.md

Files changed (1)
  1. README.md +77 -101
README.md CHANGED
@@ -4,113 +4,89 @@ emoji: 👀
  colorFrom: green
  colorTo: indigo
  sdk: gradio
- sdk_version: 5.34.0
  app_file: app.py
  pinned: false
  short_description: Analytic
  ---
  # 🔥 Odyssey: The AI Data Science Workspace
 
- ![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)
- ![Python Version](https://img.shields.io/badge/python-3.9+-indigo.svg)
- ![Status](https://img.shields.io/badge/status-beta-green.svg)
- ![Built with Gradio](https://img.shields.io/badge/Built%20with-Gradio-orange)
-
- Odyssey is not just an analytic tool; it's an AI-native, collaborative workspace designed to augment and accelerate the entire data science workflow. It moves beyond reactive profiling to a proactive, guided exploration experience, making you feel like you have a senior data scientist as your co-pilot.
-
-
- *(A conceptual image of the Odyssey UI)*
-
- ## Core Features
-
- Odyssey is built around four intelligent modules and a project-based workflow, providing a seamless journey from raw data to actionable insight.
-
- * 🔭 **Helios Overview**: A living, proactive dashboard that automatically runs upon data upload. It doesn't just show you stats; it surfaces critical insights like data quality issues, strong correlations, outlier alerts, and even suggests potential target variables for machine learning.
-
- * 🧪 **Asclepius Data Lab**: An interactive data preparation environment. Go beyond simple imputation with advanced methods like KNN for numeric data and smart categorical handling. See the impact of your changes instantly with live before-and-after visualizations.
-
- * 🚀 **Prometheus Launchpad**: A rapid machine learning modeling environment. Select a target and features, and with one click, train a model using robust 5-fold cross-validation. Instantly receive key performance metrics and advanced visualizations like ROC curves and residual plots to assess model viability.
-
- * 💡 **Athena Co-pilot**: A true AI collaborator. Athena understands the full context of your session—from the original data to the cleaned dataset and the models you've built. Ask it to perform complex analyses, generate plots, or even **build new, dynamic dashboards on the fly** right inside the chat.
-
- * 🗂️ **Project-Based Workflow**: Save your entire session—including cleaned data, chat history, and insights—into a single `.odyssey` file. Load projects later to pick up exactly where you left off.
-
- * 📄 **One-Click HTML Reports**: Generate a comprehensive, self-contained HTML report of your entire analysis, perfect for sharing with colleagues or stakeholders.
-
- ## 🚀 Getting Started
-
- Follow these steps to get Odyssey running on your local machine.
-
- ### Prerequisites
-
- * Python 3.9 or higher
- * `pip` package manager
- * `git` for cloning the repository
-
- ### 1. Clone the Repository
-
- Open your terminal and clone the project:
- ```bash
- git clone https://github.com/your-username/odyssey-ai-workspace.git
- cd odyssey-ai-workspace
- ```
-
- ### 2. Set Up a Virtual Environment
-
- It is highly recommended to use a virtual environment to manage dependencies and avoid conflicts.
-
- **On macOS/Linux:**
- ```bash
- python3 -m venv venv
- source venv/bin/activate
- ```
-
- **On Windows:**
- ```bash
  python -m venv venv
- .\venv\Scripts\activate
- ```
 
- ### 3. Install Dependencies
-
- Install all required packages using the `requirements.txt` file:
- ```bash
  pip install -r requirements.txt
- ```
-
- ### 4. Set Up Your API Key
-
- Odyssey's AI features are powered by the Google Gemini API.
-
- 1. Obtain a free API key from [Google AI Studio](https://aistudio.google.com/).
- 2. When you launch the application, you will see a field labeled "Gemini API Key". Paste your key there to activate the Athena Co-pilot and other AI features.
-
- ### 5. Run the Application
-
- Launch the Gradio application with the following command:
- ```bash
- python odyssey_app.py
- ```
- *(Assuming the main script is named `odyssey_app.py`)*
-
- Open your web browser and navigate to the local URL provided in the terminal (usually `http://127.0.0.1:7860`).
-
- ## 🧭 How to Use Odyssey
-
- 1. **Start a Project**: Give your project a name and upload a CSV file.
- 2. **Consult Helios**: Once uploaded, the **Helios Overview** will automatically populate with proactive insights. Review these findings to understand your data's strengths and weaknesses.
- 3. **Cleanse in the Lab**: Navigate to the **Asclepius Data Lab**. Use the dropdowns to select columns with missing data and apply imputation methods, previewing the effects in real-time.
- 4. **Launch a Model**: Go to the **Prometheus Launchpad**. Based on the suggestions from Helios, select a target variable and features. Choose a model and click "Launch" to see its predictive potential.
- 5. **Collaborate with Athena**: Open the **Athena Co-pilot**. Ask complex questions, request specific plots, or even ask it to build a custom dashboard (e.g., *"Build me a dashboard showing sales trends by region and product category."*).
- 6. **Save or Export**: Use the "Save" button to create a `.odyssey` file of your session, or click "Export Report" to generate a shareable HTML summary.
-
- ## 🤝 Contributing
-
- Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are **greatly appreciated**.
-
- Please feel free to submit a pull request or open an issue for any bugs, feature requests, or suggestions.
-
- ## 📝 License
-
- This project is licensed under the MIT License. See the `LICENSE` file for more details.
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
  colorFrom: green
  colorTo: indigo
  sdk: gradio
+ sdk_version: 5.34.1
  app_file: app.py
  pinned: false
  short_description: Analytic
  ---
  # 🔥 Odyssey: The AI Data Science Workspace
 
+ 🚀 CognitiveEDA: The Adaptive Intelligence Engine
+ ![alt text](https://img.shields.io/badge/version-4.0-blue.svg)
+
+ ![alt text](https://img.shields.io/badge/python-3.9+-indigo.svg)
+
+ ![alt text](https://img.shields.io/badge/license-MIT-green.svg)
+ CognitiveEDA is not just another EDA tool; it's a world-class data discovery platform that intelligently adapts to your data.
+ This enterprise-grade application goes beyond static profiling by automatically detecting the nature of your dataset (e.g., time-series, text-heavy) and unlocking specialized analysis modules on the fly. Powered by Google's Gemini LLM, it delivers a rich, context-aware, and deeply insightful user experience that transforms raw data into a clear narrative with actionable recommendations.
+ (A GIF showcasing the adaptive UI revealing specialized tabs after data upload)
+ ✨ Key Features: The "Wow" Factor
+ CognitiveEDA is designed to impress data professionals by providing intelligent, context-aware analysis that feels magical.
+ 🧠 Adaptive Analysis Modules: The UI isn't static. It intelligently detects your data's characteristics and dynamically reveals specialized tabs:
+ Time-Series Analysis: Automatically appears if date/time columns are found. Perform decomposition, check for stationarity (ADF Test), and visualize trends.
+ 📝 Text Analysis: Unlocks if long-form text columns are present. Instantly generate word clouds to visualize high-frequency terms.
+ 🧩 Clustering (K-Means): Becomes available for datasets with strong numeric features, allowing you to discover latent groups and customer segments.
+ 🤖 Hyper-Contextual AI Narrative: The integrated Gemini AI doesn't give a generic report. It receives context about the type of data it's analyzing, leading to far more specific and valuable insights (e.g., suggesting ARIMA for time-series or sentiment analysis for text).
+ ** Universal Data Ingestion:** Don't be limited to CSV. CognitiveEDA handles CSV and Excel files seamlessly.
+ ⚡ Performance-Aware: For massive datasets, the tool automatically samples the data for UI interactions to ensure a fast, responsive experience, while still using the full dataset for backend calculations where feasible.
+ 📊 Comprehensive Core EDA: All the essentials, done better:
+ Detailed Data Profiling (Missing values, numeric stats, categorical stats).
+ At-a-glance overview visuals (Data types, missing data heatmap, correlation matrix).
+ Interactive deep-dive tools for exploring individual features.
+ 🛠️ Tech Stack
+ This project leverages a modern, powerful stack for data science and web applications:
+ Backend & Data Analysis: Python, Pandas, NumPy, scikit-learn, statsmodels
+ Web Framework & UI: Gradio
+ AI Integration: Google Generative AI (Gemini)
+ Visualization: Plotly, Matplotlib, WordCloud
+ 🚀 Getting Started
+ You can get your own instance of CognitiveEDA running in just two steps.
+ 1. Prerequisites
+ Python 3.9 or higher.
+ A Google Gemini API Key. You can get a free key from Google AI Studio.
+ 2. Installation & Launch
+ First, clone the repository to your local machine:
+ Generated bash
+ git clone https://github.com/your-repo/CognitiveEDA.git
+ cd CognitiveEDA
+ Use code with caution.
+ Bash
+ Next, install all the required dependencies using the requirements.txt file. It's highly recommended to do this within a Python virtual environment.
+ Generated bash
+ # Create and activate a virtual environment (optional but recommended)
  python -m venv venv
+ source venv/bin/activate # On Windows, use `venv\Scripts\activate`
 
+ # Install all dependencies
  pip install -r requirements.txt
+ Use code with caution.
+ Bash
+ Finally, run the application:
+ Generated bash
+ python app.py
+ Use code with caution.
+ Bash
+ The application will start and provide a local URL (e.g., http://127.0.0.1:7860) that you can open in your web browser.
+ 📖 How to Use
+ Launch the application and open the URL in your browser.
+ Upload your data file using the "Upload Data File" component. Supported formats are .csv, .xlsx, and .xls.
+ Enter your Google Gemini API Key in the provided text field.
+ Click "Build My Dashboard".
+ Explore! The application will process your data and build a custom dashboard. The standard tabs (AI Narrative, Profile, Overview) will be populated, and any relevant specialized tabs (Time-Series, Text, Clustering) will automatically appear.
+ Interact with the dropdowns and sliders in each tab to perform deep-dive analyses.
+ 💡 Future Roadmap & Contributions
+ CognitiveEDA is an evolving platform. We welcome contributions from the community!
+ Potential Future Enhancements:
+ Geospatial Analysis Module: Automatically detect latitude/longitude or location names and generate map-based visualizations.
+ Interactive HTML Report Export: Export a single, beautiful, and fully interactive HTML file with embedded Plotly charts.
+ Database Connectors: Allow users to connect directly to PostgreSQL, MySQL, or BigQuery.
+ Background Job Processing: For extremely large datasets, allow full analysis to run as a background task with progress updates.
+ Advanced Caching: Implement more sophisticated caching to speed up re-analysis of the same data.
+ How to Contribute
+ Fork the repository.
+ Create a new branch for your feature (git checkout -b feature/AmazingNewFeature).
+ Commit your changes (git commit -m 'Add some AmazingNewFeature').
+ Push to the branch (git push origin feature/AmazingNewFeature).
+ Open a Pull Request.
+ 📄 License
+ This project is licensed under the MIT License - see the LICENSE file for details.
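
The README added in this commit says the UI reveals the Time-Series, Text, and Clustering tabs depending on the columns it finds, but the diff only touches README.md and does not show how app.py makes that decision. The following is a minimal sketch of one way such detection could work, using only the standard library. The names `classify_column` and `tabs_to_show`, the 50-character text threshold, the two-column minimum for clustering, and the ISO 8601 date format are all assumptions for illustration and are not taken from the repository.

```python
from datetime import datetime

# Hypothetical thresholds; the README does not state the real ones.
LONG_TEXT_MIN_AVG_CHARS = 50
MIN_NUMERIC_COLUMNS_FOR_CLUSTERING = 2


def _is_number(value):
    """Return True if the string parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def _is_iso_datetime(value):
    """Return True if the string parses as an ISO 8601 date or datetime."""
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def classify_column(values):
    """Classify raw string values as 'numeric', 'datetime', 'text', or 'categorical'."""
    # Ignore missing and blank cells when inferring the column kind.
    present = [v.strip() for v in values if v is not None and v.strip()]
    if not present:
        return "categorical"
    if all(_is_number(v) for v in present):
        return "numeric"
    if all(_is_iso_datetime(v) for v in present):
        return "datetime"
    average_length = sum(len(v) for v in present) / len(present)
    if average_length >= LONG_TEXT_MIN_AVG_CHARS:
        return "text"
    return "categorical"


def tabs_to_show(columns):
    """Map {column name: list of raw string values} to the tab names to display."""
    kinds = [classify_column(values) for values in columns.values()]
    # Standard tabs named in the README are always present.
    tabs = ["AI Narrative", "Profile", "Overview"]
    if "datetime" in kinds:
        tabs.append("Time-Series")
    if "text" in kinds:
        tabs.append("Text")
    if kinds.count("numeric") >= MIN_NUMERIC_COLUMNS_FOR_CLUSTERING:
        tabs.append("Clustering")
    return tabs
```

Under these assumptions, a table with an ISO date column, two numeric columns, and a column of long reviews would get all three specialized tabs, while a table with a single short categorical column would get only the standard ones. A pandas-based application would more likely inspect dtypes directly; the string-based checks here just keep the sketch free of dependencies.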