# Streamlit Project Guideline
## 1. Framework Overview
Streamlit is an open-source Python library that makes it easy to create and share custom web apps for machine learning and data science. It allows developers to build web applications quickly by writing pure Python scripts. Streamlit's key features include its simplicity, ease of use, and the ability to seamlessly integrate with other data science tools and libraries.
## 2. Component Structure
### Component Types
In the provided project, the components are primarily function-based, focusing on data processing, generation, and UI interactions. The main types of components identified are:
- **Data Processing Functions**: Functions like `process_json`, `generate_description_only`, and `analyze_input` handle data processing tasks.
- **UI Interaction Functions**: Functions like `example_directly_selected`, `example_from_briefs_selected`, and `example_selected` manage user interactions within the Streamlit UI.
- **Session State Management**: Functions and blocks that handle session state, such as initializing session state variables and updating them based on user actions.
### Example Components
#### Data Processing Function
```python
def process_json(input_json, model_name, generating_batch_size, temperature):
    try:
        model = ChatOpenAI(
            model=model_name, temperature=temperature, max_retries=3)
        generator = TaskDescriptionGenerator(model)
        result = generator.process(input_json, generating_batch_size)
        # Processing logic...
        return description, suggestions, examples_directly, input_analysis, new_example_briefs, examples_from_briefs, examples
    except Exception as e:
        st.warning(f"An error occurred: {str(e)}. Returning default values.")
        return "", [], [], "", [], [], []
```
#### UI Interaction Function
```python
def example_directly_selected():
    if 'selected_example_directly_id' in st.session_state:
        try:
            selected_example_ids = st.session_state.selected_example_directly_id[
                'selection']['rows']
            # Interaction logic...
        except Exception as e:
            st.session_state.selected_example = None
```
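The nested lookup `['selection']['rows']` above assumes the stored selection has exactly that shape. As a minimal sketch, the accessor below handles a missing or empty selection explicitly instead of relying on a broad `except`. It is written against a plain mapping so it can be exercised without Streamlit, and `first_selected_row` is a hypothetical helper name, not part of the project.

```python
from typing import Any, Mapping, Optional


def first_selected_row(event: Optional[Mapping[str, Any]]) -> Optional[int]:
    """Return the first selected row index, or None if nothing is selected.

    Assumes the event looks like {"selection": {"rows": [...]}}, as in the
    snippet above; any other shape is treated as "no selection".
    """
    if not event:
        return None
    selection = event.get("selection") or {}
    rows = selection.get("rows") or []
    return rows[0] if rows else None


print(first_selected_row({"selection": {"rows": [2, 5]}}))  # 2
print(first_selected_row({"selection": {"rows": []}}))      # None
print(first_selected_row(None))                             # None
```

A callback could then pass the stored selection to this helper and set `st.session_state.selected_example` to `None` whenever the helper returns `None`.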
#### Session State Management
```python
if 'input_data' not in st.session_state:
    st.session_state.input_data = pd.DataFrame(columns=["Input", "Output"])
if 'description_output_text' not in st.session_state:
    st.session_state.description_output_text = ''
```
## 3. UI Operations and Callbacks
### Common UI Operations
- **Button Clicks**: Handling button clicks to trigger data processing or state updates.
- **Data Editing**: Allowing users to edit data tables directly within the UI.
- **File Uploads and Downloads**: Managing file uploads for importing data and file downloads for exporting data.
### Example UI Operations
#### Button Click Handling
```python
submit_button = st.button(
    "Generate", type="primary", on_click=generate_examples_dataframe)
```
#### Data Editing
```python
input_data = st.data_editor(
    st.session_state.input_data,
    num_rows="dynamic",
    use_container_width=True,
    column_config={
        "Input": st.column_config.TextColumn("Input", width="large"),
        "Output": st.column_config.TextColumn("Output", width="large"),
    },
)
```
#### File Upload and Download
```python
input_file = st.file_uploader(
    label="Import Input Data from JSON",
    type="json",
    key="input_file",
    on_change=import_input_data_from_json
)
export_button = st.button(
    "Export Input Data to JSON", on_click=export_input_data_to_json
)
```
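The callbacks `import_input_data_from_json` and `export_input_data_to_json` are not shown in this guideline. As a rough sketch of the round trip they imply, the helpers below assume the data is a list of records with `Input` and `Output` keys, matching the table columns. The function names and the validation rules are hypothetical, not taken from the project.

```python
import json
from typing import Dict, List

COLUMNS = ("Input", "Output")  # column names taken from the data editor above


def rows_to_json(rows: List[Dict[str, str]]) -> str:
    """Serialize table rows to a JSON string suitable for a download."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


def json_to_rows(text: str) -> List[Dict[str, str]]:
    """Parse uploaded JSON and keep only well-formed Input/Output records."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of records")
    rows = []
    for item in data:
        # Skip records that are not objects or that lack a required column.
        if isinstance(item, dict) and all(col in item for col in COLUMNS):
            rows.append({col: str(item[col]) for col in COLUMNS})
    return rows


rows = [{"Input": "2+2", "Output": "4"}]
assert json_to_rows(rows_to_json(rows)) == rows
```

Keeping the parsing separate from the widgets means the import logic can be unit-tested on plain strings, while the callback only has to read the uploaded file's text and store the parsed rows in session state.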
## 4. State Management
### State Management Approach
Streamlit reruns the script from top to bottom on every user interaction, so ordinary variables are reset each time. `st.session_state` persists values across those reruns within a user's session, which enables multi-step interactions and workflows.
### Example State Management
#### Initializing Session State
```python
if 'input_data' not in st.session_state:
    st.session_state.input_data = pd.DataFrame(columns=["Input", "Output"])
```
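When many keys need defaults, the repeated `if ... not in st.session_state` checks can be folded into one helper. The sketch below is one possible shape, with a plain dict standing in for `st.session_state` so that it runs without Streamlit; `init_state` is a hypothetical name.

```python
from typing import Any, Dict, MutableMapping


def init_state(state: MutableMapping[str, Any], defaults: Dict[str, Any]) -> None:
    """Set each default only if the key is missing, so reruns keep user data."""
    for key, value in defaults.items():
        if key not in state:
            state[key] = value


# A plain dict stands in for st.session_state in this sketch.
state: Dict[str, Any] = {"description_output_text": "edited by user"}
init_state(state, {"description_output_text": "", "suggestions": []})
print(state)
# {'description_output_text': 'edited by user', 'suggestions': []}
```

In an app, the same call would presumably take `st.session_state` as its first argument. Because existing keys are left untouched, calling it at the top of the script on every rerun does not overwrite values the user has already changed.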
#### Updating Session State
```python
def update_description_output_text():
    input_json = package_input_data()
    result = generate_description_only(input_json, model_name, temperature)
    st.session_state.description_output_text = result[0]
    st.session_state.suggestions = result[1]
```
## 5. Routing
Streamlit has no client-side router of the kind found in JavaScript frontend frameworks. A single-script app such as this one behaves as a single page whose URL does not change, and navigation is typically handled through the sidebar or buttons that trigger reruns of the script. Streamlit also supports multipage apps, through a `pages/` directory or `st.navigation`, in which each page gets its own URL path.
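Sidebar-driven navigation of this kind can be sketched as a lookup from a selected label to a render function. The page names and functions below are hypothetical, and they return strings instead of drawing widgets so that the dispatch logic can run on its own.

```python
from typing import Callable, Dict


def render_generator() -> str:
    return "generator page"


def render_history() -> str:
    return "history page"


# Page registry: label shown in the sidebar -> function that draws the page.
PAGES: Dict[str, Callable[[], str]] = {
    "Generator": render_generator,
    "History": render_history,
}


def render(choice: str, pages: Dict[str, Callable[[], str]] = PAGES) -> str:
    """Dispatch to the selected page, falling back to the first one."""
    page = pages.get(choice, next(iter(pages.values())))
    return page()


print(render("History"))  # history page
print(render("Unknown"))  # generator page
```

In an app, `choice` would typically come from a widget such as `st.sidebar.radio("Page", list(PAGES))`, and each render function would call `st.*` functions instead of returning a string.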
## 6. Data Fetching
### Data Fetching Methods
Data fetching in Streamlit is often done through direct API calls within the script. The fetched data is then processed and displayed within the app.
### Example Data Fetching
```python
def process_json(input_json, model_name, generating_batch_size, temperature):
    try:
        model = ChatOpenAI(
            model=model_name, temperature=temperature, max_retries=3)
        generator = TaskDescriptionGenerator(model)
        result = generator.process(input_json, generating_batch_size)
        # Data processing logic...
        return description, suggestions, examples_directly, input_analysis, new_example_briefs, examples_from_briefs, examples
    except Exception as e:
        st.warning(f"An error occurred: {str(e)}. Returning default values.")
        return "", [], [], "", [], [], []
```
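The seven-element tuple returned by `process_json` is easy to misorder at call sites. One possible alternative, sketched here under assumptions, is a dataclass whose field defaults double as the fallback values. `GenerationResult` and `run_safely` are hypothetical names, and the field types are guessed from the default return values above, not taken from the project.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class GenerationResult:
    # Field names mirror the tuple returned by process_json; types are
    # inferred from its fallback values ("" and []), not from the project.
    description: str = ""
    suggestions: List = field(default_factory=list)
    examples_directly: List = field(default_factory=list)
    input_analysis: str = ""
    new_example_briefs: List = field(default_factory=list)
    examples_from_briefs: List = field(default_factory=list)
    examples: List = field(default_factory=list)


def run_safely(
    fetch: Callable[[], GenerationResult],
) -> Tuple[GenerationResult, Optional[str]]:
    """Call fetch(); on failure return an empty result plus the error text."""
    try:
        return fetch(), None
    except Exception as exc:  # broad on purpose, matching the original code
        return GenerationResult(), str(exc)


def failing_fetch() -> GenerationResult:
    raise RuntimeError("model unavailable")


result, error = run_safely(failing_fetch)
print(result.description == "", error)  # True model unavailable
```

Returning the error text instead of calling `st.warning` inside the helper keeps the fetching code free of UI calls; the caller can decide whether and where to display the message.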
## 7. Styling
### Styling Approaches
Streamlit provides basic styling and layout options through its API: `st.markdown` renders Markdown text, and `st.sidebar` organizes content into a side panel. For more advanced styling, custom HTML and CSS can be injected with `st.markdown(..., unsafe_allow_html=True)`; without that flag, raw HTML is escaped and not rendered.
### Example Styling
```python
st.title("LLM Task Example Generator")
st.markdown("Enter input-output pairs in the table below to generate a task description, analysis, and additional examples.")
```
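For the custom CSS case, the string passed to `st.markdown` has to be a `<style>` element. The helper below is a small sketch of building one from a dictionary; `css_block` is a hypothetical name, and the selector used in the example is illustrative only.

```python
from typing import Dict


def css_block(rules: Dict[str, Dict[str, str]]) -> str:
    """Render {selector: {property: value}} as a <style> element string."""
    parts = []
    for selector, props in rules.items():
        body = " ".join(f"{name}: {value};" for name, value in props.items())
        parts.append(f"{selector} {{ {body} }}")
    return "<style>" + " ".join(parts) + "</style>"


print(css_block({"h1": {"color": "teal", "font-size": "2rem"}}))
# <style>h1 { color: teal; font-size: 2rem; }</style>
```

The resulting string would be passed to `st.markdown` with `unsafe_allow_html=True`. Selectors that target Streamlit's internal markup are not a stable interface and may break between releases, so rules of this kind are best kept few and simple.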
## 8. Performance Optimization
### Optimization Techniques
- **Code Splitting**: Not applicable in Streamlit, because the app is a Python script that runs on the server and there is no application bundle for the developer to split.
- **Lazy Loading**: Not directly supported; however, conditional rendering can be used to load components only when needed.
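- **Memoization**: Use Streamlit's `@st.cache_data` decorator (the successor to the deprecated `@st.cache`) to cache expensive computations that return data; `@st.cache_resource` is intended for shared objects such as model clients or connections.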
### Example Optimization
```python
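@st.cache_data
def process_json(input_json, model_name, generating_batch_size, temperature):
    # Expensive computation; cached per unique combination of arguments.
    result = ...  # placeholder for the actual computation
    return result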
```
## 9. Testing
### Testing Methodologies
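Pure functions in a Streamlit application can be tested with standard Python testing frameworks such as `unittest` and `pytest`. Integration and end-to-end tests are harder because of Streamlit's rerun mechanism; recent Streamlit releases include `streamlit.testing.v1.AppTest`, which runs an app script headlessly and lets a test inspect its elements.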
### Example Testing
```python
import unittest

from your_streamlit_app import process_json


class TestProcessJson(unittest.TestCase):
    def test_process_json(self):
        input_json = '{"key": "value"}'
        result = process_json(input_json, "model_name", 3, 0.5)
        self.assertEqual(result[0], "expected_description")


if __name__ == "__main__":
    unittest.main()
```
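As written, the test above would reach the real model through `ChatOpenAI`, which needs network access and an API key. A sketch of isolating the logic with a stub from `unittest.mock` follows. `describe` is a hypothetical wrapper that receives the generator as an argument; only the `process(input_json, batch_size)` call shape is taken from the code in this guideline, and the dictionary return value is an assumption made for illustration.

```python
import unittest
from unittest.mock import Mock


def describe(input_json: str, generator, batch_size: int = 3) -> str:
    """Hypothetical wrapper: delegate to generator.process, never raise."""
    try:
        # Assumed return shape: a dict with a "description" key.
        return generator.process(input_json, batch_size)["description"]
    except Exception:
        return ""


class TestDescribe(unittest.TestCase):
    def test_returns_description(self):
        generator = Mock()
        generator.process.return_value = {"description": "adds numbers"}
        self.assertEqual(describe("{}", generator), "adds numbers")
        generator.process.assert_called_once_with("{}", 3)

    def test_falls_back_on_error(self):
        generator = Mock()
        generator.process.side_effect = RuntimeError("boom")
        self.assertEqual(describe("{}", generator), "")


# Run the two tests explicitly so the script does not exit the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestDescribe)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Passing the generator in as a parameter is the design choice that makes the stub possible; a function that constructs its own model client internally would need patching instead.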
## 10. Best Practices and Common Pitfalls
### Best Practices
- **Modular Code**: Organize code into reusable functions and modules.
- **Session State Management**: Use session state effectively to manage application state.
- **Error Handling**: Implement robust error handling to provide a smooth user experience.
- **Performance Optimization**: Use caching and efficient data handling to optimize performance.
### Common Pitfalls
- **Overuse of Reruns**: Avoid triggering unnecessary reruns, which can degrade performance.
- **Complex State Management**: Be cautious with complex state management, as it can lead to bugs and unexpected behavior.
- **Lack of Testing**: Neglecting testing can lead to issues that are hard to debug in a rapidly changing environment.
## Conclusion
This guideline provides a comprehensive overview of using Streamlit within a project, covering component structure, UI operations, state management, data fetching, styling, performance optimization, testing, and best practices. By following these guidelines, developers can create efficient, maintainable, and user-friendly Streamlit applications.