# UnivariateAnalysis

> **Note:** The following examples assume a time series DataFrame similar to `complaints.csv`, with two columns: `date` and `complaints`.

The `UnivariateAnalysis` class provides a suite of methods for exploratory and statistical analysis of univariate time series data. It helps you understand the distribution, missing values, and outliers in your time series before further modeling or forecasting.

## Features

- Visualizes the distribution and boxplot of the target time series.
- Computes skewness and kurtosis with interpretation.
- Checks for missing values and provides recommendations.
- Detects outliers using IQR and Z-score methods.
- Logs plots and messages to HTML reports.

## Class: `UnivariateAnalysis`

### Initialization

```python
UnivariateAnalysis(df: pd.DataFrame, target_col: str, index_col: str = "date", output_filepath: str = "output_filepath")
```

- **df**: The time series DataFrame (indexed by the time column).
- **target_col**: The column name of the univariate time series to analyze.
- **index_col**: The name of the time index column (default: `"date"`).
- **output_filepath**: Path prefix for saving HTML reports and plots (default: `"output_filepath"`).

> **Note:** Your DataFrame should have a time-based index (e.g., "date", "timestamp").
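One way to obtain such a DataFrame is sketched below with plain pandas. The inline CSV text and its values are made up for illustration and stand in for `complaints.csv`. Whether `UnivariateAnalysis` also accepts the date as an ordinary column named by `index_col` is not stated here, so this sketch simply sets it as the index.

```python
import io

import pandas as pd

# Hypothetical sample standing in for complaints.csv (values are made up).
csv_text = """date,complaints
2024-01-01,12
2024-01-02,15
2024-01-03,9
2024-01-04,14
"""

# Parse the "date" column as datetimes and use it as the index.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], index_col="date")

print(type(df.index).__name__)
print(df.head())
```

With a real file, the same call would take the file path in place of `io.StringIO(csv_text)`.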



### Methods



#### `plot_distribution()`



Plots the histogram and boxplot of the target time series column and logs the plot to the HTML report.



**Standalone Example:**
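```python
from dynamicts.analysis import UnivariateAnalysis

# Assumes df is a time series DataFrame loaded beforehand (see the note at the top).
analysis = UnivariateAnalysis(df, target_col="complaints", index_col="date", output_filepath="report")

# Build the histogram and boxplot figure, then display it.
fig = analysis.plot_distribution()
fig.show()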
```

#### `check_distribution_stats()`

Computes skewness and kurtosis for the target column, interprets the results, and logs the summary to the HTML report.

**Standalone Example:**
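```python
from dynamicts.analysis import UnivariateAnalysis

# Assumes df is a time series DataFrame loaded beforehand (see the note at the top).
analysis = UnivariateAnalysis(df, target_col="complaints", index_col="date", output_filepath="report")

# Compute skewness and kurtosis, then print the combined interpretation.
stats = analysis.check_distribution_stats()
print(stats["full_message"])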
```
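To make the two statistics concrete, here is a minimal pandas sketch of skewness and excess kurtosis with a simple interpretation. It is not the library's implementation: the sample values, the `interpret_skew` helper, and the 0.5 cutoff are assumptions chosen for illustration.

```python
import pandas as pd

# Made-up sample with one large value to produce a long right tail.
series = pd.Series([1, 2, 2, 3, 3, 3, 4, 4, 20], name="complaints")

skewness = series.skew()  # sample skewness; 0 for a symmetric distribution
kurtosis = series.kurt()  # excess kurtosis; 0 for a normal distribution


def interpret_skew(value: float, cutoff: float = 0.5) -> str:
    """Hypothetical helper: label skewness using an illustrative cutoff."""
    if value > cutoff:
        return "right-skewed"
    if value < -cutoff:
        return "left-skewed"
    return "approximately symmetric"


print(f"skewness={skewness:.2f} ({interpret_skew(skewness)})")
print(f"excess kurtosis={kurtosis:.2f}")
```

A strongly positive skewness together with positive excess kurtosis, as in this sample, usually points to a heavy right tail, which is worth checking against the outlier detection results.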

#### `check_missing_values()`

Checks for missing values in the target column, reports the count and percentage, and logs recommendations to the HTML report.

**Standalone Example:**
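```python
from dynamicts.analysis import UnivariateAnalysis

# Assumes df is a time series DataFrame loaded beforehand (see the note at the top).
analysis = UnivariateAnalysis(df, target_col="complaints", index_col="date", output_filepath="report")

# Count missing values in the target column and print the summary.
missing = analysis.check_missing_values()
print(missing["message"])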
```
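The count and percentage described above can be reproduced with plain pandas, as in the sketch below. The sample values are made up, and the variable names are illustrative rather than the keys the library returns.

```python
import numpy as np
import pandas as pd

# Made-up sample with two missing observations out of eight.
series = pd.Series([12, np.nan, 9, 14, np.nan, 11, 10, 13], name="complaints")

missing_count = int(series.isna().sum())          # number of NaN entries
missing_pct = 100 * missing_count / len(series)   # share of the column that is missing

print(f"Missing values: {missing_count} ({missing_pct:.1f}%)")
```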

#### `detect_outliers(method="both", plot=True)`



Detects outliers in the target column using IQR, Z-score, or both. Optionally plots and logs the results.



- **method**: `"iqr"`, `"zscore"`, or `"both"` (default: `"both"`).
- **plot**: Whether to plot the outliers (default: `True`).



**Standalone Example:**

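```python
from dynamicts.analysis import UnivariateAnalysis

# Assumes df is a time series DataFrame loaded beforehand (see the note at the top).
analysis = UnivariateAnalysis(df, target_col="complaints", index_col="date", output_filepath="report")

# Run both detection methods and plot the flagged points.
outliers = analysis.detect_outliers(method="both", plot=True)
print(f"Outliers detected: {outliers['outliers_detected']}")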
```
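For readers unfamiliar with the two methods, the sketch below shows the conventional IQR and Z-score rules in plain pandas. The 1.5 × IQR fences and the |z| > 3 cutoff are common defaults assumed here; the thresholds the library actually uses are not stated in this document, and the sample values are made up.

```python
import pandas as pd

# Made-up sample: eleven typical values and one extreme value.
series = pd.Series([10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 50], name="complaints")

# IQR rule: flag points beyond 1.5 * IQR outside the quartiles (assumed multiplier).
q1 = series.quantile(0.25)
q3 = series.quantile(0.75)
iqr = q3 - q1
iqr_mask = (series < q1 - 1.5 * iqr) | (series > q3 + 1.5 * iqr)

# Z-score rule: flag points more than 3 population standard deviations from the mean
# (assumed cutoff).
z_scores = (series - series.mean()) / series.std(ddof=0)
z_mask = z_scores.abs() > 3

print("IQR outliers:", series[iqr_mask].tolist())
print("Z-score outliers:", series[z_mask].tolist())
```

The IQR rule relies on quartiles and is therefore less affected by the extreme values it is trying to find, whereas the Z-score rule uses the mean and standard deviation, which a large outlier inflates. In small samples this can keep a genuine outlier below the cutoff, which is one reason to compare both methods.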



#### `run_univariate_analysis(df, output_filepath, target_col, index_col="date")` (static method)



Runs the full univariate analysis pipeline: distribution plot, stats, missing values, and outlier detection. Displays results in a notebook environment.



**Standalone Example:**

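```python
from dynamicts.analysis import UnivariateAnalysis

# Assumes df is a time series DataFrame loaded beforehand (see the note at the top).
# Static method: no UnivariateAnalysis instance is needed.
results = UnivariateAnalysis.run_univariate_analysis(
    df=df,
    output_filepath="report",
    target_col="complaints",
    index_col="date"
)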
```

### Notes

- All plots and messages are logged to HTML reports using the provided `output_filepath`.
- The DataFrame should be indexed by the time column for proper time series analysis.

---