<!-- Provide a quick summary of what the model is/does. -->

[`PatchTST`](https://huggingface.co/docs/transformers/model_doc/patchtst) is a transformer-based model for time series modeling tasks, including forecasting, regression, and classification. This repository contains a `PatchTST` model pre-trained on all seven channels of the `ETTh1` dataset.
This pre-trained model achieves a Mean Squared Error (MSE) of 0.3881 on the `test` split of the `ETTh1` dataset when forecasting 96 hours into the future with a historical data window of 512 hours.
For training and evaluating a `PatchTST` model, you can refer to this [demo notebook](https://github.com/IBM/tsfm/blob/main/notebooks/hfdemo/patch_tst_getting_started.ipynb).
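As a quick sketch of how this checkpoint could be loaded for inference with the `transformers` PatchTST classes (the repository id below is a placeholder, not the actual id of this repo):

```python
import torch
from transformers import PatchTSTForPrediction

# Placeholder repository id -- substitute the id of this model repo.
model = PatchTSTForPrediction.from_pretrained("<this-repo-id>")

# ETTh1 has 7 channels; this checkpoint expects a 512-step history window.
past_values = torch.randn(1, 512, 7)  # (batch, context_length, num_input_channels)

with torch.no_grad():
    outputs = model(past_values=past_values)

# Point forecasts with shape (batch, prediction_length, num_input_channels),
# here (1, 96, 7) for the 96-hour horizon.
forecast = outputs.prediction_outputs
print(forecast.shape)
```

In practice, `past_values` should go through the same standardization pipeline used during training; see the demo notebook above for the full data preparation.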
## Model Details

### Model Description

The `PatchTST` model was proposed in [A Time Series is Worth 64 Words: Long-term Forecasting with Transformers](https://arxiv.org/abs/2211.14730) by Yuqi Nie, Nam H. Nguyen, Phanwadee Sinthong, and Jayant Kalagnanam.
At a high level, the model vectorizes a time series into patches of a given size and encodes the resulting sequence of vectors via a Transformer, which then outputs the prediction-length forecast through an appropriate head.

The model is based on two key components: (i) segmentation of time series into subseries-level patches, which serve as input tokens to the Transformer; and (ii) channel-independence, where each channel contains a single univariate time series that shares the same embedding and Transformer weights across all the series.
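The patching step itself is simple to picture. A minimal sketch, assuming an illustrative patch length of 16 and stride of 8 (not necessarily this checkpoint's configuration):

```python
import torch

context_length, num_channels = 512, 7
patch_length, stride = 16, 8  # illustrative values, not this checkpoint's config

x = torch.randn(num_channels, context_length)  # one multivariate series

# Each channel is sliced independently into (possibly overlapping) patches.
patches = x.unfold(dimension=-1, size=patch_length, step=stride)
print(patches.shape)  # (7, 63, 16): 63 patches of 16 time steps per channel
```

Each patch is then linearly embedded into the model dimension, so the Transformer attends over 63 tokens per channel rather than 512 raw time steps.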
In addition, PatchTST has a modular design to seamlessly support masked time series pre-training as well as direct time series forecasting, classification, and regression.
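This modularity is reflected in the `transformers` API, where one configuration drives a shared backbone paired with task-specific heads. A sketch with illustrative values (not the values stored in this repo):

```python
from transformers import (
    PatchTSTConfig,
    PatchTSTForPretraining,  # masked patch reconstruction
    PatchTSTForPrediction,   # forecasting head
)

# Illustrative configuration, not this checkpoint's actual settings.
config = PatchTSTConfig(
    num_input_channels=7,
    context_length=512,
    prediction_length=96,
    patch_length=16,
)

# The same backbone configuration can be paired with different heads;
# PatchTSTForClassification and PatchTSTForRegression follow the same pattern.
pretrainer = PatchTSTForPretraining(config)
forecaster = PatchTSTForPrediction(config)
```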
<img src="patchtst_architecture.png" alt="Architecture" width="600" />
### Model Sources