---
title: T5S
emoji: πŸ’―
colorFrom: yellow
colorTo: red
sdk: streamlit
app_file: src/visualization/visualize.py
pinned: false
---

<h1 align="center">t5s</h1>

T5 Summarisation Using PyTorch Lightning

[![pypi Version](https://img.shields.io/pypi/v/t5s.svg?logo=pypi&logoColor=white)](https://pypi.org/project/t5s/)
[![Downloads](https://static.pepy.tech/personalized-badge/t5s?period=total&units=none&left_color=grey&right_color=orange&left_text=Pip%20Downloads)](https://pepy.tech/project/t5s)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Streamlit App](https://static.streamlit.io/badges/streamlit_badge_black_white.svg)](https://huggingface.co/spaces/gagan3012/summarization)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/gagan3012/summarization/blob/master/notebooks/t5s.ipynb)

## Usage

To use and run the DVC pipeline, install the `t5s` package:

```shell
pip install t5s
```



Instructions
------------
1. Clone the repo.
1. Edit `params.yml` to change the parameters used to train the model.
1. Run `make dirs` to create the missing parts of the directory structure described below. 
1. *Optional:* Run `make virtualenv` to create a python virtual environment. Skip if using conda or some other env manager.
    1. Run `source env/bin/activate` to activate the virtualenv. 
1. Run `make requirements` to install required python packages.
1. Process your data, then train and evaluate your model using `make run`.
1. When you're happy with the result, commit files (including .dvc files) to git.
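Taken together, the steps above amount to a shell session like the following (the repository URL is taken from the Colab badge above; the `make` targets are those listed in the instructions):

```shell
# Clone the project and enter it
git clone https://github.com/gagan3012/summarization.git
cd summarization

# Create any missing directories and (optionally) a virtualenv
make dirs
make virtualenv && source env/bin/activate

# Install dependencies, then process data, train, and evaluate
make requirements
make run
```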
 
Project Organization
------------

    β”œβ”€β”€ LICENSE
    β”œβ”€β”€ Makefile           <- Makefile with commands like `make dirs` or `make clean`
    β”œβ”€β”€ README.md          <- The top-level README for developers using this project.
    β”œβ”€β”€ data
    β”‚Β Β  β”œβ”€β”€ processed      <- The final, canonical data sets for modeling.
    β”‚Β Β  └── raw            <- The original, immutable data dump.
    β”‚
    β”œβ”€β”€ models             <- Trained and serialized models, model predictions, or model summaries
    β”‚
    β”œβ”€β”€ notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
    β”‚                         the creator's initials, and a short `-` delimited description, e.g.
    β”‚                         `1.0-jqp-initial-data-exploration`.
    β”œβ”€β”€ references         <- Data dictionaries, manuals, and all other explanatory materials.
    β”‚
    β”œβ”€β”€ reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
    β”‚Β Β  β”œβ”€β”€ metrics.txt    <- Relevant metrics after evaluating the model.
    β”‚Β Β  └── training_metrics.txt    <- Relevant metrics from training the model.
    β”‚
    β”œβ”€β”€ requirements.txt   <- The requirements file for reproducing the analysis environment
    β”‚
    β”œβ”€β”€ setup.py           <- makes project pip installable (pip install -e .) so src can be imported
    β”œβ”€β”€ src                <- Source code for use in this project.
    β”‚Β Β  β”œβ”€β”€ __init__.py    <- Makes src a Python module
    β”‚   β”‚
    β”‚Β Β  β”œβ”€β”€ data           <- Scripts to download or generate data
    β”‚Β Β  β”‚Β Β  β”œβ”€β”€ make_dataset.py
    β”‚Β Β  β”‚Β Β  └── process_data.py
    β”‚   β”‚
    β”‚Β Β  β”œβ”€β”€ models         <- Scripts to train models 
    β”‚Β Β  β”‚Β Β  β”œβ”€β”€ predict_model.py
    β”‚Β Β  β”‚Β Β  β”œβ”€β”€ train_model.py
    β”‚Β Β  β”‚Β Β  β”œβ”€β”€ evaluate_model.py
    β”‚Β Β  β”‚Β Β  └── model.py
    β”‚   β”‚
    β”‚Β Β  └── visualization  <- Scripts to create exploratory and results oriented visualizations
    β”‚Β Β      └── visualize.py
    β”‚
    β”œβ”€β”€ tox.ini            <- tox file with settings for running tox; see tox.testrun.org
    └── data.dvc           <- DVC file that tracks the pipeline for training a model on the processed data.


--------