updated readme
README.md
CHANGED
@@ -58,14 +58,18 @@ widget:
|
|
58 |
example_title: "gast and dust"
|
59 |
---
|
60 |
|
61 |
-
# astroBERT
|
62 |
-
|
63 |
-
|
64 |
-
This model is cased (it treats `ads` and `ADS` differently).
|
65 |
|
66 |
-
##
|
67 |
-
0. [
|
68 |
-
1. [
|
|
|
|
|
|
|
|
|
|
|
69 |
|
70 |
|
71 |
### BibTeX
|
|
|
58 |
example_title: "gast and dust"
|
59 |
---
|
60 |
|
61 |
+
# ***astroBERT: a language model for astrophysics***
|
62 |
+
This public repository contains the work of [NASA/ADS](https://ui.adsabs.harvard.edu/) on building a natural language processing (NLP) model tailored to astrophysics, along with tutorials and miscellaneous related files.
|
63 |
+
This model is **cased** (it treats `ads` and `ADS` differently).
|
|
|
64 |
|
65 |
+
## astroBERT models
|
66 |
+
0. **Base model**: A model pretrained on English-language text using masked language modeling (MLM) and next sentence prediction (NSP) objectives. It was introduced in [this paper at ADASS 2021](https://arxiv.org/abs/2112.00590) and made public at ADASS 2022.
|
67 |
+
1. **NER-DEAL model**: The base model with an added token classification head, fine-tuned on the [DEAL@WIESP2022 named entity recognition](https://ui.adsabs.harvard.edu/WIESP/2022/SharedTasks) task. It must be loaded from the `NER-DEAL` branch by passing `revision='NER-DEAL'` (see tutorial 2).
|
68 |
+
|
69 |
+
### Tutorials
|
70 |
+
0. [generate text embeddings (for downstream tasks)](https://nbviewer.org/urls/huggingface.co/adsabs/astroBERT/raw/main/Tutorials/0_Embeddings.ipynb)
|
71 |
+
1. [use astroBERT for the Fill-Mask task](https://nbviewer.org/urls/huggingface.co/adsabs/astroBERT/raw/main/Tutorials/1_Fill-Mask.ipynb)
|
72 |
+
2. [make NER-DEAL predictions](https://nbviewer.org/urls/huggingface.co/adsabs/astroBERT/raw/main/Tutorials/2_NER_DEAL.ipynb)
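
The two models listed above share one repository and differ in the branch they are loaded from. The snippet below is a minimal sketch of selecting either one with the Hugging Face `transformers` library. The helper names `revision_kwargs` and `load_astrobert` are hypothetical, the repository id `adsabs/astroBERT` is inferred from the tutorial URLs, and pairing the base model with `AutoModelForMaskedLM` is an assumption. The tutorials remain the authoritative reference.

```python
MODEL_ID = "adsabs/astroBERT"  # assumed repository id, inferred from the tutorial URLs


def revision_kwargs(variant="base"):
    """Return the extra keyword arguments needed to load a given variant.

    Hypothetical helper: the base model is read from the default branch,
    while the NER-DEAL model needs revision='NER-DEAL'.
    """
    if variant == "base":
        return {}
    if variant == "NER-DEAL":
        return {"revision": "NER-DEAL"}
    raise ValueError(f"unknown astroBERT variant: {variant!r}")


def load_astrobert(variant="base"):
    """Load the tokenizer and model for the requested variant.

    Imports transformers lazily so that nothing is downloaded until
    this function is actually called.
    """
    from transformers import (
        AutoModelForMaskedLM,
        AutoModelForTokenClassification,
        AutoTokenizer,
    )

    kwargs = revision_kwargs(variant)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, **kwargs)
    if variant == "NER-DEAL":
        # Token classification head fine-tuned on the DEAL task.
        model = AutoModelForTokenClassification.from_pretrained(MODEL_ID, **kwargs)
    else:
        # Assumption: use the masked language modeling head for the base model.
        model = AutoModelForMaskedLM.from_pretrained(MODEL_ID, **kwargs)
    return tokenizer, model
```

Calling `load_astrobert("NER-DEAL")` would fetch the fine-tuned branch, and `load_astrobert()` the base model. Because the model is cased, input text should keep its original capitalization.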
|
73 |
|
74 |
|
75 |
### BibTeX
|