---
tags:
- spacy
- token-classification
language:
- en
model-index:
- name: en_spacy_pii_distilbert
  results:
  - task:
      name: NER
      type: token-classification
    metrics:
    - name: NER Precision
      type: precision
      value: 0.9530385872
    - name: NER Recall
      type: recall
      value: 0.9554103008
    - name: NER F Score
      type: f_score
      value: 0.9542229703
widget:
- text: >-
    SELECT shipping FROM users WHERE shipping = '201 Thayer St Providence RI
    02912'
datasets:
- beki/privy
---

| Feature | Description |
| --- | --- |
| **Name** | `en_spacy_pii_distilbert` |
| **Version** | `0.0.0` |
| **spaCy** | `>=3.4.1,<3.5.0` |
| **Default Pipeline** | `transformer`, `ner` |
| **Components** | `transformer`, `ner` |
| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
| **Sources** | Trained on a new [dataset for structured PII](https://huggingface.co/datasets/beki/privy) generated by [Privy](https://github.com/pixie-io/pixie/tree/main/src/datagen/pii/privy) |
| **License** | MIT |
| **Author** | [Benjamin Kilimnik](https://www.linkedin.com/in/benkilimnik/) |

### Label Scheme

<details>

<summary>View label scheme (5 labels for 1 component)</summary>

| Component | Labels |
| --- | --- |
| **`ner`** | `DATE_TIME`, `LOC`, `NRP`, `ORG`, `PER` |

</details>
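
One way the labels above might be used is to mask detected entities in a query string. The sketch below is illustrative only: `find_pii` and `redact` are hypothetical helper names, not part of this package, and `find_pii` assumes that spaCy, `spacy-transformers`, and the `en_spacy_pii_distilbert` package are installed locally. The example span is written by hand as a stand-in for model output, so real predictions may differ.

```python
from typing import Iterable, List, Tuple

Span = Tuple[int, int, str]  # (start_char, end_char, label)


def find_pii(text: str, model_name: str = "en_spacy_pii_distilbert") -> List[Span]:
    """Run the pipeline and return character-offset spans for each entity.

    Hypothetical helper: assumes spaCy and the model package are installed.
    """
    import spacy  # imported lazily so redact() works without spaCy

    nlp = spacy.load(model_name)
    doc = nlp(text)
    return [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]


def redact(text: str, spans: Iterable[Span]) -> str:
    """Replace each (non-overlapping) span with a <LABEL> placeholder."""
    out = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end, label in sorted(spans, reverse=True):
        out = out[:start] + f"<{label}>" + out[end:]
    return out


query = "SELECT shipping FROM users WHERE shipping = '201 Thayer St Providence RI 02912'"

# Hand-written span standing in for model output; real predictions may differ.
address = "201 Thayer St Providence RI 02912"
start = query.index(address)
example_spans = [(start, start + len(address), "LOC")]

print(redact(query, example_spans))
# SELECT shipping FROM users WHERE shipping = '<LOC>'
```

With the model installed, `redact(query, find_pii(query))` would apply the same masking to whatever entities the pipeline actually predicts.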

### Accuracy

| Type | Score |
| --- | --- |
| `ENTS_F` | 95.42 |
| `ENTS_P` | 95.30 |
| `ENTS_R` | 95.54 |
| `TRANSFORMER_LOSS` | 61154.85 |
| `NER_LOSS` | 56001.88 | |
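
As a consistency check on the table above, `ENTS_F` is the harmonic mean of `ENTS_P` and `ENTS_R`. The short sketch below recomputes it from the unrounded precision and recall values reported in the model metadata; `f_score` is a helper name chosen here for illustration.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Unrounded values from the model-index metadata of this card.
precision = 0.9530385872
recall = 0.9554103008

f = f_score(precision, recall)
print(f"{f * 100:.2f}")  # 95.42
```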