fleonce committed
Commit
6bdab5f
1 Parent(s): 31780b1

Upload ITERForRelationExtraction

Files changed (4)
  1. README.md +104 -0
  2. config.json +76 -0
  3. generation_config.json +5 -0
  4. model.safetensors +3 -0
README.md ADDED
@@ -0,0 +1,104 @@
+ ---
+ license: apache-2.0
+ base_model:
+ - microsoft/deberta-v3-large
+ library_name: transformers
+ tags:
+ - relation extraction
+ - nlp
+ model-index:
+ - name: iter-ace05-deberta-large
+   results:
+   - task:
+       type: relation-extraction
+     dataset:
+       name: ace05
+       type: ace05
+     metrics:
+     - name: F1
+       type: f1
+       value: 71.370
+ ---
+
+
+ # ITER: Iterative Transformer-based Entity Recognition and Relation Extraction
+
+ This model checkpoint is part of the collection of models published alongside our paper ITER,
+ [accepted at EMNLP 2024](https://aclanthology.org/2024.findings-emnlp.655/).<br>
+ To ease reproducibility and enable open research, our source code has been published on [GitHub](https://github.com/fleonce/iter).
+
+ This model achieved an F1 score of `71.370` on the `ace05` dataset.
+
+ ### Using ITER in your code
+
+ First, install ITER in your preferred environment:
+
+ ```text
+ pip install git+https://github.com/fleonce/iter
+ ```
+
+ To use our model, refer to the following code:
+ ```python
+ from iter import ITERForRelationExtraction
+
+ model = ITERForRelationExtraction.from_pretrained("fleonce/iter-ace05-deberta-large")
+ tokenizer = model.tokenizer
+
+ encodings = tokenizer(
+     "An art exhibit at the Hakawati Theatre in Arab east Jerusalem was a series of portraits of Palestinians killed in the rebellion .",
+     return_tensors="pt"
+ )
+
+ generation_output = model.generate(
+     encodings["input_ids"],
+     attention_mask=encodings["attention_mask"],
+ )
+
+ # entities
+ print(generation_output.entities)
+
+ # relations between entities
+ print(generation_output.links)
+ ```
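+
+ If you need predictions for several sentences at once, the same two calls should accept a padded batch.
+ The following is a minimal sketch, not taken from the ITER repository: it assumes `model.tokenizer`
+ behaves like a standard Hugging Face tokenizer (so `padding=True` is available) and that the generation
+ output holds one prediction per input sequence; the example sentences are made up.
+
+ ```python
+ # Hypothetical batched usage (assumes a standard Hugging Face tokenizer)
+ batch = tokenizer(
+     [
+         "John Smith works for Acme Corp in Berlin .",
+         "The artist painted portraits of people killed in the rebellion .",
+     ],
+     padding=True,
+     return_tensors="pt",
+ )
+
+ batch_output = model.generate(
+     batch["input_ids"],
+     attention_mask=batch["attention_mask"],
+ )
+
+ # Assumed to contain one entry per input sentence
+ print(batch_output.entities)
+ print(batch_output.links)
+ ```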
+
+ ### Checkpoints
+
+ We publish checkpoints for the models performing best on the following datasets:
+
+ - **ACE05**:
+   1. [fleonce/iter-ace05-deberta-large](https://huggingface.co/fleonce/iter-ace05-deberta-large)
+ - **CoNLL04**:
+   1. [fleonce/iter-conll04-deberta-large](https://huggingface.co/fleonce/iter-conll04-deberta-large)
+ - **ADE**:
+   1. [fleonce/iter-ade-deberta-large](https://huggingface.co/fleonce/iter-ade-deberta-large)
+ - **SciERC**:
+   1. [fleonce/iter-scierc-deberta-large](https://huggingface.co/fleonce/iter-scierc-deberta-large)
+   2. [fleonce/iter-scierc-scideberta-full](https://huggingface.co/fleonce/iter-scierc-scideberta-full)
+ - **CoNLL03**:
+   1. [fleonce/iter-conll03-deberta-large](https://huggingface.co/fleonce/iter-conll03-deberta-large)
+ - **GENIA**:
+   1. [fleonce/iter-genia-deberta-large](https://huggingface.co/fleonce/iter-genia-deberta-large)
+
+
+ ### Reproducibility
+
+ For each dataset, we selected the best-performing checkpoint out of the 5 training runs we performed.
+ This model was trained with the following hyperparameters:
+
+ - Seed: `2`
+ - Config: `ace05/small_lr_symrel`
+ - PyTorch `2.3.0` with CUDA `11.8` and precision `torch.bfloat16`
+ - GPU: `1 NVIDIA H100 SXM 80 GB GPU`
+
+ In our reproducibility tests, varying the GPU, CUDA version, and training precision led to slightly
+ different end results.
+
+ To train this model, refer to the following command:
+ ```shell
+ python3 train.py --dataset ace05/small_lr_symrel --transformer microsoft/deberta-v3-large --use_bfloat16 --seed 2
+ ```
+
+ ### Citation
+
+ ```text
+ @inproceedings{citation}
+ ```
+
config.json ADDED
@@ -0,0 +1,76 @@
+ {
+   "_name_or_path": "models/fleonce/iter-ace05-deberta-large",
+   "activation_fn": "relu",
+   "architectures": [
+     "ITERForRelationExtraction"
+   ],
+   "d_ff": 4096,
+   "d_model": 1024,
+   "dataset": "ace05",
+   "dropout": 0.23395583304970166,
+   "entity_types": [
+     "PER",
+     "FAC",
+     "ORG",
+     "GPE",
+     "VEH",
+     "LOC",
+     "WEA"
+   ],
+   "features": 73740,
+   "link_types": [
+     "PER-SOC",
+     "PHYS",
+     "PART-WHOLE",
+     "ART",
+     "ORG-AFF",
+     "GEN-AFF"
+   ],
+   "max_length": 512,
+   "max_nest_depth": 1,
+   "model_type": "iter",
+   "num_links": 6,
+   "num_types": 7,
+   "threshold": 0.5,
+   "torch_dtype": "float32",
+   "transformer_config": {
+     "_name_or_path": "microsoft/deberta-v3-large",
+     "architectures": null,
+     "attention_probs_dropout_prob": 0.1,
+     "decoder_start_token_id": null,
+     "eos_token_id": null,
+     "hidden_act": "gelu",
+     "hidden_dropout_prob": 0.1,
+     "hidden_size": 1024,
+     "initializer_range": 0.02,
+     "intermediate_size": 4096,
+     "is_encoder_decoder": false,
+     "layer_norm_eps": 1e-07,
+     "max_length": 512,
+     "max_position_embeddings": 512,
+     "max_relative_positions": -1,
+     "model_type": "deberta-v2",
+     "norm_rel_ebd": "layer_norm",
+     "num_attention_heads": 16,
+     "num_hidden_layers": 24,
+     "pooler_dropout": 0,
+     "pooler_hidden_act": "gelu",
+     "pooler_hidden_size": 1024,
+     "pos_att_type": [
+       "p2c",
+       "c2p"
+     ],
+     "position_biased_input": false,
+     "position_buckets": 256,
+     "relative_attention": true,
+     "share_att_key": true,
+     "task_specific_params": null,
+     "type_vocab_size": 0,
+     "vocab_size": 128100
+   },
+   "transformers_version": "4.37.0",
+   "use_bias": false,
+   "use_gate": true,
+   "use_mlp": true,
+   "use_scale": false
+ }
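
The entity and relation label inventory this checkpoint predicts is stored directly in the config above. A minimal sketch for inspecting it, assuming only the standard `huggingface_hub` download API (this helper is not part of the ITER codebase):

```python
import json

from huggingface_hub import hf_hub_download

# Fetch this repository's config.json and read the label inventory
path = hf_hub_download("fleonce/iter-ace05-deberta-large", "config.json")
with open(path) as f:
    cfg = json.load(f)

print(cfg["entity_types"])  # ["PER", "FAC", "ORG", "GPE", "VEH", "LOC", "WEA"]
print(cfg["link_types"])    # ["PER-SOC", "PHYS", "PART-WHOLE", "ART", "ORG-AFF", "GEN-AFF"]
```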
generation_config.json ADDED
@@ -0,0 +1,5 @@
+ {
+   "_from_model_config": true,
+   "max_length": 512,
+   "transformers_version": "4.37.0"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:129bda5221cf1d516aca7391b2490f7238ae98872629312a95705e44ba8d7a7e
+ size 1904099176