Update README.md
README.md
CHANGED
@@ -10,17 +10,16 @@ datasets:

 This is an evolution of https://huggingface.co/aehrc/cxrmate developed for the Radiology Report Generation task of BioNLP @ ACL 2024.

-For this, proposed EAST: Entropy-Augmented Self-critical sequence Training (EAST)
-EAST modifies Self-Critical Sequence Training (SCST) by adding entropy regularisation.
-
-
-
-
-
-
-
-
-We also utilise an attention mask with non-causal masking for the image embeddings and a causal mask for the report token embeddings.
+For this, we proposed EAST: Entropy-Augmented Self-critical sequence Training (EAST):
+- EAST modifies Self-Critical Sequence Training (SCST) by adding entropy regularisation.
+- Helps maintain a higher entropy in the token distribution.
+- Preventing overfitting to common phrases and ensuring a broader exploration of the vocabulary during training.
+- This was essential to handle the diversity of the radiology reports in the RRG24 datasets.
+
+EAST was applied to a multimodal language model with RadGraph as the reward. Other features include:
+- Token type embeddings to differentiate between findings and impression section tokens, as well as image embeddings.
+- Special tokens (`NF` and `NI`) to handle missing *findings* and *impression* sections.
+- Non-causal attention masking for the image embeddings and a causal attention masking for the report token embeddings.

 ## How to use:

@@ -55,6 +54,9 @@ output_ids = model.generate(
 findings, impression = model.split_and_decode_sections(output_ids, tokenizer)
 ```

+## Notebook example:
+https://huggingface.co/aehrc/cxrmate-rrg24/blob/main/demo.ipynb
+
 ## Paper:

 ## Citation:
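
For readers of this change: the README text describes EAST as SCST with added entropy regularisation, but gives no formula. The following is a minimal sketch of one way such an objective could look, not the authors' implementation. The function names, the `entropy_weight` parameter, the use of the mean per-step entropy, and the greedy-decoding baseline are all assumptions made for illustration. The rewards are plain floats here, where the README uses RadGraph.

```python
import math
from typing import List, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def east_loss_sketch(
    step_probs: List[Sequence[float]],
    sampled_ids: Sequence[int],
    sample_reward: float,
    greedy_reward: float,
    entropy_weight: float,
) -> float:
    """Hypothetical entropy-regularised SCST loss for one sampled report.

    step_probs[t] is the model's distribution at step t, and sampled_ids[t]
    is the token drawn from it (so its probability is non-zero).
    """
    if not step_probs or len(step_probs) != len(sampled_ids):
        raise ValueError("need one distribution per sampled token")

    # SCST term: advantage of the sample over the greedy baseline,
    # multiplied by the log-probability of the sampled sequence.
    log_prob = sum(math.log(dist[tok]) for dist, tok in zip(step_probs, sampled_ids))
    advantage = sample_reward - greedy_reward
    scst_term = -advantage * log_prob

    # Assumed regulariser: subtracting the mean entropy means that
    # minimising the loss pushes the token distributions towards higher entropy.
    mean_entropy = sum(token_entropy(d) for d in step_probs) / len(step_probs)
    return scst_term - entropy_weight * mean_entropy


# Toy usage: two decoding steps over a two-token vocabulary.
uniform_steps = [[0.5, 0.5], [0.5, 0.5]]
loss = east_loss_sketch(uniform_steps, [0, 1], sample_reward=1.0,
                        greedy_reward=0.5, entropy_weight=0.1)
```

With `entropy_weight=0.0` the sketch reduces to the plain SCST term, which makes the effect of the extra term easy to isolate.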
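
The attention masking mentioned in the README text (non-causal for image embeddings, causal for report tokens) can be illustrated with a small sketch. It assumes that the image embeddings come first in the sequence and that image positions do not attend to report tokens; neither detail is stated in the README. The function name `build_mixed_attention_mask` is hypothetical.

```python
from typing import List


def build_mixed_attention_mask(num_image: int, num_text: int) -> List[List[int]]:
    """Return mask[i][j] == 1 if position i may attend to position j.

    Positions 0..num_image-1 are image embeddings (assumed to come first);
    the remaining positions are report tokens.
    """
    total = num_image + num_text
    mask = [[0] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if j < num_image:
                # Non-causal part: every position sees every image embedding.
                mask[i][j] = 1
            elif i >= num_image and j <= i:
                # Causal part: a report token sees itself and earlier report tokens.
                mask[i][j] = 1
    return mask


# Toy usage: 2 image embeddings followed by 3 report tokens.
for row in build_mixed_attention_mask(2, 3):
    print(row)
```

In a real model the same pattern would typically be built as a boolean or additive tensor and broadcast over batch and heads; the nested lists here only show the layout.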