anicolson committed
Commit c0566dd
1 Parent(s): 09a7901

Update README.md

Files changed (1): README.md +3 −0
README.md CHANGED
@@ -101,6 +101,7 @@ There is no penalty in the reward for sampled reports that differ in length to t
 
 ## Citation:
 
+```
 @inproceedings{nicolson-etal-2024-e,
     title = "e-Health {CSIRO} at {RRG}24: Entropy-Augmented Self-Critical Sequence Training for Radiology Report Generation",
     author = "Nicolson, Aaron and
@@ -122,3 +123,5 @@ There is no penalty in the reward for sampled reports that differ in length to t
     pages = "99--104",
     abstract = "The core novelty of our approach lies in the addition of entropy regularisation to self-critical sequence training. This helps maintain a higher entropy in the token distribution, preventing overfitting to common phrases and ensuring a broader exploration of the vocabulary during training, which is essential for handling the diversity of the radiology reports in the RRG24 datasets. We apply this to a multimodal language model with RadGraph as the reward. Additionally, our model incorporates several other aspects. We use token type embeddings to differentiate between findings and impression section tokens, as well as image embeddings. To handle missing sections, we employ special tokens. We also utilise an attention mask with non-causal masking for the image embeddings and a causal mask for the report token embeddings.",
 }
+```
+
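The abstract in the cited paper describes two mechanisms concretely enough to sketch: entropy regularisation added to self-critical sequence training (SCST), and an attention mask that is non-causal over image embeddings but causal over report token embeddings. Below is a minimal PyTorch sketch of both ideas; the function names, the `entropy_weight` value, and the reward arguments are illustrative assumptions, not the authors' implementation (which uses RadGraph as the reward).

```python
import torch
import torch.nn.functional as F

def scst_entropy_loss(logits, sampled_ids, sample_reward, baseline_reward,
                      pad_token_id=0, entropy_weight=0.05):
    """Hypothetical sketch: SCST (REINFORCE with a greedy baseline) plus an
    entropy bonus that keeps the token distribution from collapsing onto
    common phrases. `entropy_weight` is an assumed value, not from the paper.

    logits:          [batch, seq_len, vocab] scores for the sampled report
    sampled_ids:     [batch, seq_len] token ids drawn from the model
    sample_reward:   [batch] reward of the sampled report (e.g. RadGraph)
    baseline_reward: [batch] reward of the greedy-decoded report
    """
    log_probs = F.log_softmax(logits, dim=-1)
    token_log_probs = log_probs.gather(-1, sampled_ids.unsqueeze(-1)).squeeze(-1)
    mask = (sampled_ids != pad_token_id).float()

    # Standard SCST term: raise the log-probability of sampled reports that
    # score better than the greedy baseline, lower it for the worse ones.
    advantage = (sample_reward - baseline_reward).unsqueeze(-1)
    policy_loss = -(advantage * token_log_probs * mask).sum() / mask.sum()

    # Entropy of the per-step token distribution; subtracting it from the
    # loss rewards higher entropy, i.e. broader vocabulary exploration.
    entropy = -(log_probs.exp() * log_probs).sum(-1)
    entropy_bonus = (entropy * mask).sum() / mask.sum()

    return policy_loss - entropy_weight * entropy_bonus

def mixed_attention_mask(num_image_tokens, num_report_tokens):
    """Hypothetical sketch of the mask described in the abstract: image
    embeddings attend to one another without causal restriction, while
    report tokens attend causally among themselves (and to all images).

    Returns a [total, total] boolean mask where True = attention allowed.
    """
    total = num_image_tokens + num_report_tokens
    mask = torch.zeros(total, total, dtype=torch.bool)
    mask[:, :num_image_tokens] = True  # every position attends to all images
    mask[num_image_tokens:, num_image_tokens:] = torch.tril(
        torch.ones(num_report_tokens, num_report_tokens, dtype=torch.bool)
    )  # causal (lower-triangular) attention over report tokens
    return mask
```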