---
license: other
language:
- en
library_name: transformers
pipeline_tag: fill-mask
widget:
- text: >-
    While flying a fire, the UAS experienced an issue of unknown sorts and
    [MASK] to the ground. From the people watching the aircraft near the fire,
    they seem to think it was some sort of motor failure due to no more noise
    coming from the aircraft and it falling straight to the ground.
  example_title: Example 1
- text: >-
    During a pre-flight [MASK] run-up, a battery hatch cover disengaged from
    the fuselage and hit one of the vertical takeoff and landing {VTOL}
    propellers. The motor failsafe activated and the motors shut down.
  example_title: Example 2
- text: >-
    UAS was climbing to 11,000 ft. msl on a reconnaissance mission when it
    experienced a rapid and uncommanded descent. The [MASK] took no action but
    monitored instruments until the aircraft regained a stable profile.
  example_title: Example 3
datasets:
- NASA-AIML/ASRS
- NASA-AIML/NTSB_Accidents
---
# Manager for Intelligent Knowledge Access (MIKA)

## SafeAeroBERT: A Safety-Informed Aviation-Specific Language Model
SafeAeroBERT is a `bert-base-uncased` model further pre-trained on Aviation Safety Reporting System (ASRS) documents and National Transportation Safety Board (NTSB) accident reports, both through November 2022. A total of 2,283,435 narrative sections were split 90/10 for training and validation, with 1,052,207,104 tokens from over 350,000 NTSB and ASRS documents used for pre-training.
The model was trained for two epochs using `AutoModelForMaskedLM.from_pretrained` with a learning rate of 1e-5 and a total batch size of 128, for just over 32,100 training steps.
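The pre-training setup described above can be sketched with the Hugging Face `Trainer` API. This is an illustrative sketch only, not the authors' exact script: dataset loading is omitted, a single device is assumed (so the per-device batch size equals the total batch size of 128), and the default masking probability is an assumption.

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    TrainingArguments,
)

# Start from the base checkpoint named in this card.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Dynamic token masking for masked-language-model pre-training.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True)

args = TrainingArguments(
    output_dir="safeaerobert-pretrain",
    learning_rate=1e-5,               # as stated above
    per_device_train_batch_size=128,  # total batch size 128, single device assumed
    num_train_epochs=2,               # two epochs, as stated above
)

# With the narrative sections tokenized into a dataset, training would run as:
# trainer = Trainer(model=model, args=args, data_collator=collator,
#                   train_dataset=train_ds, eval_dataset=val_ds)
# trainer.train()
```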
An earlier version of the model was evaluated on a downstream binary document classification task by fine-tuning with `AutoModelForSequenceClassification.from_pretrained`. SafeAeroBERT was compared to SciBERT and base BERT on this task, with the following performance:
| Contributing Factor | Metric | BERT | SciBERT | SafeAeroBERT |
|---|---|---|---|---|
| Aircraft | Accuracy | 0.747 | 0.726 | 0.740 |
| | Precision | 0.716 | 0.691 | 0.548 |
| | Recall | 0.747 | 0.726 | 0.740 |
| | F-1 | 0.719 | 0.699 | 0.629 |
| Human Factors | Accuracy | 0.608 | 0.557 | 0.549 |
| | Precision | 0.618 | 0.586 | 0.527 |
| | Recall | 0.608 | 0.557 | 0.549 |
| | F-1 | 0.572* | 0.426 | 0.400 |
| Procedure | Accuracy | 0.766 | 0.755 | 0.845 |
| | Precision | 0.766 | 0.762 | 0.742 |
| | Recall | 0.766 | 0.755 | 0.845 |
| | F-1 | 0.766 | 0.758 | 0.784 |
| Weather | Accuracy | 0.807 | 0.808 | 0.871 |
| | Precision | 0.803 | 0.769 | 0.759 |
| | Recall | 0.807 | 0.808 | 0.871 |
| | F-1 | 0.805 | 0.788 | 0.811 |
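A minimal sketch of the fine-tuning setup used for this comparison, showing the two-label classification head the task requires. `bert-base-uncased` is used here only as a stand-in checkpoint (substitute the fine-tuned SafeAeroBERT weights); the input sentence is illustrative.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Stand-in checkpoint; replace with the SafeAeroBERT checkpoint for the real setup.
checkpoint = "bert-base-uncased"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# Binary document classification: one logit per class (factor present / absent).
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
model.eval()

inputs = tokenizer(
    "UAS experienced a rapid and uncommanded descent.", return_tensors="pt"
)
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, 2), one row, two class logits
```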
More information on training data, evaluation, and intended use can be found in the original publication.
Citation: Sequoia R. Andrade and Hannah S. Walsh. "SafeAeroBERT: Towards a Safety-Informed Aerospace-Specific Language Model," AIAA 2023-3437. AIAA AVIATION 2023 Forum. June 2023.