ACCORD-NLP

ACCORD-NLP is a Natural Language Processing (NLP) framework developed as part of the Horizon Europe project Automated Compliance Checks for Construction, Renovation or Demolition Works (ACCORD) to facilitate Automated Compliance Checking (ACC) within the Architecture, Engineering, and Construction (AEC) sector. It consists of several pre-trained and fine-tuned machine learning models that perform the following information extraction tasks on regulatory text:

Entity Extraction/Classification (ner)
Relation Extraction/Classification (re)

roberta-large-lm is a domain-specific RoBERTa large model pre-trained on a building regulatory text corpus using the Masked Language Modelling (MLM) objective. It must be fine-tuned for a downstream task, such as entity or relation classification, before it can be used for prediction.

Installation

From Source

git clone https://github.com/Accord-Project/accord-nlp.git
cd accord-nlp
pip install -r requirements.txt

From pip

pip install accord-nlp

Using Pre-trained Models

Entity Extraction/Classification (ner)

from accord_nlp.text_classification.ner.ner_model import NERModel

model = NERModel('roberta', 'ACCORD-NLP/ner-roberta-large')
predictions, raw_outputs = model.predict(['The gradient of the passageway should not exceed five per cent.'])
print(predictions)

Relation Extraction/Classification (re)

from accord_nlp.text_classification.relation_extraction.re_model import REModel

model = REModel('roberta', 'ACCORD-NLP/re-roberta-large')
predictions, raw_outputs = model.predict(['The <e1>gradient</e1> of the passageway should not exceed <e2>five per cent</e2>.'])
print(predictions)
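
The relation extraction input marks the two entities of interest with <e1>…</e1> and <e2>…</e2> tags. As a minimal sketch, the tagged sentence can be built from character offsets with a small helper. The function name tag_entity_pair and the (start, end) span convention are assumptions made for illustration and are not part of the accord_nlp package.

```python
from typing import Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def tag_entity_pair(sentence: str, e1_span: Span, e2_span: Span) -> str:
    """Wrap two non-overlapping spans in <e1>...</e1> and <e2>...</e2>.

    Hypothetical helper: it only reproduces the tag format shown in the
    relation extraction example and does not call accord_nlp.
    """
    (s1, t1), (s2, t2) = e1_span, e2_span
    for start, end in (e1_span, e2_span):
        if not 0 <= start < end <= len(sentence):
            raise ValueError("each span must lie inside the sentence")
    if s1 < t2 and s2 < t1:
        raise ValueError("entity spans must not overlap")

    # Tag the right-most span first so the offsets of the other span stay valid.
    tagged = sentence
    for start, end, name in sorted([(s1, t1, "e1"), (s2, t2, "e2")], reverse=True):
        tagged = (
            tagged[:start] + f"<{name}>" + tagged[start:end] + f"</{name}>" + tagged[end:]
        )
    return tagged


sentence = "The gradient of the passageway should not exceed five per cent."
start1 = sentence.index("gradient")
start2 = sentence.index("five per cent")
tagged = tag_entity_pair(
    sentence,
    (start1, start1 + len("gradient")),
    (start2, start2 + len("five per cent")),
)
print(tagged)
```

The resulting string can then be passed to model.predict([...]) as in the example above.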

For more details, please refer to the ACCORD-NLP GitHub repository: https://github.com/Accord-Project/accord-nlp