Delighted to share the latest milestone in quick deployment of Named Entity Recognition (NER) for GenAI-powered systems.
Releasing bulk-ner 0.25.0, a tiny framework that saves you time when deploying NER with any model.
Why is this important? In the era of GenAI, handling textual output can be challenging. Instead, recognizing named entities via domain-oriented systems for your downstream LLM can be the preferable option.
PyPI: https://pypi.org/project/bulk-ner/0.25.0/
GitHub: https://github.com/nicolay-r/bulk-ner
I noticed that directly adapting a language model (LM) for NER means spending a significant amount of time formatting your texts to fit the needs of the NER model.
In particular:
1. Processing the CoNLL format with B-I-O tags from model outputs
2. Input trimming: long input content might not fit into the model entirely
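To make these two chores concrete, here is a minimal sketch in plain Python. It is not the bulk-ner API: the function names `bio_to_spans` and `chunk_words` are hypothetical, and the word-based window is an assumption (real models usually count subword tokens).

```python
def bio_to_spans(tokens, tags):
    """Group B-I-O tagged tokens into (entity_text, label) pairs.

    Hypothetical helper for illustration; not part of bulk-ner.
    """
    spans = []
    current_tokens, current_label = [], None
    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_label
        )
        if starts_new:
            # A new entity begins (a stray I- tag without B- is tolerated).
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-"):
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # An "O" tag closes any open entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens, current_label = [], None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_label))
    return spans


def chunk_words(words, max_len):
    """Split a long word list into consecutive windows of at most max_len words."""
    return [words[i:i + max_len] for i in range(0, len(words), max_len)]


tokens = ["Barack", "Obama", "visited", "New", "York", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O"]
print(bio_to_spans(tokens, tags))
# [('Barack Obama', 'PER'), ('New York', 'LOC')]
print(chunk_words(tokens, 4))
# [['Barack', 'Obama', 'visited', 'New'], ['York', '.']]
```

Note that naive windowing like this can cut an entity in half at a chunk boundary ("New" / "York" above), which is one reason this kind of glue code takes time to get right.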
To cope with these problems, version 0.25.0 takes a huge step forward by providing:
- Python API support: for a quick deployment (see screenshot below)
- No-string: the dependencies have been cleaned up, so the API calls are a purely Python implementation.
- Simplified output formatting: we use lists to represent texts, with inner lists that refer to annotated objects (see screenshot below)
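As a rough illustration of the "lists with inner lists" idea, the sketch below turns B-I-O tagged tokens into a flat list in which plain text stays as strings and each annotated object becomes an inner list. The exact layout used by bulk-ner may differ: the function name `to_nested_list` and the `[text, label]` shape of the inner lists are assumptions made for this example only.

```python
def to_nested_list(tokens, tags):
    """Represent a text as a list: plain strings for ordinary text,
    inner [text, label] lists for annotated objects.

    Hypothetical layout for illustration; not the confirmed bulk-ner format.
    """
    out = []
    for token, tag in zip(tokens, tags):
        label = tag[2:] if tag != "O" else None
        last_is_entity = bool(out) and isinstance(out[-1], list)
        if tag.startswith("I-") and last_is_entity and out[-1][1] == label:
            # Continue the entity that is already open.
            out[-1][0] += " " + token
        elif label is not None:
            # Start a new annotated object.
            out.append([token, label])
        elif out and isinstance(out[-1], str):
            # Extend the current run of plain text.
            out[-1] += " " + token
        else:
            out.append(token)
    return out


tokens = ["Barack", "Obama", "visited", "New", "York", "today"]
tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O"]
print(to_nested_list(tokens, tags))
# [['Barack Obama', 'PER'], 'visited', ['New York', 'LOC'], 'today']
```

A structure like this keeps the original reading order, so a downstream LLM prompt can be rebuilt by walking the list once.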
We have a Colab notebook for a quick start here (or see the screenshot for bash / Python API):
https://colab.research.google.com/github/nicolay-r/ner-service/blob/main/NER_annotation_service.ipynb
The code for pipeline deployment is taken from the AREkit project:
https://github.com/nicolay-r/AREkit