Commit 6be5452 by wjbmattingly · 1 parent: 10d8445

updated readme

Files changed (1)
  1. README.md +51 -0
README.md CHANGED
@@ -1,3 +1,54 @@
  ---
  license: mit
  ---
+
+ This is the fine-tuned GLiNER small model from Placing the Holocaust. GLiNER is a Named Entity Recognition (NER) model that can identify any entity type using a bidirectional transformer encoder (BERT-like). It offers a practical alternative to traditional NER models, which are limited to predefined entity types, and to Large Language Models (LLMs), which are flexible but too large and costly for resource-constrained scenarios.
+
+ ## Links
+
+ * Original GLiNER model: https://huggingface.co/urchade/gliner_small-v2.1
+ * Fine-tuning data: https://huggingface.co/datasets/placingholocaust/spacy-project
+ * GLiNER paper: https://arxiv.org/abs/2311.08526
+ * GLiNER repository: https://github.com/urchade/GLiNER
+
+ ## Labels
+
+ | Category | Definition | Examples |
+ |---|---|---|
+ | **building** | Includes references to physical structures and places of labor or employment like factories. Institutions such as the "Judenrat" or "Red Cross" are also included. | school, home, house, hospital, factory, station, office, store, synagogue, barracks |
+ | **country** | Mostly country names, also includes "earth," "country," and "world." Distinguished from Region and Environmental feature based on context. | germany, poland, states, israel, united, country, america, england, france, russia |
+ | **dlf (distinct landscape feature)** | Places not large enough to be a geographic or populated region but too large to be an Object, includes parts of buildings like "roof" or "chimney." | street, door, border, line, farm, window, streets, road, wall, field |
+ | **env feature (environmental feature)** | Any named or unnamed environmental feature, including bodies of water and landforms. General references like "nature" and "water" are included. | woods, forest, river, mountains, ground, trees, water, tree, mountain, sea |
+ | **interior space** | References to distinct rooms within a building, or large place features of a building like a "factory floor." | room, apartment, floor, kitchen, rooms, gas, basement, bathroom, chambers, bunker |
+ | **imaginary** | Difficult terms that are context-dependent like "inside," "outside," or "side." Also includes unspecified locations like "community," and conceptual places like "hell" or "heaven." | place, outside, places, side, inside, hiding, hell, heaven, part, spot |
+ | **populated place** | Includes cities, towns, villages, and hamlets or crossroads settlements. Names of places can be the same as a ghetto, camp, city, or district. | camp, ghetto, town, city, auschwitz, camps, new, york, concentration, village |
+ | **region** | Sub-national regions, states, provinces, or islands. Includes references to sides of a geopolitical border or military zone. | area, side, land, siberia, new, zone, jersey, california, russian, eastern |
+ | **spatial object** | Objects of conveyance and movable objects like furniture. In specific contexts, refers to transportation vehicles or items like "ovens," where the common use case of the term prevails. | train, car, ship, boat, bed, truck, trains, cars, trucks |
+
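As a minimal sketch, the label strings from the table can be kept in one place and the two abbreviated ones expanded for display. `LABELS`, `FULL_NAMES`, and `display_name` are hypothetical names introduced here for illustration and are not part of the GLiNER library. Including `"imaginary"` as a prompt label is an assumption based on the table; the README's own usage example passes only the other eight.

```python
# Label strings as written in the table (short forms for the two abbreviated rows).
# Assumption: "imaginary" is listed because it appears in the table.
LABELS = [
    "building", "country", "dlf", "env feature", "interior space",
    "imaginary", "populated place", "region", "spatial object",
]

# Expansions given in the table for the abbreviated labels.
FULL_NAMES = {
    "dlf": "distinct landscape feature",
    "env feature": "environmental feature",
}


def display_name(label: str) -> str:
    """Return the expanded name for an abbreviated label, else the label itself."""
    return FULL_NAMES.get(label, label)


for label in LABELS:
    print(label, "->", display_name(label))
```

Keeping the short strings as the canonical form means they match what is passed to the model, while the expanded names are only used for presentation.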
+ ## Installation
+ To use this model, install the GLiNER Python library:
+
+ ```bash
+ pip install gliner
+ ```
+
+ ## Usage
+ Once you've installed the GLiNER library, import the `GLiNER` class, load this model with `GLiNER.from_pretrained`, and predict entities with `predict_entities`.
+
+ ```python
+ from gliner import GLiNER
+
+ # Load the fine-tuned model from the Hugging Face Hub
+ model = GLiNER.from_pretrained("placingholocaust/gliner_small-v2.1-holocaust")
+
+ text = """
+ Okay. So now it's spring of '44? A: ‘4, And she says, You're going to go to Brzezinka. I said, What is Brzezinka? She said, It's a crematorium and the gas chamber. They have a half a million Hungarian Jews are coming in. That's when the time they -- and they need people to select. We do not select the people to -- who die or not. The women fold the clothes and look for jewelry and make packages to send it to Germany.
+ """
+
+ # Entity types to look for (see the Labels table)
+ labels = ["dlf", "populated place", "country", "region", "interior space", "env feature", "building", "spatial object"]
+
+ entities = model.predict_entities(text, labels)
+
+ # Print each detected span with its predicted label
+ for entity in entities:
+     print(entity["text"], "=>", entity["label"])
+ ```
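The list returned by `predict_entities` can be post-processed with plain Python. The sketch below groups entity texts by label and optionally drops low-confidence predictions. It assumes each entity dictionary carries a `score` key alongside `text` and `label`. `group_by_label` and the `sample` list are hypothetical: the sample is hand-written for illustration and is not output from this model.

```python
from collections import defaultdict


def group_by_label(entities, min_score=0.0):
    """Group entity texts by label, keeping entities with score >= min_score.

    Entities without a "score" key are always kept (treated as score 1.0).
    """
    grouped = defaultdict(list)
    for entity in entities:
        if entity.get("score", 1.0) >= min_score:
            grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Hand-written placeholder entities, NOT real model predictions.
sample = [
    {"text": "Brzezinka", "label": "populated place", "score": 0.91},
    {"text": "gas chamber", "label": "interior space", "score": 0.84},
    {"text": "Germany", "label": "country", "score": 0.97},
    {"text": "crematorium", "label": "building", "score": 0.42},
]

for label, texts in group_by_label(sample, min_score=0.5).items():
    print(label, "=>", texts)
```

With real predictions, pass the `entities` list from the usage example in place of `sample`.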