---
library_name: transformers
base_model: google-bert/bert-base-chinese
tags:
- generated_from_trainer
datasets:
- peoples_daily_ner
metrics:
- f1
model-index:
- name: models_for_ner
  results:
  - task:
      type: token-classification
      name: Token Classification
    dataset:
      name: peoples_daily_ner
      type: peoples_daily_ner
      config: peoples_daily_ner
      split: validation
      args: peoples_daily_ner
    metrics:
    - type: f1
      value: 0.9508438253415484
      name: F1
---

# models_for_ner

This model is a fine-tuned version of [google-bert/bert-base-chinese](https://huggingface.co/google-bert/bert-base-chinese) on the peoples_daily_ner dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0219
- F1: 0.9508
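
The card does not state how F1 was computed. NER is commonly scored at the entity level (as the seqeval library does), where a prediction counts only if both its span and its type match the reference exactly. The following is a minimal sketch under that assumption; `extract_spans` and `entity_f1` are hypothetical helper names, not part of this model's training code.

```python
def extract_spans(tags):
    """Return the set of (start, end, type) spans in a BIO tag sequence."""
    spans = set()
    start, ent_type = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        inside = tag.startswith("I-") and start is not None and tag[2:] == ent_type
        if inside:
            continue
        if start is not None:
            spans.add((start, i, ent_type))
            start, ent_type = None, None
        if tag != "O":
            # A B- tag (or a stray I- tag) opens a new span.
            start, ent_type = i, tag[2:]
    return spans


def entity_f1(gold_tags, pred_tags):
    """Entity-level F1 for one sentence: exact span and type match."""
    gold, pred = extract_spans(gold_tags), extract_spans(pred_tags)
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)
```

For a corpus-level score, the true-positive, predicted, and reference counts would be summed over all sentences before computing precision and recall.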

## Model description

### Usage (pipeline)

```python
from transformers import pipeline

ner_pipe = pipeline(
    'token-classification',
    model='roberthsu2003/models_for_ner',
    aggregation_strategy='simple',  # merge B-/I- tokens into whole entities
)
inputs = '徐國堂在台北上班'
res = ner_pipe(inputs)
print(res)
res_result = {}
for r in res:
    entity_name = r['entity_group']
    start = r['start']
    end = r['end']
    if entity_name not in res_result:
        res_result[entity_name] = []
    res_result[entity_name].append(inputs[start:end])

print(res_result)
#==output==
#{'PER': ['徐國堂'], 'LOC': ['台北']}
```

### Usage (model and tokenizer)

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer
import numpy as np

# Load the pre-trained model and tokenizer
model = AutoModelForTokenClassification.from_pretrained('roberthsu2003/models_for_ner')
tokenizer = AutoTokenizer.from_pretrained('roberthsu2003/models_for_ner')

# Build the label list from the model config, ordered by label id, e.g.
# ['O', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG', 'B-LOC', 'I-LOC']
label_list = [model.config.id2label[i] for i in range(model.config.num_labels)]


def predict_ner(text):
    """Predicts NER tags for a given text using the loaded model."""
    # Encode the text
    inputs = tokenizer(text, return_tensors='pt', truncation=True, padding=True)
    
    # Get model predictions
    outputs = model(**inputs)
    predictions = np.argmax(outputs.logits.detach().numpy(), axis=-1)
    
    # Get the word IDs from the encoded inputs.
    # word_ids() is a method of the encoding returned by a fast tokenizer,
    # not of the tokenizer itself; special tokens ([CLS], [SEP]) map to None.
    word_ids = inputs.word_ids(batch_index=0)
    
    pred_tags = []
    for word_id, pred in zip(word_ids, predictions[0]):
        if word_id is None:
            continue  # Skip special tokens
        pred_tags.append(label_list[pred])

    return pred_tags

# To get the entities, group consecutive tags that belong to the same entity:

def get_entities(tags):
    """Groups consecutive NER tags into (start, end, type) entity spans."""
    entities = []
    start_index = -1
    current_entity_type = None
    for i, tag in enumerate(tags):
        entity_type = tag[2:] if tag != 'O' else None  # e.g., PER, LOC, ORG
        # Close the open entity on 'O', on a new 'B-' tag, or on a type change,
        # so that two adjacent entities are not merged into one.
        if start_index != -1 and (tag.startswith('B-') or entity_type != current_entity_type):
            entities.append((start_index, i, current_entity_type))
            start_index = -1
        if entity_type is not None and start_index == -1:
            start_index = i
            current_entity_type = entity_type
    if start_index != -1:
        entities.append((start_index, len(tags), current_entity_type))
    return entities

# Example usage:
text = "徐國堂在台北上班"
ner_tags = predict_ner(text)
print(f"Text: {text}")
#==output==
#Text: 徐國堂在台北上班


print(f"NER Tags: {ner_tags}")
#===output==
#NER Tags: ['B-PER', 'I-PER', 'I-PER', 'O', 'B-LOC', 'I-LOC', 'O', 'O']


entities = get_entities(ner_tags)
word_tokens = tokenizer.tokenize(text)  # One token per tag (here, one Chinese character each)
print("Entities:")
for start, end, entity_type in entities:
    entity_text = "".join(word_tokens[start:end])
    print(f"- {entity_text}: {entity_type}")

#==output==
#Entities:
#- 徐國堂: PER
#- 台北: LOC
```
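
Both examples process a single short sentence. `predict_ner` passes `truncation=True`, so any input beyond the model's maximum length (512 tokens for BERT-base models, including `[CLS]` and `[SEP]`) is silently dropped. One way to handle longer documents is to split them at sentence punctuation first and run each piece separately. The sketch below illustrates the idea; `split_text` and the 500-character limit are illustrative assumptions, not part of this model's code, and the limit treats one Chinese character as roughly one token.

```python
import re


def split_text(text, max_chars=500):
    """Split text into chunks of at most max_chars, preferring sentence ends.

    Returns (offset, chunk) pairs so that entity positions found in a chunk
    can be mapped back to the original text by adding the offset.
    """
    # Each piece keeps its sentence-final punctuation; empty matches are dropped.
    sentences = [s for s in re.findall(r"[^。!?!?\n]*[。!?!?\n]?", text) if s]
    chunks = []
    current, start, pos = "", 0, 0
    for sent in sentences:
        # Flush the current chunk if adding this sentence would overflow it.
        if current and len(current) + len(sent) > max_chars:
            chunks.append((start, current))
            current, start = "", pos
        current += sent
        pos += len(sent)
        # A single sentence longer than max_chars is cut at the limit.
        while len(current) > max_chars:
            chunks.append((start, current[:max_chars]))
            current, start = current[max_chars:], start + max_chars
    if current:
        chunks.append((start, current))
    return chunks
```

Each chunk can then be passed to `ner_pipe` or `predict_ner`, and the `start` and `end` positions of any entity shifted by the chunk's offset.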

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 64
- eval_batch_size: 128
- seed: 42
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3
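
With `lr_scheduler_type: linear`, the learning rate decays linearly from its initial value to zero over the training run. The sketch below reproduces that schedule for the 981 optimizer steps reported in the training results table. It assumes no warmup steps, which this card does not list, and `linear_lr` is a hypothetical helper rather than a Transformers function.

```python
def linear_lr(step, total_steps=981, base_lr=5e-5, warmup_steps=0):
    """Learning rate at a given optimizer step under linear decay to zero."""
    if step < warmup_steps:
        # Linear ramp-up during warmup (unused when warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start of training and at the end of each epoch.
for step in (0, 327, 654, 981):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions, the rate has dropped to two thirds of its initial value after the first epoch (step 327) and reaches zero at step 981.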

### Training results

| Training Loss | Epoch | Step | Validation Loss | F1     |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 0.0274        | 1.0   | 327  | 0.0204          | 0.9510 |
| 0.0127        | 2.0   | 654  | 0.0174          | 0.9592 |
| 0.0063        | 3.0   | 981  | 0.0186          | 0.9602 |


### Framework versions

- Transformers 4.48.3
- Pytorch 2.5.1+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0