BERT Fine-Tuned for Named Entity Recognition (CoNLL-2003)

This model recognizes four types of named entities in English text: persons (PER), organizations (ORG), locations (LOC), and miscellaneous entities (MISC).

Model Details

  • Base model: bert-base-cased
  • Dataset: CoNLL-2003 (14,041 training sentences from Reuters news)
  • Task: Named Entity Recognition (token classification)
  • Framework: PyTorch + HuggingFace Transformers

Entity Types

Label   Meaning          Examples
PER     Person names     Barack Obama, Elon Musk
ORG     Organizations    Apple Inc., United Nations
LOC     Locations        New York, Mount Everest
MISC    Miscellaneous    English, FIFA World Cup
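
Token-classification models trained on CoNLL-2003 usually predict per-token BIO tags rather than the four entity types directly. The sketch below builds such a tag set from the labels in the table. It assumes the common nine-tag scheme; the exact tag order and ids stored in this model's configuration are not stated here and may differ.

```python
# Entity types from the table above.
ENTITY_TYPES = ["PER", "ORG", "LOC", "MISC"]

# Assumed BIO scheme: "O" for non-entity tokens, "B-X" for the first token
# of an entity of type X, "I-X" for the tokens that continue it.
labels = ["O"] + [f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in ("B", "I")]

# Illustrative id maps; the ordering is an assumption, not read from the model.
id2label = dict(enumerate(labels))
label2id = {label: idx for idx, label in id2label.items()}

print(labels)
# ['O', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG', 'B-LOC', 'I-LOC', 'B-MISC', 'I-MISC']
```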

Performance (CoNLL-2003 Test Set)

Metric       Score
F1 Score     0.9116
Precision    0.9041
Recall       0.9192
Accuracy     0.9827
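
The training details name seqeval with entity-level strict span matching as the evaluation method. Under that kind of scoring, a predicted entity counts as correct only if both its type and its exact boundaries match the gold annotation. The following is a simplified pure-Python sketch of that idea, not seqeval itself; the helper names `extract_spans` and `entity_prf` are hypothetical, and a stray `I-` tag is treated as the start of a new span.

```python
from typing import List, Set, Tuple

Span = Tuple[str, int, int]  # (entity type, start index, end index exclusive)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect (type, start, end) spans from one BIO-tagged sentence."""
    spans: Set[Span] = set()
    start, current = None, None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" flushes the last span
        continues = tag.startswith("I-") and current == tag[2:]
        if not continues:
            if current is not None:
                spans.add((current, start, i))
            start, current = None, None
            if tag != "O":
                # "B-X" opens a span; a stray "I-X" is also treated as an opener.
                start, current = i, tag[2:]
    return spans


def entity_prf(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over exact span matches."""
    tp = n_pred = n_gold = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = extract_spans(g), extract_spans(p)
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: the PER prediction covers only one of the two gold tokens,
# so it earns no credit even though it overlaps the gold span.
gold = [["B-PER", "I-PER", "O", "B-ORG", "O", "B-LOC"]]
pred = [["B-PER", "O", "O", "B-ORG", "O", "B-LOC"]]
precision, recall, f1 = entity_prf(gold, pred)
print(round(precision, 4), round(recall, 4), round(f1, 4))
# 0.6667 0.6667 0.6667
```

F1 is the harmonic mean of precision and recall, which agrees with the reported figures: 2 × 0.9041 × 0.9192 / (0.9041 + 0.9192) ≈ 0.9116.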

How to Use

from transformers import pipeline

# Load the model
ner = pipeline(
    "token-classification",
    model="samandar1105/named_entity-recognition",
    aggregation_strategy="simple"
)

# Run inference
result = ner("Elon Musk founded SpaceX in Hawthorne, California.")
print(result)
# Example output (abbreviated; each dict also includes 'start' and 'end' character offsets):
# [
#   {'entity_group': 'PER', 'word': 'Elon Musk', 'score': 0.998},
#   {'entity_group': 'ORG', 'word': 'SpaceX', 'score': 0.997},
#   {'entity_group': 'LOC', 'word': 'Hawthorne', 'score': 0.995},
#   {'entity_group': 'LOC', 'word': 'California', 'score': 0.994},
# ]
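
A common next step is to group the returned entities by type. The sketch below works on a literal copy of the example output above, so it runs without downloading the model. The `group_entities` helper and its `min_score` threshold are hypothetical names introduced for illustration, and the scores are the illustrative ones from the example.

```python
from collections import defaultdict
from typing import Dict, List

# Copied from the example output; a real pipeline call returns the same keys
# plus 'start' and 'end', and its scores will differ.
result = [
    {"entity_group": "PER", "word": "Elon Musk", "score": 0.998},
    {"entity_group": "ORG", "word": "SpaceX", "score": 0.997},
    {"entity_group": "LOC", "word": "Hawthorne", "score": 0.995},
    {"entity_group": "LOC", "word": "California", "score": 0.994},
]


def group_entities(entities: List[dict], min_score: float = 0.0) -> Dict[str, List[str]]:
    """Group entity strings by type, dropping predictions below min_score."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


print(group_entities(result))
# {'PER': ['Elon Musk'], 'ORG': ['SpaceX'], 'LOC': ['Hawthorne', 'California']}
```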

Training Details

  • Learning rate: 2e-5
  • Epochs: 4
  • Batch size: 16
  • Max sequence length: 128
  • Warmup ratio: 0.1
  • Weight decay: 0.01
  • Label alignment: First-subword strategy with -100 for continuation subwords
  • Evaluation: seqeval (entity-level strict span matching)
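
The first-subword label alignment listed above can be sketched as follows. The function takes the per-token word indices that a fast Hugging Face tokenizer exposes through `word_ids()` and assigns each word's label to its first subword only. Special tokens and continuation subwords receive -100, which PyTorch's cross-entropy loss ignores by default. This is a minimal illustration, not the author's training code; the `align_labels` name, the subword split shown in the comments, and the label ids are assumptions.

```python
from typing import List, Optional

IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def align_labels(word_ids: List[Optional[int]], word_labels: List[int]) -> List[int]:
    """Map word-level labels onto subword tokens (first-subword strategy)."""
    aligned: List[int] = []
    previous: Optional[int] = None
    for word_id in word_ids:
        if word_id is None:
            aligned.append(IGNORE_INDEX)           # special token such as [CLS] or [SEP]
        elif word_id != previous:
            aligned.append(word_labels[word_id])   # first subword keeps the word's label
        else:
            aligned.append(IGNORE_INDEX)           # continuation subword is masked out
        previous = word_id
    return aligned


# Hypothetical tokenization of the words ["Hawthorne", "is", "nice"]:
#   [CLS]  Haw  ##thorn  ##e  is  nice  [SEP]
word_ids = [None, 0, 0, 0, 1, 2, None]
word_labels = [5, 0, 0]  # assumed label map: 5 = B-LOC, 0 = O

print(align_labels(word_ids, word_labels))
# [-100, 5, -100, -100, 0, 0, -100]
```

Masking continuation subwords means each word contributes exactly one label to the loss, so long words split into many pieces are not over-weighted during training.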