README.md · samirmsallem/distilbert-base-multilingual-cased-definitions

metadata

datasets:
  - samirmsallem/wiki_def_de_multitask
language:
  - de
base_model:
  - distilbert/distilbert-base-multilingual-cased
library_name: transformers
tags:
  - science
  - ner
  - def_extraction
  - definitions
metrics:
  - precision
  - recall
  - f1
  - accuracy
model-index:
  - name: checkpoints
    results:
      - task:
          name: Token Classification
          type: token-classification
        dataset:
          name: samirmsallem/wiki_def_de_multitask
          type: samirmsallem/wiki_def_de_multitask
        metrics:
          - name: F1
            type: f1
            value: 0.812455003599712
          - name: Precision
            type: precision
            value: 0.8076097328244275
          - name: Recall
            type: recall
            value: 0.8173587638821825
          - name: Loss
            type: loss
            value: 0.329479843378067

NER model for definition component recognition in German scientific texts

distilbert-base-multilingual-cased-definitions_ner is a NER model (token classification) in the scientific domain in German, finetuned from the model distilbert-base-multilingual-cased. It was trained using a custom annotated dataset of around 10,000 training and 2,000 test examples containing definition- and non-definition-related sentences from wikipedia articles in german.

The model is specifically designed to recognize and classify components of definitions, using the following entity labels:

DF: Definiendum (the term being defined)
VF: Definitor (the verb or phrase introducing the definition)
GF: Definiens (the explanation or meaning)

Training was conducted using a standard NER objective. The model achieves an F1 score of approximately 81% on the evaluation set.

Here are the overall final metrics on the test dataset after 5 epochs of training:

f1: 0.812455003599712
precision: 0.8076097328244275
recall: 0.8173587638821825
loss: 0.329479843378067

Model Performance Comparision on wiki_definitions_de_multitask:

Model	Precision	Recall	F1 Score	Eval Samples per Second	Epoch
distilbert-base-multilingual-cased-definitions_ner	80.76	81.74	81.25	457.53	5.0
scibert_scivocab_cased-definitions_ner	80.54	82.11	81.32	236.61	4.0
GottBERT_base_best-definitions_ner	82.98	82.81	82.90	272.26	5.0
xlm-roberta-base-definitions_ner	81.90	83.35	82.62	241.21	5.0
gbert-base-definitions_ner	82.73	83.56	83.14	278.87	5.0