base model: https://huggingface.co/microsoft/deberta-v3-base

dataset: https://github.com/ramybaly/Article-Bias-Prediction

training parameters (a hedged setup sketch follows the list):

  • devices: 2xH100
  • batch_size: 100
  • epochs: 5
  • dropout: 0.05
  • max_length: 512
  • learning_rate: 3e-5
  • warmup_steps: 100
  • random_state: 239
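
The training script is not published with this card; as a minimal sketch, assuming the standard transformers API, the dropout, max_length, and seed above could be applied as follows. num_labels=3 (left / center / right) is an assumption based on the AllSides task, and mapping the listed dropout of 0.05 onto DeBERTa's dropout probabilities is likewise an assumption.

from transformers import AutoModelForSequenceClassification, AutoTokenizer, set_seed

set_seed(239)  # random_state

# Dropout is a model-config setting, not an optimizer setting; applying the
# listed 0.05 to both DeBERTa dropout probabilities is an assumption.
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v3-base",
    num_labels=3,                       # assumption: left / center / right
    hidden_dropout_prob=0.05,
    attention_probs_dropout_prob=0.05,
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")

def encode(texts):
    # truncate/pad article text to the listed max_length of 512 tokens
    return tokenizer(texts, truncation=True, padding="max_length", max_length=512)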

training methodology:

  • sanitize the dataset following a specific rule set; use the random split provided with the dataset
  • train on the train split and evaluate on the validation split after each epoch
  • evaluate the test split only with the checkpoint that achieved the lowest validation loss (see the sketch after this list)
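
A hedged sketch of how this per-epoch evaluation and best-on-validation checkpoint selection could be expressed with the Hugging Face Trainer; the actual training code is not published with this card, and the per-device batch size of 50 (× 2 H100s = the listed batch_size of 100) and the output_dir name are assumptions.

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="political-bias-deberta",  # hypothetical path
    num_train_epochs=5,
    per_device_train_batch_size=50,       # assumption: 50 per GPU x 2 H100s = 100
    learning_rate=3e-5,
    warmup_steps=100,
    seed=239,
    eval_strategy="epoch",                # evaluate the validation split each epoch
                                          # (`evaluation_strategy` in older transformers)
    save_strategy="epoch",
    load_best_model_at_end=True,          # reload the checkpoint with the lowest validation loss
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
# Trainer(model=model, args=args, train_dataset=..., eval_dataset=...).train()
# then leaves the best checkpoint loaded; the test split is evaluated only
# once, on that model.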

result summary:

  • across the five training epochs, the fourth-epoch model achieved the lowest validation loss, 0.1909
  • on the test split, the fourth-epoch model achieved an F1 score of 0.9427 and a test loss of 0.2168

usage:

from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

model = AutoModelForSequenceClassification.from_pretrained("premsa/political-bias-prediction-allsides-DeBERTa")
tokenizer = AutoTokenizer.from_pretrained("premsa/political-bias-prediction-allsides-DeBERTa")
nlp = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(nlp("the masses are controlled by media."))
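
The pipeline returns a list of dicts of the form [{'label': ..., 'score': ...}]; the label names come from the model's id2label config.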