Citation

If you use this model, please cite this paper:

@misc{ghosh2025biasbiasdetectingbias,
      title={To Bias or Not to Bias: Detecting bias in News with bias-detector}, 
      author={Himel Ghosh and Ahmed Mosharafa and Georg Groh},
      year={2025},
      eprint={2505.13010},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.13010}, 
}

This is a RoBERTa-based binary classification model fine-tuned on the BABE dataset (https://huggingface.co/datasets/mediabiasgroup/BABE) for bias detection in English news statements. The model predicts whether a given sentence contains biased language (LABEL_1) or is unbiased (LABEL_0). It is intended for applications in media bias analysis, content moderation, and social computing research.

  • Example usage with Hugging Face’s pipeline:
    from transformers import pipeline

    # Load the fine-tuned classifier; the tokenizer is loaded from
    # roberta-base, the base model this checkpoint was fine-tuned from.
    classifier = pipeline("text-classification", model="himel7/bias-detector", tokenizer="roberta-base")
    result = classifier("Immigrants are criminals.")
    print(result)  # a list of {'label': ..., 'score': ...} dicts
    

Evaluation

The model was evaluated on the entire BABE dataset with K-fold cross-validation (K=5), yielding the following metrics:

  • Accuracy: 0.9202
  • Precision: 0.9615
  • Recall: 0.8927
  • F1 Score: 0.9257
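
These figures can be reproduced from fold-level predictions with scikit-learn's standard metric functions. A minimal sketch, assuming y_true and y_pred hold the gold labels and model predictions pooled across the five folds (the variable names and toy values below are illustrative, not taken from the paper):

    from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

    # Gold labels and model predictions pooled over the K=5 folds (toy values).
    # 1 = biased (LABEL_1), 0 = non-biased (LABEL_0).
    y_true = [1, 0, 1, 1, 0, 1]
    y_pred = [1, 0, 1, 0, 0, 1]

    print("Accuracy :", accuracy_score(y_true, y_pred))
    print("Precision:", precision_score(y_true, y_pred))  # positive class = biased
    print("Recall   :", recall_score(y_true, y_pred))
    print("F1 Score :", f1_score(y_true, y_pred))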

Model Details

Model Description

This model is a fine-tuned version of roberta-base trained to detect linguistic bias in English-language news statements. The task is framed as binary classification: the model outputs LABEL_1 for biased statements and LABEL_0 for non-biased statements.

Fine-tuning was performed on the BABE dataset, which contains annotated news snippets across various topics and political leanings. The annotations focus on whether the language used expresses subjective bias rather than factual reporting.

This model aims to assist in detecting subtle forms of bias in media content, such as emotionally loaded language, stereotypical phrasing, or exaggerated claims, and can be useful in journalistic analysis, media monitoring, or NLP research into framing and stance.

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

  • Developed by: Himel Ghosh
  • Language(s) (NLP): English
  • Finetuned from model: roberta-base
  • Model size: 125M parameters (F32, Safetensors)

Uses

This model is intended to support the detection and analysis of biased language in English news content. It can be used as a tool by:

  • Media researchers and social scientists studying framing, bias, or political discourse.

  • Journalists and editors aiming to assess the neutrality of their writing or compare outlets.

  • Developers integrating bias detection into NLP pipelines for content moderation, misinformation detection, or AI-assisted writing tools.

Foreseeable Uses:

  • Annotating datasets for bias.

  • Measuring bias across different news outlets or topics (see the batch-scoring sketch after this list).

  • Serving as an assistive tool in editorial decision-making or media monitoring.
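
For the outlet-level measurement mentioned above, one straightforward approach is to classify each sentence and aggregate the share predicted biased per outlet. A minimal sketch, assuming sentences are already grouped by outlet in a plain dict (the outlet names and sentences below are invented for illustration):

    from transformers import pipeline

    classifier = pipeline("text-classification", model="himel7/bias-detector", tokenizer="roberta-base")

    # Illustrative input: outlet name -> list of sentences to score.
    outlet_sentences = {
        "outlet_a": ["The senator presented the budget figures on Tuesday."],
        "outlet_b": ["The senator's reckless scheme will bankrupt hard-working families."],
    }

    biased_share = {}
    for outlet, sentences in outlet_sentences.items():
        preds = classifier(sentences)  # one {'label', 'score'} dict per sentence
        n_biased = sum(p["label"] == "LABEL_1" for p in preds)
        biased_share[outlet] = n_biased / len(sentences)

    print(biased_share)  # fraction of sentences flagged as biased, per outlet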

Direct Use

This model can be used directly for binary classification of English-language news statements to determine whether they exhibit biased language. It returns one of two labels:

  • LABEL_0: Non-biased

  • LABEL_1: Biased
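
If class probabilities are needed rather than just the top label, the model can also be run below the pipeline abstraction. A minimal sketch, assuming the checkpoint loads with AutoModelForSequenceClassification and that ids 0 and 1 map to LABEL_0 and LABEL_1 as described above:

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    # Tokenizer from roberta-base, matching the pipeline example above.
    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    model = AutoModelForSequenceClassification.from_pretrained("himel7/bias-detector")
    model.eval()

    inputs = tokenizer("Immigrants are criminals.", return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits

    probs = torch.softmax(logits, dim=-1)[0]
    # Assumed mapping: index 0 -> LABEL_0 (non-biased), index 1 -> LABEL_1 (biased).
    print({"LABEL_0 (non-biased)": probs[0].item(), "LABEL_1 (biased)": probs[1].item()})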

Bias, Risks, and Limitations

While this model is designed to detect linguistic bias, it carries several limitations and risks, both technical and sociotechnical:

  • The model was fine-tuned on the BABE dataset, which includes annotations based on human judgments that may reflect specific cultural or political perspectives.
  • It may not generalize well to non-news text or out-of-domain content (e.g., social media, informal writing).
  • Subtle forms of bias, sarcasm, irony, or coded language may not be reliably detected.
  • Bias is inherently subjective: what one annotator considers biased may be seen as neutral by another. The model reflects those subjective judgments.
  • The model does not detect factual correctness or misinformation — only linguistic bias cues.
  • Labeling a text as “biased” may have reputational or ethical implications, especially if used in moderation, censorship, or journalistic evaluations.

Training Details

Training Data

Training was performed on the BABE dataset: https://huggingface.co/datasets/mediabiasgroup/BABE
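
The dataset can be pulled directly from the Hub with the datasets library. A minimal sketch; the available splits and column names are best checked on the dataset card, since they are not spelled out here:

    from datasets import load_dataset

    # Loads the BABE dataset from the Hub; inspect the returned DatasetDict
    # before relying on specific split or column names.
    babe = load_dataset("mediabiasgroup/BABE")
    print(babe)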

Summary

The model achieved 92.02% accuracy, with a precision of 96.15% and a recall of 89.27%. The high precision means the model produces very few false positives, while the recall indicates it catches most genuinely biased statements.
