To Bias or Not to Bias: Detecting bias in News with bias-detector
Abstract
A RoBERTa-based model fine-tuned on the BABE dataset outperforms a domain-adaptively pre-trained DA-RoBERTa baseline on sentence-level media bias detection and attends more closely to contextually relevant tokens.
Media bias detection is critical to fair and balanced information dissemination, yet it remains challenging because bias is subjective and high-quality annotated data is scarce. In this work, we perform sentence-level bias classification by fine-tuning a RoBERTa-based model on the expert-annotated BABE dataset. Using McNemar's test and the 5x2 cross-validation paired t-test, we show statistically significant improvements in performance over a domain-adaptively pre-trained DA-RoBERTa baseline. Furthermore, attention-based analysis shows that our model avoids common pitfalls such as oversensitivity to politically charged terms and instead attends more meaningfully to contextually relevant tokens. For a comprehensive examination of media bias, we present a pipeline that combines our model with an existing bias-type classifier. Our method exhibits good generalization and interpretability, although it is constrained by sentence-level analysis and by dataset size, since larger and more advanced bias corpora are lacking. We discuss context-aware modeling, bias neutralization, and advanced bias-type classification as potential future directions. Our findings contribute to building more robust, explainable, and socially responsible NLP systems for media bias detection.
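The two significance tests named in the abstract can be sketched in a few lines. The block below is a minimal illustration using only the Python standard library, not the authors' evaluation code: the function names `mcnemar_test` and `five_by_two_cv_t` are hypothetical, the inputs are made up, and the 5x2cv statistic follows the usual Dietterich formulation (assumed here, since the abstract does not spell it out).

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """McNemar's test on the sentences where two classifiers disagree.

    b counts items model A gets right and model B gets wrong; c is the reverse.
    Returns the continuity-corrected chi-square statistic, its p-value
    (chi-square with 1 degree of freedom), and the exact binomial p-value.
    """
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    n = b + c
    if n == 0:  # the models never disagree, so there is no evidence of a difference
        return {"b": b, "c": c, "statistic": 0.0, "p_chi2": 1.0, "p_exact": 1.0}
    statistic = (abs(b - c) - 1) ** 2 / n
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_chi2 = math.erfc(math.sqrt(statistic / 2))
    # Two-sided exact test: discordant pairs follow Binomial(n, 0.5) under H0.
    k = min(b, c)
    p_exact = min(1.0, 2 * sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n)
    return {"b": b, "c": c, "statistic": statistic, "p_chi2": p_chi2, "p_exact": p_exact}


def five_by_two_cv_t(diffs):
    """5x2cv paired t statistic from five (fold 1, fold 2) score differences.

    Each pair holds the score difference between the two models on the two
    folds of one 2-fold replication. Under H0 the statistic is approximately
    t-distributed with 5 degrees of freedom, so |t| > 2.571 corresponds to
    rejection at the two-sided 0.05 level.
    """
    if len(diffs) != 5:
        raise ValueError("expected exactly five replications")
    variances = []
    for p1, p2 in diffs:
        mean = (p1 + p2) / 2
        variances.append((p1 - mean) ** 2 + (p2 - mean) ** 2)
    denom = math.sqrt(sum(variances) / 5)
    if denom == 0:
        raise ValueError("zero variance across folds; statistic is undefined")
    return diffs[0][0] / denom
```

McNemar's test only needs one set of held-out predictions per model, while the 5x2cv test requires retraining both models ten times, which is why the two are often reported together.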
Community
This paper presents himel7/bias-detector, a sentence-level bias detection model available on Hugging Face for detecting bias in news articles.
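A minimal sketch of how the model might be queried follows. It assumes the `himel7/bias-detector` repository loads with the standard `transformers` text-classification pipeline; the helper names `load_classifier` and `classify_sentences` are hypothetical, and the label strings the model returns are not stated on this page, so the code passes them through unchanged.

```python
def load_classifier(model_id="himel7/bias-detector"):
    """Build a text-classification pipeline for the model (downloads weights).

    The import is done lazily so the helper below can be used without
    transformers installed. Assumes the repo exposes a sequence-classification
    head that the pipeline can load.
    """
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify_sentences(sentences, classifier):
    """Run a classifier over sentences and pair each input with its prediction.

    `classifier` is any callable that takes a list of strings and returns one
    dict with "label" and "score" keys per input, which is the output shape of
    a transformers text-classification pipeline.
    """
    sentences = list(sentences)
    results = classifier(sentences)
    return [
        {"sentence": s, "label": r["label"], "score": r["score"]}
        for s, r in zip(sentences, results)
    ]
```

Calling `classify_sentences(["The senator's reckless plan will ruin the economy."], load_classifier())` would return one row per sentence. Because the model works at sentence level, an article needs to be split into sentences before classification.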