arxiv:2505.13010

To Bias or Not to Bias: Detecting bias in News with bias-detector

Published on May 19

Submitted by himel7 on May 21

Authors: Himel Ghosh, Ahmed Mosharafa, Georg Groh

Abstract

A RoBERTa-based model fine-tuned on the BABE dataset shows improved performance in sentence-level media bias detection compared to a domain-adaptively pre-trained DA-RoBERTa baseline, with improved attention to contextually relevant tokens.

AI-generated summary

Media bias detection is a critical task in ensuring fair and balanced information dissemination, yet it remains challenging due to the subjectivity of bias and the scarcity of high-quality annotated data. In this work, we perform sentence-level bias classification by fine-tuning a RoBERTa-based model on the expert-annotated BABE dataset. Using McNemar's test and the 5x2 cross-validation paired t-test, we show that our model achieves statistically significant performance improvements over a domain-adaptively pre-trained DA-RoBERTa baseline. Furthermore, attention-based analysis shows that our model avoids common pitfalls such as oversensitivity to politically charged terms and instead attends more meaningfully to contextually relevant tokens. For a comprehensive examination of media bias, we present a pipeline that combines our model with an existing bias-type classifier. Our method exhibits good generalization and interpretability, although it is constrained by sentence-level analysis and by dataset size, since larger and more advanced bias corpora are lacking. We discuss context-aware modeling, bias neutralization, and advanced bias-type classification as potential future directions. Our findings contribute to building more robust, explainable, and socially responsible NLP systems for media bias detection.
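For readers unfamiliar with McNemar's test, which the abstract uses to compare the two classifiers, the following is a minimal sketch and not the authors' implementation. It assumes the continuity-corrected chi-square form of the test, and the function name `mcnemar_test` and the toy labels are hypothetical illustrations, not data from the paper. The test only looks at sentences where exactly one of the two models is correct.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar's test for two classifiers on the same items.

    Returns (statistic, p_value). Hypothetical helper for illustration only.
    """
    # b: model A correct, model B wrong; c: model A wrong, model B correct.
    b = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a == t and m != t)
    c = sum(1 for t, a, m in zip(y_true, pred_a, pred_b) if a != t and m == t)
    if b + c == 0:
        # The models never disagree in correctness: no evidence of a difference.
        return 0.0, 1.0
    statistic = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


if __name__ == "__main__":
    # Toy labels (assumed, not from the paper): 1 = biased, 0 = not biased.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    pred_a = [1, 0, 1, 1, 0, 1, 1, 0]  # hypothetical fine-tuned model
    pred_b = [1, 1, 0, 1, 0, 1, 0, 0]  # hypothetical baseline
    print(mcnemar_test(y_true, pred_a, pred_b))
```

A small p-value suggests that the two models' error patterns differ by more than chance would explain. With few disagreements, an exact binomial variant of the test is usually preferred over this chi-square approximation.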