Understanding Gen Alpha Digital Language: Evaluation of LLM Safety Systems for Content Moderation
Abstract
Four leading AI models show significant comprehension failures in detecting masked harassment in Gen Alpha digital communication, underscoring the need for improved safety systems.
This research offers a unique evaluation of how AI systems interpret the digital language of Generation Alpha (Gen Alpha, born 2010-2024). As the first cohort raised alongside AI, Gen Alpha faces new forms of online risk due to immersive digital engagement and a growing mismatch between their evolving communication and existing safety tools. Their distinct language, shaped by gaming, memes, and AI-driven trends, often conceals harmful interactions from both human moderators and automated systems. We assess four leading AI models (GPT-4, Claude, Gemini, and Llama 3) on their ability to detect masked harassment and manipulation within Gen Alpha discourse. Using a dataset of 100 recent expressions drawn from gaming platforms, social media, and video content, the study reveals critical comprehension failures with direct implications for online safety. This work contributes: (1) a first-of-its-kind dataset capturing Gen Alpha expressions; (2) a framework for improving AI moderation systems for youth protection; (3) a multi-perspective evaluation spanning AI systems, human moderators, and parents, with direct input from Gen Alpha co-researchers; and (4) an analysis of how linguistic divergence increases youth vulnerability. The findings highlight the urgent need to redesign safety systems so they are attuned to youth communication, especially given Gen Alpha's reluctance to seek help when adults fail to understand their digital world. This study combines the insight of a Gen Alpha researcher with systematic academic analysis to address critical digital safety challenges.
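The abstract does not include the authors' evaluation harness, but the protocol it describes, prompting each model to judge whether an expression contains masked harassment and scoring the verdict against a human label, can be sketched as follows. This is a minimal reconstruction under stated assumptions, not the paper's code: `query_model`, the prompt wording, and the toy dataset are illustrative placeholders, and a real run would wrap each provider's client (OpenAI, Anthropic, Google, Meta) behind `query_model`.

```python
# Minimal sketch of the evaluation protocol described in the abstract
# (a reconstruction, not the authors' harness). Each model is asked for a
# YES/NO verdict on an expression, and accuracy is computed against
# human-annotated labels like those in the 100-expression dataset.
from typing import Callable

# Hypothetical prompt; the paper's actual wording is not given.
PROMPT = (
    "You are moderating a youth gaming chat. Does the following message "
    "contain harassment or manipulation, even if slang or coded language "
    "masks it? Answer only YES or NO.\n\nMessage: {text}"
)

def evaluate(model_name: str,
             query_model: Callable[[str, str], str],
             dataset: list[dict]) -> float:
    """Return the fraction of expressions the model classifies correctly.

    Dataset items look like {"text": "...", "harmful": True/False}.
    """
    correct = 0
    for item in dataset:
        reply = query_model(model_name, PROMPT.format(text=item["text"]))
        predicted_harmful = reply.strip().upper().startswith("YES")
        correct += predicted_harmful == item["harmful"]
    return correct / len(dataset)

if __name__ == "__main__":
    # Stub model for illustration only: it flags nothing, standing in for
    # a real API client such as openai's chat.completions.create.
    def stub_model(name: str, prompt: str) -> str:
        return "NO"

    toy_data = [
        {"text": "you're so skibidi lol", "harmful": False},
        {"text": "get ratioed, nobody wants you here", "harmful": True},
    ]
    print(evaluate("stub", stub_model, toy_data))  # 0.5 on the toy data
```

A binary YES/NO verdict is only one plausible scoring design; the paper's multi-perspective evaluation (AI systems, moderators, parents) suggests the full protocol compares these model verdicts against several human annotator groups rather than a single gold label.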
Community
The primary author is a 13-year-old middle schooler analyzing how well LLMs understand Gen Alpha language and the implications for AI moderation. The work will be presented at ACM FAccT in June 2025.
The following related papers were recommended by the Semantic Scholar API:
- Reflexive Prompt Engineering: A Framework for Responsible Prompt Engineering and Interaction Design (2025)
- Phare: A Safety Probe for Large Language Models (2025)
- VLM as Policy: Common-Law Content Moderation Framework for Short Video Platform (2025)
- TaMPERing with Large Language Models: A Field Guide for using Generative AI in Public Administration Research (2025)
- RealHarm: A Collection of Real-World Language Model Application Failures (2025)
- Linguistic Comparison of AI- and Human-Written Responses to Online Mental Health Queries (2025)
- MoMoE: Mixture of Moderation Experts Framework for AI-Assisted Online Governance (2025)