Article

DETECTING CYBERBULLYING BOTS USING HYBRID CNN AND ENHANCED TEXT FEATURES

Authors: K. Manohar Rao, Hruthik Sheelam, Aravind Seelam, Ajay Erumandla

DOI : http://doi.org/10.63590/jsetms.2025.v02.i07(S).pp182-189

Bot-driven cyberbullying has become a growing concern in digital communication, with statistics indicating that over 59% of teenagers have encountered online harassment, most commonly on social media platforms, which account for over 70% of reported cases. Multi-class cyberbullying datasets typically contain up to 25,000 labeled entries across categories such as insults, threats, racism, and sexism, and often suffer from severe class imbalance and linguistic diversity. Manual detection is inconsistent, subjective, and difficult to scale against the rapidly growing volume of user-generated content. Traditional machine learning models struggle with shallow feature representations, poor performance on minority classes, and difficulty in detecting nuanced or implicit abuse. Additionally, current literature often overlooks the integration of deep ensemble learning with optimized preprocessing and contextual analysis. To overcome these challenges, this study introduces a Hybrid Multiclass Unmasking Bot Classification framework. The proposed approach combines feature-enriched N-gram extraction with a dual deep learning architecture incorporating both Deep Neural Networks (DNN) and Convolutional Neural Networks (CNN). The pipeline begins with dataset ingestion and Exploratory Data Analysis (EDA) to understand class distribution and data imbalance. Text preprocessing, including tokenization, lemmatization, and noise elimination, is followed by vectorization using TF-IDF over unigrams and bi-grams, so that both individual terms and short word sequences are represented. The DNN component captures abstract semantic relationships, while the CNN detects local linguistic cues. This dual-stream structure promotes robust learning across varying categories of cyberbullying bots. Performance evaluation demonstrates that the proposed system significantly outperforms conventional classifiers in terms of precision, recall, and F1-score across all classes.
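
The vectorization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer is a simple stand-in for their tokenization, lemmatization, and noise-removal stages, the smoothed IDF formula is an assumed variant, and the function names (`tokenize`, `ngrams`, `tfidf_vectors`) and the toy sentences are hypothetical. The resulting vectors are the kind of input a DNN or CNN classifier could consume.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphabetic runs; a simplified stand-in for the
    # tokenization, lemmatization, and noise-removal steps (assumption).
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, max_n=2):
    # Emit unigrams and bi-grams as space-joined strings.
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(docs, max_n=2):
    # Raw term counts per document over unigrams and bi-grams.
    counts = [Counter(ngrams(tokenize(d), max_n)) for d in docs]
    vocab = sorted(set().union(*counts))
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for c in counts for term in c)
    # Smoothed IDF, log((1 + N) / (1 + df)) + 1 (assumed variant).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for c in counts:
        row = [c[t] * idf[t] for t in vocab]
        # L2-normalize each row so document length does not dominate.
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        vectors.append([x / norm for x in row])
    return vocab, vectors


# Toy, hypothetical inputs; not drawn from the paper's dataset.
docs = ["you are so stupid", "have a nice day", "you are a nice person"]
vocab, vectors = tfidf_vectors(docs)
```

In this sketch, a term that appears in only one document (such as "stupid") receives a higher weight than one shared across documents (such as "you"), and bi-grams such as "you are" become features alongside single words.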

