Artificial Intelligence News & Insights

Can AI Detect Hate Speech in Local Languages? This Groundbreaking Research Says Yes

A new AI system trained on 29,361 real social media comments achieves an F1-score of 0.97 in detecting hostile content in Gujarati. It is one of the first large-scale efforts for low-resource Indian languages.

By Dr. Jagruti Boda | Research Area: Hostile Post Detection, Gujarati NLP, Transformer Models

The Growing Threat of Harmful Content on Social Media

Social media has become an important space where people share opinions, news, and ideas. Unfortunately, it is also increasingly used to spread harmful content such as hate speech, fake news, offensive language, and defamation. Because millions of posts are shared every day, human moderators cannot manually review all of this content. There is therefore a growing need for automated systems that can detect harmful posts.

Why Gujarati? Bridging the NLP Resource Gap for 55 Million Speakers

My research focuses on detecting hostile content in the Gujarati language, which is spoken by more than 55 million people. Despite its wide usage, Gujarati has very limited datasets and research resources compared to English and other major languages.

Introducing GuHPD: A New Gujarati Hostile Post Detection Dataset

To address this gap, I developed a new dataset called GuHPD (Gujarati Hostile Post Detection) containing 29,361 Gujarati social media comments. Each comment was manually labeled as either Hostile or Non-Hostile, and hostile posts were further categorized into Hate, Fake, Offensive, and Defamation.
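
The two-level labeling scheme described above can be pictured as a small record type. The following is only a sketch: the class name, field names, and validation rules (for example, that every hostile comment carries at least one fine-grained label) are assumptions made for illustration, not the published GuHPD file format.

```python
from dataclasses import dataclass

# The four fine-grained categories named in the article.
FINE_LABELS = frozenset({"Hate", "Fake", "Offensive", "Defamation"})


@dataclass(frozen=True)
class LabeledComment:
    """One annotated comment (hypothetical structure, for illustration)."""
    text: str
    hostile: bool                         # coarse label: Hostile vs. Non-Hostile
    fine_labels: frozenset = frozenset()  # zero or more fine-grained labels

    def __post_init__(self) -> None:
        unknown = self.fine_labels - FINE_LABELS
        if unknown:
            raise ValueError(f"unknown labels: {sorted(unknown)}")
        if not self.hostile and self.fine_labels:
            raise ValueError("a non-hostile comment cannot carry fine-grained labels")
        # Assumption: each hostile comment has at least one fine-grained label.
        if self.hostile and not self.fine_labels:
            raise ValueError("a hostile comment needs at least one fine-grained label")
```

Storing the fine-grained labels as a set rather than a single value lets one comment be, for example, both Hate and Offensive, which matches the multi-label setting reported later in the article.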

Smart Data Augmentation: How Transliteration Outperformed Translation

Because labeled Gujarati data is scarce, I used data augmentation to expand the dataset. One method was back-translation: translating Gujarati text into English and back again. The other was round-trip transliteration: converting Gujarati text into Latin (English) script and then back into Gujarati script. The transliteration approach performed better because it preserved the meaning of the original sentences more effectively.
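
Both augmentation methods share the same round-trip shape: send each comment through a forward step and a backward step, then keep the result as a new training example with the original label. The sketch below shows only that shape. The function name is hypothetical, and the `forward` and `backward` callables are placeholders for a real translation or transliteration tool, which the article does not name.

```python
from typing import Callable, List, Tuple

Sample = Tuple[str, str]  # (comment text, label)


def round_trip_augment(
    samples: List[Sample],
    forward: Callable[[str], str],
    backward: Callable[[str], str],
) -> List[Sample]:
    """Return the original samples plus round-trip variants not already present."""
    augmented = list(samples)
    seen = {text for text, _ in samples}
    for text, label in samples:
        variant = backward(forward(text))
        if variant not in seen:  # skip round trips that reproduce existing text
            seen.add(variant)
            augmented.append((variant, label))  # the variant inherits the label
    return augmented


# Toy stand-ins only: lowercasing and title-casing play the role of the
# forward and backward steps so the example runs without external tools.
data = [("bad POST", "Hostile")]
print(round_trip_augment(data, str.lower, str.title))
```

A round trip that changes the meaning would attach the wrong label to the new example, which is why meaning preservation decides which method works better.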

From Preprocessing to Prediction: The Machine Learning Pipeline

After cleaning and preprocessing the data, including converting emojis into descriptive text, I trained several machine learning models to detect hostile content. Traditional models showed reasonable performance, and ensemble methods and transformer-based models such as Multilingual BERT, IndicBERT, and GujaratiBERT improved on them.
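
As an illustration of the emoji step, the sketch below replaces each symbol character with its official Unicode name using only Python's standard library. This is an assumption about how such a step could look, not the preprocessing code used in the study, and the function name is hypothetical.

```python
import unicodedata


def emojis_to_text(text: str) -> str:
    """Replace symbol characters (Unicode category 'So') with their lowercase names."""
    parts = []
    for ch in text:
        if unicodedata.category(ch) == "So":
            name = unicodedata.name(ch, "")
            if name:
                parts.append(f" {name.lower()} ")
                continue
        parts.append(ch)
    # Collapse the extra whitespace introduced around the inserted names.
    return " ".join("".join(parts).split())


print(emojis_to_text("nice \U0001F600"))  # prints "nice grinning face"
```

Category `So` is broader than emoji (it also covers signs such as the copyright symbol), and multi-character emoji sequences are handled one code point at a time, so a production pipeline would likely need a dedicated emoji resource.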

GujaratiBERT Achieves F1-Scores of 0.97 (Coarse) and 0.90 (Fine-Grained)

Among these, GujaratiBERT achieved the best performance, reaching an F1-score of 0.97 for binary (Hostile vs. Non-Hostile) classification and 0.90 for fine-grained multi-label classification.
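
For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision P and recall R, that is, 2PR / (P + R). The sketch below computes it for the binary case and, for the multi-label case, averages per-label F1 scores. The macro averaging shown here is an assumption: the article does not say which averaging scheme produced the 0.90 figure, and the function names are illustrative.

```python
from typing import Iterable, Sequence, Set


def f1_score(y_true: Sequence, y_pred: Sequence, positive=1) -> float:
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(true_sets: Sequence[Set[str]], pred_sets: Sequence[Set[str]],
             labels: Iterable[str]) -> float:
    """Unweighted mean of per-label F1 scores (assumed averaging scheme)."""
    scores = []
    for label in labels:
        y_true = [label in s for s in true_sets]
        y_pred = [label in s for s in pred_sets]
        scores.append(f1_score(y_true, y_pred, positive=True))
    return sum(scores) / len(scores)


# 2 true positives, 1 false positive, 1 false negative: P = R = 2/3, so F1 = 2/3.
print(f1_score([1, 1, 1, 0, 0], [1, 1, 0, 1, 0]))
```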

Making AI Explainable: How LIME Reveals Why a Post Is Flagged as Harmful

To make the system more transparent and trustworthy, I also used an Explainable AI technique called LIME, which highlights the specific words in a post that influence the model’s decision. This helps researchers and policymakers understand why the system identifies a post as harmful.
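
LIME itself fits a small interpretable model on many randomly perturbed copies of a post. The sketch below uses a simpler relative of that idea, leave-one-word-out perturbation, to show what a word-level explanation looks like: each word is scored by how much the hostility score drops when that word is removed. It is not the LIME algorithm, and `toy_score` with its word list is a hypothetical stand-in for a trained classifier.

```python
from typing import Callable, List, Tuple


def word_importance(text: str, score: Callable[[str], float]) -> List[Tuple[str, float]]:
    """Rank words by the drop in score when each one is removed from the text."""
    words = text.split()
    base = score(text)
    ranked = []
    for i, word in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])
        ranked.append((word, base - score(reduced)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


def toy_score(text: str) -> float:
    """Hypothetical stand-in for a classifier: share of words on a tiny blocklist."""
    blocklist = {"idiot", "liar"}  # illustrative only
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in blocklist for w in words) / len(words)


for word, weight in word_importance("you are a liar", toy_score):
    print(f"{word}: {weight:+.3f}")
```

Words with the largest positive weight are the ones pushing the post toward the hostile class, which is the kind of highlight a reviewer would inspect.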

A Blueprint for Safer Online Spaces and Low-Resource Indian Language AI

Overall, this research provides one of the first large-scale resources and AI approaches for detecting hostile content in Gujarati social media. The work can help support safer online environments, digital literacy efforts, and responsible AI systems, and it can also be extended to other Indian languages with limited resources.

Read the full research here:

https://www.ije.ir/article_187454.html

Dr. Jagruti Boda Sarkhedi is an Assistant Professor in the Department of Computer and IT Engineering at UPL University of Sustainable Technology, Gujarat. She completed her Ph.D. at Gujarat Technological University (GTU) in the area of Artificial Intelligence and Natural Language Processing. Her research focuses on hostile content detection in Indian languages, particularly Gujarati social media analysis using machine learning and deep learning techniques. She has published research papers in Scopus- and Web of Science-indexed journals and has received the Best Research Paper Award for her work in AI-based social media analysis.
