A CALIBRATION-AWARE, SHAP-GROUNDED AUDIT OF CROSS-CORPUS MISINFORMATION CLASSIFICATION ON SOCIAL MEDIA

Authors

  • Dr. Musarat Karim
  • Dr. Mustafa Hameed
  • Ms. Alisha Fida

Keywords:

misinformation detection, fake news, social media, cross-domain transfer, calibration, explainable AI, TreeSHAP, shortcut learning, TF-IDF, LightGBM, domain shift, COVID-19 infodemic

Abstract

Automated misinformation detection is increasingly proposed as front-line infrastructure for social media platforms, and the headline accuracy numbers look reassuring: a plain bag-of-words classifier clears AUC 0.97 on a COVID-19 post corpus. We ask the question those numbers leave open: does a detector trained on one stream of misinformation work on the next? We test this on three public corpora that circulated on social platforms: Constraint (10,144 COVID-19 posts after cleaning), LIAR (12,783 PolitiFact political claims), and GossipCop (20,360 celebrity-gossip headlines), each unified to a binary real/fake target. Four classical models (Multinomial Naive Bayes, Logistic Regression, a calibrated Linear SVM, and LightGBM) over a TF-IDF representation were evaluated within each corpus with 5-fold stratified cross-validation and bootstrap 95% CIs, then applied across all nine train/test corpus pairs, and finally explained with TreeSHAP. The within-corpus picture matches the literature: AUC 0.97-0.98 on Constraint, 0.84-0.88 on GossipCop, and a hard 0.62-0.66 on LIAR. The transfer picture does not survive contact with a second corpus. Off-diagonal AUC collapses toward chance: averaged over the six directed transfers, it is 0.57, a mean drop of 0.26 from the matching within-corpus score, and pooling two corpora to predict the third (leave-one-corpus-out) does not rescue it. Calibration degrades even more than discrimination: expected calibration error inflates several-fold under shift and is worst where the class prior moves most (a detector trained on balanced data scoring GossipCop, which is 24% fake), though recalibrating on 200 target rows repairs most of it (mean ECE 0.34 → 0.04).
TreeSHAP explains why: the top-weighted tokens are pandemic terms, political actors, and celebrity names, which are topic and source markers rather than credibility cues, and the top-50 SHAP lexicons barely overlap across corpora (Jaccard 0.06-0.14), against a within-corpus fold-stability baseline three to five times higher. The detectors learn the topic, not the truth.
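
The two summary statistics quoted above, expected calibration error (ECE) and the Jaccard overlap of top-token lexicons, can be illustrated with a minimal sketch. It assumes an equal-width, 10-bin ECE computed on the predicted probability of the "fake" class; the abstract does not state which binning scheme or ECE variant the authors used, so the function names, the bin count, and the toy tokens below are illustrative assumptions, not the paper's implementation.

```python
from typing import Iterable, Sequence


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Equal-width binned ECE for a binary classifier (assumed variant).

    probs  : predicted probability of the positive ("fake") class
    labels : 0/1 ground-truth labels
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("at least one prediction is required")

    # Assign each prediction to a bin; p == 1.0 goes into the last bin.
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    # Weighted average of |mean predicted prob - observed positive rate|.
    n = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        pos_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / n) * abs(mean_prob - pos_rate)
    return ece


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard overlap |A ∩ B| / |A ∪ B| between two token lexicons."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


if __name__ == "__main__":
    # Hypothetical toy inputs, not data from the paper.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
    print(jaccard(["covid", "vaccine", "mask"],
                  ["vaccine", "senator", "mask", "tax"]))
```

In this reading, a model that outputs 0.9 for four posts of which three are actually fake has an ECE of about 0.15, and two top-token lists sharing two of five distinct tokens have a Jaccard overlap of 0.4. For the lexicon comparison in the abstract, each list would hold the 50 tokens with the highest SHAP importance for one corpus.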

Published

15-06-2026

How to Cite

A CALIBRATION-AWARE, SHAP-GROUNDED AUDIT OF CROSS-CORPUS MISINFORMATION CLASSIFICATION ON SOCIAL MEDIA. (2026). Journal of Media Horizons, 7(6), 106-123. https://jmhorizons.com/index.php/journal/article/view/1620