AI-DRIVEN FACT-CHECKING AND DISINFORMATION: OPPORTUNITIES, LIMITATIONS, AND IMPLEMENTATION IN NEWSROOMS
Abstract
AI-generated text and fast-moving platforms strain newsroom verification. We introduce a guardrailed, provenance-first fact-checking system that integrates sparse-dense hybrid retrieval, cross-encoder re-ranking, calibrated veracity classification, and human-in-the-loop review. The system employs source whitelisting, time filters, adversarial defenses, and citation integrity checks (URL, timestamp, snippet), and surfaces uncertainty estimates and rationale summaries through a CMS plugin. On a temporally held-out newsroom test set (N = 1,500 claims), the system achieves Micro-F1 0.82, Macro-F1 0.79, AUROC 0.89, NDCG@10 0.82, p95 latency 3.6 s, a hallucination rate of 3.7%, and citation correctness of 95.8%.
In a within-subjects newsroom pilot (N = 32), AI assistance reduced task time by about 29%, lowered perceived workload, and increased appropriate use of the system, with desk-level throughput gains (+82% claims/hour; +60% stories/day) and fewer escalations. Error analysis indicates that residual failures concentrate in proximity errors and insufficient evidence during breaking news. We discuss governance, auditability, and safe learning from editor feedback. Findings show that AI complements, but does not substitute for, editorial judgment.
Keywords: Fact-Checking; Disinformation; Retrieval-Augmented Generation (RAG); Provenance; Hybrid Retrieval; Human-in-the-Loop; Calibration and Trust.