Can AI Spot Deepfakes? FactGuard’s New Approach to Video Truth

Author: Denis Avetisyan


A new agentic framework uses artificial intelligence to actively investigate and verify the authenticity of video content, moving beyond passive detection.

FactGuard distinguishes itself from existing video misinformation detection methods through both enhanced explainability and a demonstrated improvement in overall performance metrics.

FactGuard employs multimodal large language models and reinforcement learning to iteratively reason, acquire external evidence, and adapt to uncertainty in video misinformation detection.

Despite advances in multimodal reasoning, large language models often struggle with nuanced video misinformation detection, particularly when evidence is fragmented or requires external validation. To address this, we introduce ‘FactGuard: Agentic Video Misinformation Detection via Reinforcement Learning’, a novel framework that formulates verification as an iterative, agentic process. FactGuard leverages reinforcement learning to optimize tool usage and selectively acquire external evidence, enabling progressive refinement of reasoning and improved decision-making under uncertainty. Could this approach pave the way for more robust and adaptable systems capable of discerning truth from falsehood in the ever-evolving landscape of online video content?


The Erosion of Verifiable Reality

The rapid increase in digitally altered and deliberately misleading videos presents a growing crisis for informed public discourse. This proliferation of “deepfakes” and other forms of video manipulation erodes trust in visual evidence, once considered a relatively reliable source of information. Beyond the potential for immediate deception, the widespread availability of these fabricated realities cultivates a climate of uncertainty, where distinguishing between genuine events and constructed narratives becomes increasingly difficult. This breakdown of trust extends beyond specific instances of misinformation, impacting faith in institutions, media, and even the veracity of shared experiences, ultimately threatening the foundations of a well-informed society and hindering constructive dialogue on critical issues.

Current automated systems for identifying video misinformation frequently falter when confronted with subtleties that require deeper comprehension. These methods often rely on identifying superficial inconsistencies or pixel-level manipulations, proving inadequate when faced with videos that present misleading narratives through carefully selected footage or deceptive editing: content that appears technically sound but distorts the truth. A key limitation lies in their inability to perform the kind of contextual reasoning humans naturally employ; discerning the intent behind the video, verifying claims against external knowledge, and understanding the broader socio-political landscape are all crucial steps often missing from algorithmic analysis. Consequently, these systems struggle with satire, parody, or videos employing sophisticated rhetorical techniques, leading to both false accusations of misinformation and the failure to detect genuine deception.

Initial attempts to automatically identify video misinformation often prove unreliable, generating a substantial number of both false positives – incorrectly flagging authentic videos as fake – and false negatives – failing to detect genuinely manipulated content. This imprecision stems from the difficulty in discerning subtle manipulations or understanding the contextual nuances crucial for accurate assessment. Consequently, researchers are actively pursuing more sophisticated techniques that move beyond simple pixel-level analysis and incorporate reasoning capabilities, contextual understanding, and potentially, knowledge about the video’s origins and intended message, to drastically reduce these errors and build truly trustworthy detection systems.

Despite advancements in artificial intelligence, current state-of-the-art discriminative models – including BERT, TikTec, FANVM, SVFEND, and FakingRec – consistently demonstrate limitations when tasked with identifying manipulated videos. These models often rely on identifying superficial inconsistencies or artifacts, proving vulnerable to increasingly sophisticated forgery techniques that mimic natural video characteristics. The challenge lies in their inability to fully grasp contextual reasoning and nuanced visual cues, leading to frequent false positives – incorrectly flagging authentic videos as fake – and, more critically, false negatives where convincing misinformation goes undetected. This deficiency underscores the need for novel approaches that move beyond pixel-level analysis and incorporate a deeper understanding of video content and its real-world implications, as solely relying on these models leaves the public susceptible to deceptive media.

FactGuard improves misinformation verification by framing it as an adaptive, uncertainty-aware process that leverages external tools and avoids the cross-modal hallucination issues common in reasoning-based methods which can falsely treat internal assumptions as grounded evidence.

An Agentic Framework for Truth Ascertainment

FactGuard addresses video misinformation through an agentic framework, modeling verification not as a single assessment, but as a series of sequential decisions. This iterative process allows the system to refine its understanding and conclusions over multiple steps. Rather than passively receiving information, FactGuard actively formulates hypotheses, seeks relevant evidence to support or refute them, and updates its internal state based on the gathered information. Each iteration involves analyzing the video content, identifying claims, formulating queries for external knowledge sources, and integrating the retrieved evidence into a confidence-weighted assessment of the claim’s veracity. This cyclical approach allows FactGuard to progressively build a more robust and nuanced understanding of the video’s content and its factual accuracy.
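The cycle described above can be sketched as a simple loop. This is a minimal illustration, not FactGuard's actual implementation: the helper names (`analyze_claims`, `retrieve_evidence`), the confidence increments, and the stopping threshold are all hypothetical stand-ins.

```python
def analyze_claims(video):
    # Stub: would extract checkable claims from the video's
    # transcript, frames, and metadata.
    return video.get("claims", [])

def retrieve_evidence(claim):
    # Stub: would query an external knowledge source; here a claim
    # "supports" itself if it is tagged "(true)" for illustration.
    return {"supports": claim.endswith("(true)")}

def verify(video, max_steps=3, threshold=0.9):
    """Iteratively gather evidence, updating a confidence score
    until it crosses a decision threshold or claims run out."""
    confidence = 0.5  # start maximally uncertain
    for claim in analyze_claims(video)[:max_steps]:
        evidence = retrieve_evidence(claim)
        # Shift confidence toward 'real' or 'fake' per the evidence.
        confidence += 0.25 if evidence["supports"] else -0.25
        if confidence >= threshold or confidence <= 1 - threshold:
            break  # confident enough to stop early
    verdict = "real" if confidence >= 0.5 else "fake"
    return verdict, confidence
```

The key structural point is that evidence acquisition and the verdict are interleaved, so the agent can stop as soon as its confidence justifies a decision.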

FactGuard utilizes Multimodal Large Language Models (MLLMs) to process and integrate information from diverse input modalities, including video frames, audio transcripts, and associated text. These MLLMs, trained on extensive datasets of paired visual and textual data, enable the system to perform tasks such as object recognition, scene understanding, and speech-to-text conversion. Crucially, the models are not simply passively receiving data; they provide reasoning capabilities allowing for the identification of salient features, the detection of inconsistencies, and the synthesis of evidence across modalities. This robust multimodal understanding is fundamental to FactGuard’s ability to assess the veracity of video content and distinguish between factual and misleading information.

Agentic Reasoning within FactGuard enables proactive information gathering beyond initial video content analysis. Rather than passively receiving data, the system formulates search queries based on claims identified in the video, utilizing these queries to access external knowledge sources – including search engines and knowledge bases. Retrieved evidence is then analyzed by the Multimodal Large Language Model (MLLM) to corroborate or refute the initial claims. This iterative process of hypothesis generation, evidence retrieval, and evidence evaluation allows FactGuard to dynamically expand its understanding and move beyond the limitations of solely relying on the provided video and associated metadata, thereby enhancing verification accuracy.

FactGuard incorporates uncertainty awareness by quantifying confidence levels associated with each verification step and the final conclusion. This is achieved through the utilization of probabilistic reasoning and calibration techniques applied to the outputs of the Multimodal Large Language Model (MLLM). Specifically, the system doesn’t simply output a binary true/false assessment; instead, it provides a confidence score, representing the likelihood of the conclusion being correct, based on the evidence considered and the model’s internal estimations. These confidence scores are updated iteratively as the agent seeks and integrates additional evidence, allowing for a nuanced evaluation of the claim’s veracity and highlighting areas where further investigation is required. The system outputs these confidence scores alongside the verified claim, enabling users to assess the reliability of the information provided.
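One standard way to realize this kind of iterative confidence update is a Bayesian posterior over the claim's truth. The sketch below is generic probability arithmetic, not FactGuard's specific calibration method; the likelihood values are made up for illustration.

```python
def bayes_update(prior, likelihood_true, likelihood_false):
    """Posterior probability that a claim is true after one piece of
    evidence, given the evidence's likelihood under each hypothesis."""
    numerator = prior * likelihood_true
    return numerator / (numerator + (1 - prior) * likelihood_false)

# Start maximally uncertain, then integrate two pieces of evidence
# that are each more likely if the claim is true.
p = 0.5
for lik_true, lik_false in [(0.8, 0.3), (0.7, 0.4)]:
    p = bayes_update(p, lik_true, lik_false)
# p now reflects accumulated support for the claim being true.
```

Each update leaves an auditable confidence trail, which is what lets the system report a score alongside its verdict rather than a bare true/false label.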

FactGuard employs an agentic verification pipeline, combining supervised fine-tuning and decision-aware reinforcement learning, to assess input ambiguity, selectively utilize external tools, and refine reasoning for calibrated, risk-sensitive decision-making.

Constructing a Verifiable Chain of Evidence

FactGuard integrates external tools, specifically FactProbe and ClipScout, to bolster evidence acquisition during the verification process. FactProbe functions as a knowledge retrieval system, querying a database of verified claims and related information to identify supporting or contradicting evidence. ClipScout performs visual analysis, identifying potentially manipulated or out-of-context video segments by comparing them against a vast library of known media. These tools allow FactGuard to move beyond internal knowledge and access a wider range of data sources, enhancing the robustness and comprehensiveness of its evidence base and enabling verification of claims relying on visual content.
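A plausible shape for this tool integration is a dispatch table keyed by evidence type. FactProbe and ClipScout are the tools named in the paper, but their real interfaces are not described here, so the stubs below are purely hypothetical illustrations of the routing idea.

```python
def fact_probe(claim):
    # Stub: would query a database of verified claims for
    # supporting or contradicting evidence.
    return [f"retrieved evidence for: {claim}"]

def clip_scout(segment_id):
    # Stub: would compare a video segment against a library of
    # known media to spot manipulated or out-of-context footage.
    return [f"visual matches for segment {segment_id}"]

# Route each evidence request to the appropriate tool.
TOOLS = {"textual": fact_probe, "visual": clip_scout}

def gather_evidence(request_type, query):
    return TOOLS[request_type](query)
```

Separating the routing from the tools themselves keeps the agent's policy free to decide *which* tool to call at each step, which is exactly the choice the reinforcement learning component later optimizes.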

Chain-of-Thought (CoT) prompting is a technique used to enhance the reasoning capabilities of large multimodal language models (MLLMs) by explicitly eliciting a series of intermediate reasoning steps before arriving at a final conclusion. Instead of directly requesting a determination regarding video misinformation, the MLLM is prompted to articulate the rationale behind its assessment, detailing each step in the evaluation process. This approach improves transparency by making the model’s decision-making process observable and allows for error analysis at each stage. Furthermore, by forcing the MLLM to decompose the problem into smaller, more manageable steps, CoT prompting has been shown to improve the overall reliability and accuracy of its judgments, as it reduces the likelihood of relying on spurious correlations or superficial patterns.
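A CoT prompt for this task might look like the template below. The step wording is illustrative; FactGuard's actual prompts are not reproduced in this article.

```python
# Hypothetical chain-of-thought prompt template for claim verification.
COT_TEMPLATE = """You are verifying a claim made in a video.
Claim: {claim}
Transcript excerpt: {transcript}

Reason step by step before answering:
1. Restate what the claim asserts.
2. List the evidence in the transcript that bears on it.
3. Note any missing evidence that external search should supply.
4. State your verdict (supported / refuted / uncertain) with a confidence score.
"""

def build_cot_prompt(claim, transcript):
    """Fill the template for one claim/transcript pair."""
    return COT_TEMPLATE.format(claim=claim, transcript=transcript)
```

Because the model must emit each numbered step, its intermediate reasoning becomes inspectable, which is the transparency benefit the paragraph above describes.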

Supervised Fine-Tuning (SFT) is a crucial component of FactGuard, involving the training of a Multimodal Large Language Model (MLLM) on a dataset of labeled video misinformation examples. This process adapts the pre-trained MLLM to specifically recognize patterns and indicators associated with false or misleading video content. The labeled dataset includes examples of both misinformation and factual videos, allowing the MLLM to learn discriminative features. SFT optimizes the MLLM’s parameters to maximize its accuracy in identifying misinformation, improving performance beyond what is achievable with zero-shot or few-shot prompting alone. The resulting model is better equipped to analyze video content, extract relevant information, and make informed judgments about its veracity.
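The SFT objective can be illustrated with a deliberately tiny model. Real SFT updates an MLLM's weights on multimodal inputs; the toy logistic classifier below only demonstrates the same mechanism of minimizing cross-entropy on labeled examples, with made-up feature vectors.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sft_step(w, b, x, y, lr=0.5):
    """One gradient step on binary cross-entropy for one labeled example."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    grad = p - y  # dL/dz for sigmoid + cross-entropy
    w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
    b = b - lr * grad
    return w, b

# Toy labeled dataset: feature vector -> label (1 = misinformation).
data = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
w, b = [0.0, 0.0], 0.0
for _ in range(300):          # repeated passes over the labeled data
    for x, y in data:
        w, b = sft_step(w, b, x, y)
```

After training, the model's predictions on the labeled patterns approach the labels, which is the sense in which SFT "adapts" a pre-trained model to the misinformation-detection task.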

FactGuard’s Iterative Decision-Making process involves a cyclical refinement of initial assessments based on incoming data and evaluative feedback. Following an initial claim evaluation, the system actively seeks supplementary evidence via external tools. This acquired information is then integrated into the reasoning process, potentially modifying the original determination. Subsequently, the system’s output is subject to review, and any discrepancies or inaccuracies identified during evaluation are used to update internal parameters and improve future performance; this feedback loop ensures continuous adaptation and enhances the accuracy of misinformation detection over time.

FactGuard demonstrates improved reasoning interpretability by generating more coherent and evidence-grounded traces compared to Qwen2.5-VL and Fact-R1 when making correct predictions.

Adaptive Learning for Robust Verification

FactGuard utilizes Reinforcement Learning (RL) to dynamically refine its video verification policy. This approach moves beyond static rule-based systems by allowing the system to learn optimal verification strategies through interaction with data. The RL agent receives feedback – rewards or penalties – based on the accuracy of its verification decisions, iteratively adjusting its policy to maximize rewards and minimize errors. This adaptive process enables FactGuard to improve its performance over time and generalize effectively to new and unseen video content, addressing the challenges posed by evolving forgery techniques and diverse video characteristics.

Group Relative Policy Optimization (GRPO) is a policy-gradient reinforcement learning algorithm selected for its sample efficiency and training stability. Rather than learning a separate value function to estimate baselines, GRPO samples a group of candidate outputs for each input and computes each output's advantage relative to the group: the reward minus the group mean, normalized by the group's standard deviation. Policy updates then follow a clipped surrogate objective, typically with a divergence penalty against a reference policy to keep updates conservative. This approach allows FactGuard's verification agent to learn a robust policy for identifying manipulated videos by comparing alternative verification rollouts against one another, resulting in faster convergence and enhanced performance.
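The group-relative advantage at the heart of GRPO is a short computation. This sketch shows only that normalization step, with illustrative reward values; the surrounding clipped-objective update is omitted.

```python
import statistics

def group_relative_advantages(rewards):
    """Advantage of each sampled response relative to its group:
    (reward - group mean) / group standard deviation."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero spread
    return [(r - mean) / std for r in rewards]

# Four sampled verification rollouts for one video, each scored by
# the reward function (values made up for illustration).
adv = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Because the baseline is the group itself, no learned critic is needed: above-average rollouts get positive advantage and are reinforced, below-average ones are suppressed.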

FactGuard’s reinforcement learning agent is trained using a cost function that differentiates between the penalties associated with false positives and false negatives. This approach, termed Asymmetric Error Costs, recognizes that incorrectly flagging authentic video as manipulated (a false positive) carries different implications than failing to detect manipulated video (a false negative). Specifically, the system assigns a higher cost to false negatives, prioritizing the accurate identification of manipulated content even at the expense of a potentially increased false positive rate. This weighting is implemented directly within the reward function used to train the GRPO agent, effectively guiding the model to minimize the more severe error type during policy optimization.
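An asymmetric reward of this kind is straightforward to state in code. The cost values below are illustrative placeholders, not the weights used in the paper; the only property that matters is that a false negative is penalized more heavily than a false positive.

```python
FN_COST = 2.0  # predicted "real" but the video was manipulated
FP_COST = 1.0  # predicted "fake" but the video was authentic

def reward(predicted_fake, actually_fake):
    """Reward for one verification decision under asymmetric error costs."""
    if predicted_fake == actually_fake:
        return 1.0                 # correct decision
    if actually_fake:
        return -FN_COST            # missed manipulated content: worse
    return -FP_COST                # false alarm on authentic content
```

Folding the asymmetry into the reward, rather than into a post-hoc decision threshold, means the policy itself learns risk-sensitive behavior during optimization.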

FactGuard’s performance was assessed using established benchmark datasets for video verification, specifically FakeSV, FakeTT, and FakeVV. Evaluations on these datasets demonstrate that FactGuard achieves a statistically significant improvement in accuracy compared to previously published methods in the field. Quantitative results indicate a reduction in both false positive and false negative rates across all three datasets, confirming the effectiveness of the implemented reinforcement learning approach and asymmetric error cost function. Detailed performance metrics, including precision, recall, and F1-score, are reported for each dataset to facilitate direct comparison with existing literature.
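The reported metrics follow their standard definitions from a confusion matrix; the counts below are invented for illustration, not results from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard metrics from true positive, false positive, and
    false negative counts."""
    precision = tp / (tp + fp)   # of flagged videos, how many were fake
    recall = tp / (tp + fn)      # of fake videos, how many were caught
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Example confusion-matrix counts (illustrative only).
p, r, f1 = precision_recall_f1(tp=80, fp=10, fn=20)
```

Reporting all three is what makes the asymmetric-cost trade-off visible: a policy that tolerates more false positives to catch more fakes shows up as higher recall at somewhat lower precision.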

FactGuard utilizes a two-turn prompting strategy to verify the factual consistency of generated text.

A Paradigm Shift in Trustworthy Media

FactGuard represents a significant advancement in the detection of video misinformation, consistently achieving state-of-the-art results when compared to existing discriminative models. Rigorous evaluation on key benchmarks demonstrates its superior performance over systems like GPT-4o and Fact-R1, indicating a heightened capacity to accurately identify manipulated or false content within video formats. This improved accuracy isn’t merely incremental; it establishes a new standard for automated fact-checking in video, offering a more reliable defense against the spread of false narratives and bolstering efforts to maintain informational integrity online. The consistent outperformance suggests a robust architecture capable of discerning subtle cues indicative of misinformation, exceeding the capabilities of previously established models.

The architecture underpinning FactGuard extends beyond simple detection, leveraging an agentic framework designed to proactively address evolving misinformation tactics. This system doesn’t merely identify false claims; it reasons through complex scenarios, adapting its verification strategies as manipulation techniques become more nuanced. By simulating a deliberate investigation – formulating hypotheses, seeking corroborating evidence, and evaluating source credibility – the framework mimics human critical thinking. This capacity for dynamic adaptation is crucial, as manipulators continually refine their methods to bypass conventional detection systems; FactGuard’s agentic design anticipates and counters these changes, promising a robust defense against increasingly sophisticated disinformation campaigns and fostering a more resilient information ecosystem.

Independent assessment by GPT-4o reveals that FactGuard surpasses both Qwen2.5-VL and Fact-R1 in the critical area of reasoning performance. This isn’t merely about identifying factual inaccuracies, but demonstrating a capacity for coherent inference – the ability to connect evidence and arrive at logically sound conclusions. FactGuard doesn’t simply flag misinformation; it exhibits evidence-grounded reasoning, meaning its judgments are directly supported by the information it processes. This advanced capability allows the system to navigate complex narratives and discern subtle manipulations with greater accuracy, representing a significant step forward in the detection of sophisticated disinformation campaigns.

The efficacy of FactGuard’s multi-step verification process is fundamentally reliant on its reinforcement learning component. Studies reveal a substantial performance decline when this element is removed, indicating it’s not merely additive, but integral to the system’s reasoning capabilities. This suggests that identifying and debunking video misinformation demands a dynamic approach, where the agent learns to strategically navigate complex evidence and refine its decision-making process through iterative feedback. Reinforcement learning optimizes the sequence of verification steps, allowing FactGuard to prioritize crucial evidence and avoid being misled by superficial manipulations – a capability significantly diminished without this adaptive learning mechanism. Consequently, the technology’s success in combating sophisticated disinformation hinges on its ability to learn and refine its verification strategies through this critical component.

The development of FactGuard represents a significant step towards restoring confidence in the digital information landscape. By providing a robust system for verifying the authenticity of online videos, this technology aims to equip individuals with the tools necessary to discern truth from falsehood. This isn’t simply about flagging misinformation; it’s about fostering a more informed citizenry capable of making sound judgements based on verified evidence. As online manipulation becomes increasingly sophisticated, FactGuard’s agentic framework offers a proactive defense, empowering users to navigate the complexities of the internet with greater assurance and ultimately, to participate more effectively in a world reliant on digital communication.

This case study demonstrates the effectiveness of FactGuard in a real-world scenario.

The pursuit of verifiable truth, as embodied in FactGuard, aligns with a fundamental principle of logical construction. The agentic framework detailed in the paper isn’t merely about achieving a functional outcome – identifying misinformation – but about establishing a provable process. This echoes Bertrand Russell’s assertion that “The whole problem with the world is that fools and fanatics are so confident in their errors.” FactGuard, through its iterative reasoning and evidence acquisition, attempts to minimize those errors, building a robust defense against deceptive content by prioritizing logical completeness over superficial accuracy. The system’s adaptation to uncertainty isn’t simply a technical feature; it’s an acknowledgement that absolute certainty is often unattainable, and a commitment to refining the probabilistic assessment of truth.

What Lies Ahead?

The architecture presented in FactGuard, while demonstrating a capacity for iterative refinement in the face of deceptive video content, merely scratches the surface of a fundamentally intractable problem. The pursuit of ‘truth’ via algorithmic means exposes a critical limitation: the reliance on externally sourced evidence, itself susceptible to the same biases and manipulations inherent in the initial misinformation. The elegance of the agentic framework lies not in its ability to find truth, but in the consistent application of its boundaries: a provable logic for assessing uncertainty, even when absolute verification remains elusive.

Future work must move beyond the acquisition of evidence and focus on the formalization of evidentiary weight. A mere tally of supporting or refuting claims is insufficient. The system needs a mathematically rigorous method for evaluating the provenance, reliability, and internal consistency of each piece of information. The current reinforcement learning approach, while functional, feels empirically driven. A formal verification of the reward function, ensuring it truly incentivizes accurate assessment rather than simply the accumulation of confirming data, remains a significant challenge.

Ultimately, the problem isn’t one of scaling current multimodal models, but of defining a logically sound epistemology for artificial intelligence. The pursuit of ‘misinformation detection’ may prove to be a category error; perhaps the real objective should be the construction of systems capable of identifying and explicitly quantifying the limits of their own knowledge.


Original article: https://arxiv.org/pdf/2602.22963.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/


2026-02-28 15:01