Spotting the Stitch: How Well Do Deepfake Detectors See Localized Edits?

Author: Denis Avetisyan


As AI image generation becomes more sophisticated, existing detection methods are being challenged by subtle, localized manipulations like inpainting.

The DINOv2 model demonstrates successful image inpainting (reconstructing missing regions based on context) in certain scenarios, as evidenced by accurate reconstructions despite masking, yet struggles with more complex or ambiguous cases, highlighting the limitations of the current approach.

This review assesses how well current synthetic image detectors identify small-scale image alterations, revealing limitations in detecting finely tuned inpainting techniques.

Despite advances in generative AI detection, current methods often struggle with nuanced image manipulations that go beyond fully synthetic creations. This work, ‘Detecting Localized Deepfakes: How Well Do Synthetic Image Detectors Handle Inpainting?’, systematically evaluates the performance of state-of-the-art deepfake detectors on the challenge of identifying localized inpainting, a common technique for subtle image alteration. Our findings demonstrate that models trained on broad datasets exhibit partial transferability, reliably detecting medium- to large-scale inpainting but struggling with smaller, more refined edits. As increasingly realistic localized manipulations become prevalent, how can we enhance detection methods to address these emerging threats to image authenticity?


The Shifting Landscape of Visual Authenticity

The landscape of visual content is undergoing a dramatic shift, fueled by advancements in generative artificial intelligence. Models like Adobe’s Firefly and open-source Diffusion Models are no longer limited to producing simple images; they now synthesize remarkably realistic and detailed visuals from textual descriptions. These models function by learning the underlying patterns and structures within vast datasets of images, enabling them to generate novel content that closely mimics photographs, paintings, and other artistic styles. The speed of innovation is particularly noteworthy; each successive generation of these models exhibits a significant leap in quality, resolution, and creative control, blurring the lines between human and machine creation. This capability extends beyond simple image fabrication, allowing for complex editing, inpainting, and the seamless blending of different visual elements, promising a future where generating bespoke imagery is readily accessible to all.

The escalating creation of photorealistic imagery through artificial intelligence presents a significant challenge to discerning authentic content from synthetic creations. As AI-generated visuals become increasingly pervasive, the need for reliable AI-Generated Content Detection systems grows critical to mitigate the spread of misinformation and protect the integrity of information ecosystems. These detection methods aren’t simply about identifying ‘fakes’ – they are essential for upholding trust in visual media, safeguarding intellectual property, and ensuring accountability in a world where anyone can potentially fabricate convincing imagery. Robust detection capabilities are therefore becoming indispensable tools for journalists, social media platforms, and individuals alike, fostering a more informed and trustworthy digital landscape.

Current methods for identifying AI-generated images face significant hurdles in maintaining reliable performance. The very techniques used to create these images are constantly being refined; as generative models become more sophisticated, detection algorithms trained on older outputs quickly become obsolete. This challenge is further compounded by standard image processing procedures like JPEG compression, which introduce artifacts that can mimic the subtle patterns left by AI generation, leading to false positives. Consequently, a detector effective on one dataset or against a specific model often fails when applied to images created with a different technique or even a slightly altered compression setting, highlighting the need for detection systems that are robust to both evolving AI and common image manipulations.
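One practical way to probe the compression issue is to re-score the same image after round-tripping it through different JPEG quality settings and watch how the detector's output drifts. The sketch below assumes a hypothetical `score_image` callable standing in for any detector that maps an image to a "synthetic" probability; it illustrates the evaluation idea rather than a method from the paper.

```python
# Minimal sketch: probing a detector's sensitivity to JPEG re-compression.
# `score_image` is a hypothetical stand-in for any detector that maps a PIL
# image to a "synthetic" probability; swap in a real model's forward pass.
import io
from PIL import Image

def jpeg_recompress(img: Image.Image, quality: int) -> Image.Image:
    """Round-trip an image through JPEG at the given quality setting."""
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf)

def compression_sweep(img, score_image, qualities=(95, 85, 70, 50)):
    """Detector scores for the original image and its JPEG variants."""
    scores = {"original": score_image(img)}
    for q in qualities:
        scores[f"jpeg_q{q}"] = score_image(jpeg_recompress(img, q))
    return scores
```

A detector whose score collapses at moderate quality settings is unlikely to survive images shared through social media pipelines, which recompress aggressively.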

Representative generative models exhibit similar distributions of masked-area percentages; groupings indicate models that share identical per-bin image counts.

Self-Supervised Learning: A Pathway to Robust Feature Extraction

Self-Supervised Learning (SSL) offers a training paradigm for vision transformers that circumvents the need for large, manually labeled datasets. Instead of relying on external annotations, SSL algorithms generate pseudo-labels from the data itself, enabling the model to learn representations by solving pretext tasks such as predicting image rotations or color distortions. Models like DINOv2 and DINOv3 utilize this approach, specifically employing knowledge distillation techniques to train a student network to match the output of a teacher network, both trained on unlabeled image data. This allows for the creation of robust feature extractors capable of generalization without the limitations and costs associated with traditional supervised learning methods that require extensive human annotation.
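As a rough illustration of the self-distillation signal described above, the sketch below trains a student to match a momentum-averaged teacher on two views of the same unlabeled input. The projection head, temperatures, and momentum value are illustrative placeholders; the actual DINOv2/DINOv3 recipes add multi-crop augmentation, output centering, and careful schedules on top of this core idea.

```python
# Minimal sketch of DINO-style self-distillation on unlabeled data.
# A tiny projection head stands in for a full vision transformer.
import copy
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temp_s=0.1, temp_t=0.04):
    """Cross-entropy between the sharpened teacher and student distributions."""
    teacher_probs = F.softmax(teacher_logits.detach() / temp_t, dim=-1)
    student_logprobs = F.log_softmax(student_logits / temp_s, dim=-1)
    return -(teacher_probs * student_logprobs).sum(dim=-1).mean()

@torch.no_grad()
def ema_update(teacher, student, momentum=0.996):
    """Teacher weights track the student as an exponential moving average."""
    for pt, ps in zip(teacher.parameters(), student.parameters()):
        pt.data.mul_(momentum).add_(ps.data, alpha=1 - momentum)

# Toy usage with two augmented "views" of the same batch of inputs.
student = torch.nn.Sequential(torch.nn.Linear(384, 256), torch.nn.GELU(),
                              torch.nn.Linear(256, 64))
teacher = copy.deepcopy(student)
view_a, view_b = torch.randn(8, 384), torch.randn(8, 384)
loss = distillation_loss(student(view_a), teacher(view_b))
loss.backward()
ema_update(teacher, student)
```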

Self-supervised learning models, such as DINOv2 and DINOv3, extract features from unlabeled image data by creating pretext tasks – for example, predicting image rotations or color distortions – which forces the network to learn robust representations of visual content. These learned features are transferable because they capture fundamental image characteristics independent of specific labeled categories, allowing the model to generalize effectively to downstream tasks like AI-generated content detection. The richness of these features enables the discrimination of subtle artifacts and inconsistencies often present in AI-generated images, which may not be readily apparent to models trained solely on labeled data. Consequently, the models can more accurately distinguish between authentic images and those produced by generative algorithms.
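A common way to exploit these frozen features for detection is a linear probe: extract the backbone's global embedding and train only a small classifier on real-versus-synthetic labels. The sketch below uses the publicly released DINOv2 ViT-L/14 torch.hub entry point; the probe, learning rate, and training loop are illustrative and not necessarily the configuration evaluated in this work.

```python
# Minimal sketch: frozen DINOv2 backbone + linear probe for real-vs-synthetic.
import torch
import torch.nn as nn

backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vitl14")
backbone.eval()                       # features stay frozen
for p in backbone.parameters():
    p.requires_grad_(False)

probe = nn.Linear(1024, 1)            # ViT-L/14 global embedding is 1024-d
optimizer = torch.optim.AdamW(probe.parameters(), lr=1e-3)
criterion = nn.BCEWithLogitsLoss()

def train_step(images, labels):
    """images: (B, 3, 224, 224) normalized batch; labels: 1 = synthetic."""
    with torch.no_grad():
        feats = backbone(images)      # (B, 1024) global embedding
    logits = probe(feats).squeeze(-1)
    loss = criterion(logits, labels.float())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because only the probe is trained, this setup directly measures how much of the real/synthetic signal is already present in the self-supervised features.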

AI-Generated Content Detection systems benefit from a standardized evaluation framework facilitated by the integration of DINOv2 and DINOv3 models with the AI-GenBench benchmark. This combination allows for quantitative assessment of detection performance using metrics such as Area Under the Receiver Operating Characteristic curve (AUROC). Specifically, the DINOv3 ViT-L/16 model has demonstrated a performance level of 0.86 AUROC when evaluated on the BR-Gen dataset, providing a concrete point of reference for comparing different detection methodologies and tracking advancements in the field.
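For reference, AUROC is computed from per-image detector scores and binary ground-truth labels; 0.5 corresponds to chance and 1.0 to perfect separation. A minimal example with made-up scores:

```python
# Minimal sketch of the AUROC metric used for benchmark comparisons.
import numpy as np
from sklearn.metrics import roc_auc_score

labels = np.array([0, 0, 1, 1, 1, 0, 1, 0])                   # 1 = AI-generated
scores = np.array([0.1, 0.4, 0.8, 0.7, 0.9, 0.3, 0.6, 0.2])   # detector outputs

print(f"AUROC: {roc_auc_score(labels, scores):.2f}")
```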

Different image inpainting generators produce varying results, as shown in the examples.

Validating Detection Through Diverse Dataset Analysis

The TGIF Dataset, TGIF-2 Dataset, and BR-Gen Dataset are established resources for quantitatively assessing AI-Generated Content Detection methods. These datasets are specifically designed to provide a standardized basis for comparison across different detection algorithms and techniques. The TGIF and TGIF-2 datasets offer a range of real-world image manipulations, while the BR-Gen dataset focuses on systematically generated content with controlled variations in manipulation type and extent. Utilizing these datasets allows researchers to objectively measure detector performance, identify strengths and weaknesses, and track progress in the field of AI-generated content detection.

The TGIF, TGIF-2, and BR-Gen datasets are constructed with diverse image manipulations to specifically test the robustness of AI-generated content detection algorithms. These manipulations encompass image inpainting, where portions of an image are reconstructed; localized editing, involving targeted alterations to specific regions; and full regeneration, where substantial portions or the entirety of an image are recreated. The inclusion of these varied techniques presents a challenge to detection methods, requiring them to identify subtle artifacts or inconsistencies introduced during the generative process, rather than relying on easily detectable global features. Datasets with a wide range of manipulation types are crucial for evaluating a detector’s ability to generalize beyond specific attack scenarios.
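When such datasets ship the binary masks used for inpainting, the masked-area percentage used to stratify results follows directly from the mask itself. A minimal sketch, assuming masks are stored as grayscale images with nonzero pixels marking the edited region (the file path is illustrative):

```python
# Minimal sketch: masked-area percentage and a size bin from an inpainting mask.
import numpy as np
from PIL import Image

def masked_area_percent(mask_path: str) -> float:
    """Fraction of nonzero (edited) pixels in a grayscale mask, as a percentage."""
    mask = np.array(Image.open(mask_path).convert("L"))
    return 100.0 * np.count_nonzero(mask) / mask.size

def size_bin(percent: float) -> str:
    if percent < 20:
        return "small (<20%)"
    if percent < 50:
        return "medium (20-50%)"
    return "large (>50%)"

pct = masked_area_percent("masks/sample_0001.png")   # illustrative path
print(size_bin(pct), f"{pct:.1f}% of the image was regenerated")
```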

Current state-of-the-art pre-trained detectors, specifically DINOv3 ViT-L/16 and DINOv2 ViT-L/14, achieve an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.86 and 0.83 respectively when evaluated on the BR-Gen dataset, indicating strong performance in identifying localized inpainting. Detection efficacy is correlated with the size of the masked region, with AUROC scores consistently exceeding 0.65 for areas larger than 50% of the image. However, performance significantly degrades when assessing fully regenerated images, particularly those with small masked regions – less than 20% – where detection accuracy falls below 0.5, suggesting a limitation in identifying comprehensive generative manipulations.
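The size dependence reported above can be reproduced by grouping scores into mask-size bins and computing AUROC within each bin. The sketch below assumes each record already carries a bin label, with real images assigned to the bin of their manipulated counterpart so every bin contains both classes; the numbers in the usage example are made up.

```python
# Minimal sketch: AUROC stratified by masked-area bin.
from collections import defaultdict
from sklearn.metrics import roc_auc_score

def auroc_per_bin(records):
    """records: iterable of (score, label, bin_name); label 1 = manipulated."""
    grouped = defaultdict(lambda: ([], []))
    for score, label, bin_name in records:
        scores, labels = grouped[bin_name]
        scores.append(score)
        labels.append(label)
    return {name: roc_auc_score(labels, scores)
            for name, (scores, labels) in grouped.items()
            if len(set(labels)) == 2}   # AUROC requires both classes present

example = [
    (0.91, 1, ">50%"), (0.12, 0, ">50%"),
    (0.48, 1, "<20%"), (0.45, 0, "<20%"),
]
print(auroc_per_bin(example))
```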

Increasing the size of masked image regions correlates with greater detectability of resulting artifacts.

Charting the Course: Enhancing Resilience and Adaptability

As generative models, particularly those leveraging techniques like Flux Models for inpainting, achieve ever-increasing realism, the challenge of distinguishing between authentic and synthetic content intensifies. These advanced inpainting methods excel at seamlessly reconstructing missing or altered portions of images, making even subtle manipulations incredibly difficult to detect with traditional approaches. Consequently, the pursuit of more sophisticated detection methods becomes paramount; these must move beyond pixel-level comparisons and instead focus on identifying statistical anomalies or inconsistencies in the underlying data distribution that betray the presence of AI-generated content. Future research will likely involve developing detectors capable of analyzing higher-order features, contextual relationships, and even the “fingerprints” left by specific generative algorithms, in a continuous arms race against increasingly convincing forgeries.

Detection models, often trained on specific generative techniques, frequently struggle when faced with novel AI-generated content. Transfer learning offers a powerful solution by leveraging knowledge gained from previously encountered techniques and applying it to new, unseen methods. This process involves fine-tuning a pre-trained detection model with a limited dataset from the novel generator, significantly accelerating learning and improving generalization ability. Rather than requiring extensive retraining from scratch, transfer learning allows models to adapt quickly, maintaining a higher degree of accuracy and robustness against the ever-evolving landscape of AI-generated content. This adaptability is crucial, as generative models continue to advance, producing increasingly realistic and sophisticated outputs that challenge existing detection methods.
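In practice this often amounts to fine-tuning an existing detector on a small labeled set from the new generator, updating the backbone gently and the classification head more aggressively. The sketch below assumes a hypothetical pretrained detector module whose head parameters contain "head" in their names; it illustrates one common transfer-learning recipe rather than the specific procedure used here.

```python
# Minimal sketch: fine-tuning a pretrained detector on a small set from a
# previously unseen generator, with a lower learning rate for the backbone.
import torch

def make_finetune_optimizer(model, head_name="head", lr_head=1e-3, lr_backbone=1e-5):
    """Give the classification head a larger learning rate than the backbone."""
    head_params, backbone_params = [], []
    for name, param in model.named_parameters():
        (head_params if head_name in name else backbone_params).append(param)
    return torch.optim.AdamW([
        {"params": backbone_params, "lr": lr_backbone},
        {"params": head_params, "lr": lr_head},
    ])

def finetune(model, loader, epochs=3):
    """A few passes over the small dataset from the novel generator."""
    optimizer = make_finetune_optimizer(model)
    criterion = torch.nn.BCEWithLogitsLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            loss = criterion(model(images).squeeze(-1), labels.float())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```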

Successfully countering the evolving threat of AI-generated content demands a multi-faceted defense, extending beyond singular advancements in detection. A truly robust system requires the concurrent development of three core components: feature extraction capable of identifying subtle inconsistencies indicative of manipulation, training on extraordinarily diverse datasets that encompass a wide range of generative techniques and content styles, and the implementation of adaptive learning strategies. These strategies allow detection models to continuously refine their understanding of realistic versus synthetic content, effectively bridging the gap as generative AI becomes increasingly sophisticated. This holistic approach isn’t merely about reacting to new threats, but proactively building a resilient framework capable of anticipating and neutralizing future advancements in AI-driven content creation.

The study’s findings illuminate a critical nuance in deepfake detection – a sensitivity to scale. Detectors, while robust against extensive manipulations, falter when confronted with subtle inpainting. This echoes Geoffrey Hinton’s observation: “The problem with deep learning is that it scales badly.” The research demonstrates this principle directly; as the scale of alteration diminishes, the efficacy of current detection methods erodes. Beauty scales; clutter doesn’t – and in this context, ‘clutter’ represents the noise introduced by imperceptible edits that confound even sophisticated algorithms. Refactoring detection methods – editing, not rebuilding – becomes essential to address this limitation and refine the pursuit of accurate image forensics.

What’s Next?

The current landscape of synthetic content detection reveals a curious asymmetry. Detectors shine when confronted with broad strokes – the obvious graft, the wholesale fabrication. But the subtle whisper of inpainting, the pixel-level adjustments meant to mimic plausible imperfection, often slips through unnoticed. This isn’t a failure of technology, perhaps, but a testament to the deceptive power of nuance. Detection systems strain when asked to distinguish between genuine degradation and meticulously crafted illusion.

Future work must move beyond simply identifying that something is altered, and instead address how it has been altered. A focus on the fingerprints left by specific inpainting algorithms – the harmonic distortions in frequency space, the telltale patterns of noise – could provide a more robust signal. Transfer learning offers a path, yet the current reliance on generalized detectors suggests a need for more specialized models, trained on the specific signatures of localized manipulation.
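One hand-crafted stand-in for such a frequency-space fingerprint is the fraction of spectral energy above a radial cutoff, compared inside and outside a suspected edit. The statistic below is purely illustrative; real systems would learn generator-specific signatures rather than rely on a single hand-picked feature.

```python
# Minimal sketch: high-frequency energy ratio of an image patch via the FFT.
import numpy as np

def high_frequency_ratio(patch: np.ndarray, cutoff: float = 0.25) -> float:
    """patch: 2-D grayscale array; cutoff is a fraction of the Nyquist radius."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(patch))) ** 2
    h, w = patch.shape
    yy, xx = np.mgrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)
    high = spectrum[radius > cutoff].sum()
    return float(high / spectrum.sum())

# Comparing this ratio inside and outside a suspected inpainted region could
# expose a spectral mismatch left behind by the generator.
```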

Ultimately, the pursuit of perfect detection feels increasingly like chasing a phantom. As generative models become more sophisticated, the very notion of “real” and “fake” blurs. The true challenge lies not in building impenetrable defenses, but in fostering a critical awareness – a cultivated skepticism that recognizes the inherent fallibility of any digital image, and understands that every detail, however insignificant, holds a potential story.


Original article: https://arxiv.org/pdf/2512.16688.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/


2025-12-20 09:59