Faster Image Generation with Entropy-Guided Sparsity
![The method optimizes image generation via tri-dimensional entropy-aware sparsity, beginning with a scale-level analysis to compute a low-entropy ratio [latex]\rho_s[/latex] and establish a pruning threshold [latex]\tau[/latex], then proceeding to a layer-level decomposition using Singular Value Decomposition on entropy maps to distinguish global and detail layers, and finally applying entropy-based gating [latex]p_{prune}[/latex] at the token level to selectively remove low-salience elements while preserving important regions, all dynamically adjusted with scale to enhance computational efficiency.](https://arxiv.org/html/2602.22948v1/2602.22948v1/x4.png)
A new framework intelligently reduces computational load in image generation models by focusing on the most semantically important details.
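As a rough illustration of the figure's pipeline, here is a minimal Python sketch of entropy-gated token pruning: compute per-token entropy, gate tokens against a threshold [latex]\tau[/latex], and report the low-entropy ratio [latex]\rho_s[/latex]. The entropy source (attention rows) and the gating rule are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def token_entropy(attn: np.ndarray) -> np.ndarray:
    """Shannon entropy of each token's attention distribution.
    attn: (num_tokens, num_tokens) row-stochastic attention map."""
    p = np.clip(attn, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=-1)

def prune_tokens(tokens: np.ndarray, attn: np.ndarray, tau: float):
    """Keep tokens whose entropy exceeds tau; low-entropy (low-salience)
    tokens are dropped. Returns kept tokens, the keep mask, and the
    low-entropy ratio rho_s for this scale."""
    H = token_entropy(attn)
    keep = H > tau                # hypothetical entropy-based gate
    rho_s = 1.0 - keep.mean()     # fraction of low-entropy tokens
    return tokens[keep], keep, rho_s

# Toy example: 8 tokens with a random attention map.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 16))
attn = rng.random((8, 8))
attn /= attn.sum(axis=-1, keepdims=True)
kept, mask, rho_s = prune_tokens(tokens, attn, tau=2.0)
print(f"kept {mask.sum()}/8 tokens, low-entropy ratio rho_s = {rho_s:.2f}")
```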

A new framework improves the efficiency of wireless communication systems by intelligently filtering input data and refining the underlying neural network architecture.

Researchers are now leveraging vast online datasets to build more effective cybersecurity training models.
![A generalized planning pipeline leverages learned transition models to navigate symbolic states, encoding state-goal pairs into fixed-dimensional embeddings (using either graph kernels or factored vectors) and predicting successor embeddings via parametric (LSTM) or non-parametric (XGBoost) models; this enables the selection of executable actions by matching predicted embeddings to valid symbolic successors, guaranteeing both symbolic validity and generalization beyond the training data, with [latex]\Delta_{t}[/latex] representing residual state transitions.](https://arxiv.org/html/2602.23148v1/2602.23148v1/figures/state-centric-arch.png)
A new approach to artificial intelligence focuses on learning to predict how environments change, rather than focusing on the actions that cause those changes, leading to more adaptable and efficient planning.
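The figure's action-selection step can be sketched concretely: encode the (state, goal) pair, predict the successor embedding, and execute the action whose symbolic successor best matches the prediction. A minimal Python version follows, with a factored binary encoder and a stand-in predictor where the paper would use graph kernels or factored vectors and LSTM/XGBoost models.

```python
import numpy as np

def encode(state: frozenset, goal: frozenset, vocab: list) -> np.ndarray:
    """Factored binary embedding of a (state, goal) pair over a fixed
    proposition vocabulary (a stand-in for the paper's encoders)."""
    return np.array([float(p in state) for p in vocab] +
                    [float(p in goal) for p in vocab])

def select_action(state, goal, actions, vocab, predict):
    """Pick the executable action whose symbolic successor is closest to
    the predicted successor embedding, so the chosen transition is always
    symbolically valid."""
    target = predict(encode(state, goal, vocab))   # predicted next embedding
    def dist(a):
        succ = (state - a["del"]) | a["add"]       # valid symbolic successor
        return np.linalg.norm(encode(succ, goal, vocab) - target)
    return min(actions, key=dist)

# Toy domain with two propositions; the goal is to hold "b".
vocab = ["a", "b"]
state, goal = frozenset({"a"}), frozenset({"b"})
actions = [{"name": "noop", "add": frozenset(), "del": frozenset()},
           {"name": "swap", "add": frozenset({"b"}), "del": frozenset({"a"})}]
# Hypothetical learned predictor: here it simply outputs the goal embedding.
predict = lambda z: encode(goal, goal, vocab)
print(select_action(state, goal, actions, vocab, predict)["name"])  # swap
```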

A new deep learning approach automatically analyzes 3D CT scans, focusing on key organ regions to enhance the accuracy of renal cancer malignancy prediction.
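The "focusing on key organ regions" here suggests ROI-restricted processing; below is a minimal sketch of one plausible preprocessing step, assuming an organ mask is available. This is illustrative only, as the paper's actual region-focusing mechanism is not described in the summary.

```python
import numpy as np

def crop_to_roi(volume: np.ndarray, mask: np.ndarray, margin: int = 4):
    """Crop a 3D CT volume to the bounding box of an organ mask, plus a
    safety margin (hypothetical preprocessing; the paper's actual
    region-focusing mechanism may differ)."""
    zs, ys, xs = np.nonzero(mask)
    lo = np.maximum(np.array([zs.min(), ys.min(), xs.min()]) - margin, 0)
    hi = np.minimum(np.array([zs.max(), ys.max(), xs.max()]) + margin + 1,
                    volume.shape)
    return volume[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]

# Toy volume with a small "kidney" region marked in the mask.
vol = np.random.rand(32, 64, 64)
mask = np.zeros(vol.shape, dtype=bool)
mask[10:14, 20:30, 20:30] = True
roi = crop_to_roi(vol, mask)
print(vol.shape, "->", roi.shape)  # (32, 64, 64) -> (12, 18, 18)
```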
![A fine-tuned Whisper model facilitates both speech-to-text transcription and the identification of synthetically generated words, with special tokens [latex]\langle TOF \rangle[/latex] and [latex]\langle EOF \rangle[/latex] demarcating the boundaries of these artificial lexical units.](https://arxiv.org/html/2602.22658v1/2602.22658v1/x1.png)
Researchers are leveraging speech recognition technology to identify synthetically generated words within audio recordings.
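Given a transcript in which the fine-tuned model brackets synthetic words with boundary tokens, locating them reduces to span extraction. A minimal sketch follows; the token spellings `<TOF>`/`<EOF>` and the transcript are illustrative placeholders.

```python
import re

# Transcript from a fine-tuned ASR model that wraps synthetic words in
# special boundary tokens (token spellings here are assumptions).
transcript = "the meeting is <TOF> blorvax <EOF> at noon <TOF> quimzel <EOF>"

def extract_fake_words(text: str):
    """Return the synthetic words demarcated by <TOF> ... <EOF>,
    plus the clean transcript with markers removed."""
    fakes = re.findall(r"<TOF>\s*(.*?)\s*<EOF>", text)
    clean = re.sub(r"\s*<TOF>\s*|\s*<EOF>\s*", " ", text).strip()
    return fakes, re.sub(r"\s+", " ", clean)

fakes, clean = extract_fake_words(transcript)
print(fakes)   # ['blorvax', 'quimzel']
print(clean)   # 'the meeting is blorvax at noon quimzel'
```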
![Through a three-part study examining human and machine approaches to graph comparison, the research demonstrates that multimodal large language models (MLLMs) align more closely with human judgment than traditional computational measures, a result substantiated by both perceptual alignment and the models’ ability to provide interpretable reasoning for their assessments, ultimately positioning them as effective tools for assisting human analysts in this complex task; the study is organized around three research questions ([latex]RQ_1[/latex], [latex]RQ_2[/latex], [latex]RQ_3[/latex]) covering indirect similarity measurements, pairwise comparisons using sixteen distinct computational methods, and a relative evaluation of MLLM capabilities.](https://arxiv.org/html/2602.22416v1/2602.22416v1/figs/teaser.png)
New research benchmarks how well AI models can judge the similarity of graph visualizations, mirroring human visual perception.
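A benchmark of this kind typically scores how well each similarity measure orders graph pairs the way humans do, for example via rank correlation. Here is a minimal sketch with toy ratings; the numbers and the use of Spearman correlation are assumptions, not the paper's protocol.

```python
from scipy.stats import spearmanr

# Toy data: human similarity ratings for 6 graph pairs, plus scores
# from a computational measure and an MLLM (values are illustrative).
human  = [0.9, 0.2, 0.7, 0.4, 0.8, 0.1]
metric = [0.6, 0.5, 0.4, 0.5, 0.7, 0.3]    # e.g. a graph-edit-distance proxy
mllm   = [0.85, 0.25, 0.65, 0.45, 0.75, 0.15]

# Alignment with human judgment as rank correlation: a higher
# coefficient means the scorer orders graph pairs more like people do.
for name, scores in [("computational", metric), ("MLLM", mllm)]:
    rho, p = spearmanr(human, scores)
    print(f"{name}: Spearman rho = {rho:.2f} (p = {p:.3f})")
```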

A new approach to artificial intelligence focuses on building agents that actively seek out and retain knowledge, enhancing their performance over time without traditional parameter updates.
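A minimal sketch of the core idea, retaining knowledge in an external store rather than in model weights: the toy agent below writes observations to memory and retrieves them by word overlap, where a real system would use learned retrieval.

```python
from collections import Counter

class MemoryAgent:
    """Agent that accumulates knowledge in an external store instead of
    updating model parameters (a toy stand-in for the framework's memory)."""
    def __init__(self):
        self.memory: list[str] = []

    def observe(self, fact: str) -> None:
        self.memory.append(fact)          # retain knowledge across episodes

    def recall(self, query: str, k: int = 2) -> list[str]:
        """Return the k stored facts sharing the most words with the query
        (bag-of-words overlap; a real agent would use dense retrieval)."""
        q = Counter(query.lower().split())
        score = lambda f: sum((q & Counter(f.lower().split())).values())
        return sorted(self.memory, key=score, reverse=True)[:k]

agent = MemoryAgent()
agent.observe("the red key opens the north door")
agent.observe("the lever lowers the bridge")
print(agent.recall("which key opens the door"))
```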

Researchers have created a benchmark to rigorously evaluate how well AI agents can maintain context and reason over extended interactions.
![A method embeds content-related information into images via quantized multi-scale tokens, created with a VQ-VAE and constrained by the watermark capacity [latex]|h| \leq |m|[/latex], enabling recovery of deepfakes even after malicious manipulations like object removal or inpainting; decoding the watermarked image extracts the hidden tokens and generates a deepfake localization map [latex]M_{loc}[/latex].](https://arxiv.org/html/2602.22759v1/2602.22759v1/picture/workflow.png)
Researchers have developed a novel watermarking technique that embeds a multi-scale ‘fingerprint’ within images, enabling both deepfake detection and faithful recovery of original content.
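The localization step can be sketched as a token-level comparison: grid cells where the tokens re-extracted from the image disagree with the embedded watermark tokens are flagged in [latex]M_{loc}[/latex]. A minimal version follows, with the VQ-VAE encode/decode elided and the token grids assumed as inputs.

```python
import numpy as np

def localization_map(hidden_tokens: np.ndarray,
                     observed_tokens: np.ndarray) -> np.ndarray:
    """Deepfake localization map M_loc: mark grid cells where the tokens
    recovered from the image disagree with the embedded watermark tokens
    (the VQ-VAE that produces the token grids is assumed, not shown)."""
    return (hidden_tokens != observed_tokens).astype(np.uint8)

# Toy 8x8 token grids: a manipulation overwrote a 3x3 patch.
rng = np.random.default_rng(1)
hidden = rng.integers(0, 512, size=(8, 8))         # tokens embedded at write time
observed = hidden.copy()
observed[2:5, 3:6] = rng.integers(0, 512, (3, 3))  # inpainted region
M_loc = localization_map(hidden, observed)
print(M_loc.sum(), "of 64 cells flagged as manipulated")
```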