The AI Performance Plateau: Why Benchmarks Are Losing Their Edge

A new study reveals that the rapid gains in artificial intelligence are increasingly obscured by benchmark saturation, demanding a rethink of how we measure progress.

Researchers are exploring symbolic reasoning as a powerful alternative to message passing in graph neural networks, offering improved expressiveness and interpretability.

New research shows how subtly crafted comments can mislead AI-powered code review tools, and points toward an effective defense.

A novel evaluation framework assesses the ability of artificial intelligence to perform effective sales research, revealing significant performance variations between leading models.
![The study demonstrates that mean squared error, assessed across twenty trials, decreases consistently with increasing [latex]\log(1/\pi)[/latex] for various graph sizes, with performance distinctions observed between graph convolutional networks (both with and without skip connections) and a multilayer perceptron baseline.](https://arxiv.org/html/2602.17115v1/final_figure2.png)
A new theoretical framework clarifies how semi-supervised learning on graphs can effectively leverage network structure with limited labeled data.
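The models compared in the figure above follow the standard graph-convolution update [latex]H' = \sigma(\hat{A} H W)[/latex], with [latex]\hat{A}[/latex] the symmetrically normalized adjacency. Below is a minimal NumPy sketch of one such layer with an optional skip connection; it illustrates the general technique, not the paper's exact architecture, and the function name and toy graph are our own:

```python
import numpy as np

def gcn_layer(A, H, W, skip=False):
    """One graph-convolution layer: H' = relu(A_norm @ H @ W), where
    A_norm = D^{-1/2} (A + I) D^{-1/2} is the normalized adjacency
    with self-loops. With skip=True, the input features are added back
    (a residual/skip connection, which requires matching dimensions)."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # D^{-1/2}
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    out = np.maximum(A_norm @ H @ W, 0.0)     # linear transform + ReLU
    return out + H if skip else out           # optional skip connection

# Toy example: a 3-node path graph with 2-dimensional node features.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.random.default_rng(0).normal(size=(3, 2))
W = np.eye(2)  # identity weights keep dimensions so the skip connection works
out = gcn_layer(A, H, W, skip=True)
print(out.shape)  # (3, 2): one updated feature vector per node
```

Each node's new features mix its neighbors' features; stacking layers propagates the few available labels through the graph structure, which is why limited labeled data can still suffice.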

Researchers have developed a novel architecture and large-scale dataset to more reliably detect increasingly sophisticated AI-generated video content.

A new dataset and evaluation framework, Conv-FinRe, assesses financial recommendation systems by examining alignment with a user’s long-term goals, rather than simply mirroring their past behavior.

Researchers have created a challenging benchmark to assess how well artificial intelligence can reason with numbers in real-world banking tasks.

A new reinforcement learning framework is proving adept at challenging long-held conjectures in extremal graph theory by constructing explicit counterexamples.
![Modern portfolio optimization, traditionally reliant on mean-variance optimization [latex]MVO[/latex], is shown to be outperformed by a reinforcement learning approach [latex]DRL[/latex] during backtesting, suggesting that algorithms mirroring human adaptability (even with inherent imperfections) can navigate market fluctuations more effectively than static, mathematically idealized models.](https://arxiv.org/html/2602.17098v1/figures/backtest.png)
A new study reveals that deep reinforcement learning delivered superior portfolio performance in backtests compared to traditional mean-variance optimization.
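The mean-variance baseline the study compares against chooses weights maximizing [latex]\mu^\top w - \tfrac{\gamma}{2} w^\top \Sigma w[/latex]; without constraints the optimum is [latex]w^* = \tfrac{1}{\gamma}\Sigma^{-1}\mu[/latex]. Here is a minimal NumPy sketch of that baseline, not the paper's implementation; the rescaling to fully invested weights and the toy numbers are our own assumptions:

```python
import numpy as np

def mean_variance_weights(mu, Sigma, gamma=1.0):
    """Unconstrained mean-variance optimum w* = (1/gamma) * Sigma^{-1} @ mu,
    then rescaled so the weights sum to 1 (a fully-invested portfolio,
    assumed here for illustration)."""
    w = np.linalg.solve(Sigma, mu) / gamma  # solve Sigma w = mu / gamma
    return w / w.sum()                      # normalize to sum to 1

# Toy example: three assets with assumed expected returns and covariances.
mu = np.array([0.05, 0.07, 0.06])
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.05]])
w = mean_variance_weights(mu, Sigma)
print(np.round(w, 3))  # static weights; a DRL agent would instead adapt them over time
```

The key contrast with DRL is that these weights are fixed once [latex]\mu[/latex] and [latex]\Sigma[/latex] are estimated, whereas a reinforcement learning policy can re-weight as market conditions change.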