Author: Denis Avetisyan
New research confirms a theoretical link between the distribution of metaorder sizes and the long-memory patterns observed in financial market order flow.

This study validates the Lillo-Mike-Farmer theory by demonstrating a relationship between metaorder lengths and long-range autocorrelation in market order flow using both synthetic and empirical data.
Long-range correlations in financial markets remain a persistent puzzle, often attributed to hidden order splitting behaviours. This paper, ‘Metaorder modelling and identification from public data’, addresses a key limitation in validating the influential Lillo-Mike-Farmer (LMF) theory – the historical reliance on proprietary datasets. By leveraging recently developed methods for reconstructing synthetic metaorders, we demonstrate empirical validation of the LMF theory using publicly available Johannesburg Stock Exchange (JSE) data, finding consistency with power-law distributed metaorder lengths. Could this approach unlock broader, more reproducible analyses of order flow dynamics across diverse financial markets?
Unveiling the Hidden Architecture of Markets
Conventional market assessments frequently operate on aggregated data, averaging the actions of countless traders and inadvertently smoothing over the subtle, yet critical, behaviors of individuals. This approach risks overlooking the nuanced interplay of informed and uninformed participants, the impact of high-frequency trading strategies, and the propagation of information through order books. By focusing on broad indicators, traditional analysis can miss the ‘microstructure’ of the market – the detailed sequence of trades and order placements that reveals how actual transactions occur. This granular level of observation is increasingly recognized as essential, as it allows for a more accurate depiction of market dynamics and a deeper understanding of price formation processes, ultimately highlighting the limitations of models built on overly simplified assumptions about trader behavior.
Accurate financial modeling and prediction hinge on a deep understanding of order flow – the detailed sequence of buy and sell orders that constitute market activity. Traditional analyses often aggregate this data, obscuring the subtle, yet crucial, behaviors of individual traders and the strategies they employ. By examining how traders actually interact – the size of their orders, the timing of execution, and the patterns of their activity – researchers gain insights into market sentiment, the presence of informed trading, and potential price movements. This granular approach moves beyond simply observing what is being traded to understand how traders are attempting to influence the market, enabling more sophisticated and reliable predictive models that account for the complexities of human behavior within financial ecosystems.
The conventional view of market activity often treats each trade as an isolated event, obscuring the fact that many originate from a single, strategic source. Researchers are increasingly focusing on ‘metaorders’ – defined as a discernible sequence of individual trades emanating from the same trader or algorithmic system – to reveal the hidden architecture of financial markets. By analyzing these metaorders, rather than isolated ticks, patterns emerge that would otherwise remain concealed. This approach allows for the identification of sophisticated trading strategies, the assessment of informed trading activity, and a more accurate estimation of potential market impact. The recognition of metaorders provides a more nuanced understanding of order flow, moving beyond simple volume or price changes to uncover the intentionality driving market movements and offering a powerful tool for predictive modeling.
The examination of metaorders – connected sequences of trades originating from a single source – reveals subtle but significant patterns that suggest the presence of informed trading activity. Researchers have discovered that these patterns often deviate from the behavior of uninformed traders, exhibiting characteristics like larger trade sizes, strategic timing around news events, and a tendency to initiate price movements. By carefully analyzing the characteristics of these metaorders – their volume, speed, and order placement – it becomes possible to infer whether a trader possesses private information and is attempting to profit from it. This insight is crucial, as the collective impact of informed traders’ metaorders can demonstrably influence market prices and volatility, offering a more nuanced understanding of market dynamics than traditional order book analysis alone.

The Persistence of Correlation: A Deeper Look
Analysis of market order flow consistently reveals statistically significant, long-range correlations that deviate from expectations based on random processes. These correlations indicate that patterns in order placement persist over extended timeframes – often spanning days, weeks, or even months – and are not attributable to short-term, independent trading decisions. Specifically, autocorrelation functions calculated from order flow data exhibit decay rates significantly slower than those predicted by models assuming randomness; the observed persistence is quantifiable using metrics like the Hurst exponent, which consistently exceeds 0.5, indicating long-term memory in the data. This suggests a degree of coordinated behavior or information transmission among market participants beyond what is explained by standard market models.
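The slow decay described above is usually read off the sample autocorrelation function of the trade-sign series (+1 for a buy, -1 for a sell). The following is a minimal sketch of that calculation; the function name `sign_autocorrelation` and both test series are illustrative assumptions, not data or code from the study.

```python
import numpy as np

def sign_autocorrelation(signs, max_lag):
    """Sample autocorrelation of a +1/-1 trade-sign series for lags 1..max_lag."""
    x = np.asarray(signs, dtype=float)
    x = x - x.mean()                      # remove any buy/sell imbalance
    var = np.dot(x, x) / len(x)
    return np.array([
        np.dot(x[:-k], x[k:]) / (len(x) * var)
        for k in range(1, max_lag + 1)
    ])

rng = np.random.default_rng(0)

# Independent signs: the autocorrelation should be indistinguishable from zero.
iid_signs = rng.choice([-1, 1], size=20000)

# Artificially persistent signs: each random sign is repeated 50 times in a row.
persistent_signs = np.repeat(rng.choice([-1, 1], size=400), 50)

acf_iid = sign_autocorrelation(iid_signs, 20)
acf_persistent = sign_autocorrelation(persistent_signs, 20)

print("largest |ACF| for independent signs:", np.abs(acf_iid).max())
print("lag-1 and lag-20 ACF for persistent signs:", acf_persistent[0], acf_persistent[19])
```

For real order flow the interesting quantity is how the values fall off with the lag: a straight line on log-log axes indicates power-law decay.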
Empirical analysis of trade data consistently reveals long-range correlations in market order flow, indicating that successive orders are not statistically independent. Specifically, the sign of a market order (buy or sell) remains positively correlated with the signs of orders placed long before it, over horizons far exceeding the expectations of a purely random process. This suggests a degree of coordination, or at least shared response mechanisms, among market participants, as individual traders are not acting in complete isolation. The persistence of these correlations has been documented across multiple asset classes and time scales, reinforcing the conclusion that observed market behavior is driven by factors beyond simple, independent decision-making.
The Lillo-Mike-Farmer (LMF) theory posits that observed long-range correlations in market order flow are largely caused by the practice of order splitting. Institutional traders deliberately divide large orders (metaorders) into numerous smaller child orders to minimize market impact and execution costs. Each child order looks unremarkable on its own, but all child orders of a given metaorder share the same sign, so a long metaorder injects a long run of same-sign orders into the flow. The LMF theory shows that if metaorder lengths follow a power-law distribution, the autocorrelation of order signs also decays as a power law, producing correlations that extend over considerable time horizons and contributing to the non-random behavior observed in market data.
Order splitting, the practice of executing large trades via numerous smaller orders, introduces a significant dynamic between informed traders and market microstructure. Informed traders utilize this technique to minimize market impact and conceal the full extent of their intentions, thereby reducing the price movement caused by their trades. However, the resulting fragmentation of order flow alters the observed characteristics of the market. This impacts price discovery, increases the complexity of order book dynamics, and contributes to the persistence of observed correlations in order flow. The interaction creates a feedback loop where the attempts of informed traders to strategically split orders are, in turn, interpreted by other market participants, influencing their own trading behavior and contributing to the long-range correlations seen in market data.
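The order-splitting mechanism described above can be illustrated with a toy simulation. This is a minimal sketch under simplifying assumptions: metaorders are executed one after another rather than concurrently, each has a random direction, and lengths are drawn from a Pareto distribution with tail exponent `alpha` (so that P(L > l) ~ l^(-alpha)). It is a simplified variant for illustration, not the full LMF model or the study's code, and all names are hypothetical.

```python
import numpy as np

def simulate_split_order_signs(n_metaorders, alpha, rng):
    """Concatenate metaorders with random direction and Pareto-distributed length."""
    # rng.pareto draws Lomax samples; adding 1 gives a classical Pareto with minimum 1.
    lengths = np.floor(rng.pareto(alpha, size=n_metaorders) + 1.0).astype(int)
    directions = rng.choice([-1, 1], size=n_metaorders)
    return np.repeat(directions, lengths), lengths

def sign_autocorrelation(signs, max_lag):
    """Sample autocorrelation of a sign series for lags 1..max_lag."""
    x = np.asarray(signs, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x) / len(x)
    return np.array([
        np.dot(x[:-k], x[k:]) / (len(x) * var)
        for k in range(1, max_lag + 1)
    ])

rng = np.random.default_rng(42)
alpha = 1.5                                   # assumed tail exponent of metaorder lengths
signs, lengths = simulate_split_order_signs(20000, alpha, rng)

lags = np.arange(1, 101)
acf = sign_autocorrelation(signs, 100)

# Estimate the decay exponent from a log-log fit over lags with positive ACF.
positive = acf > 0
gamma_hat = -np.polyfit(np.log(lags[positive]), np.log(acf[positive]), 1)[0]
print(f"alpha = {alpha}, LMF prediction gamma = {alpha - 1:.2f}, fitted gamma = {gamma_hat:.2f}")
```

With heavy-tailed lengths the fitted exponent is noisy and depends on the lag window, so a single run should be read as a qualitative check rather than a precise estimate.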

Modeling Market Realities: A Synthetic Approach
Synthetic metaorders, constructed programmatically to mimic real market activity, serve as a critical component in evaluating financial theories and testing algorithmic strategies. Unlike reliance on historical data, which may be limited or biased, metaorder generation allows researchers to create datasets reflecting specific, controlled conditions and parameter variations. This capability facilitates the isolation of key variables influencing market behavior, enabling rigorous hypothesis testing regarding price discovery, liquidity provision, and the impact of different order types. The use of synthetic data also circumvents issues of regulatory compliance and data privacy associated with accessing and utilizing live market feeds, offering a scalable and reproducible environment for quantitative analysis and model validation.
Simulations of market dynamics can be generated using algorithms, such as those developed by Maitrier et al., which incorporate key factors influencing price formation. These algorithms model intraday volatility by simulating price fluctuations at high frequencies, often employing stochastic processes to represent random price movements. Crucially, these simulations also account for trader distributions, moving beyond the assumption of homogeneous actors; models can represent distributions ranging from uniform to power-law, reflecting observed heterogeneity in trading strategies, order sizes, and information access. This allows researchers to explore how different distributions of trader behavior impact overall market characteristics, including liquidity, price discovery, and the prevalence of specific trading patterns.
Varying assumptions regarding trader behavior within market simulations enables the assessment of their influence on overall market dynamics. Simulations can be initialized with models assuming all traders react identically (homogeneous distribution), serving as a baseline for comparison. More complex models utilize power-law distributions, reflecting the empirically observed concentration of trading volume among a small percentage of participants. By systematically altering these distributional assumptions, researchers can quantify the sensitivity of key market characteristics – such as price volatility, order book depth, and the prevalence of specific trading patterns – to the underlying heterogeneity of trader behavior. This allows for a rigorous evaluation of whether observed market phenomena are robust to changes in the assumed distribution of trader characteristics.
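To make the role of the trader distribution concrete, the sketch below assigns each public trade to a synthetic trader and then reads off metaorders as runs of same-sign trades by the same trader. It is only loosely inspired by the kind of reconstruction attributed to Maitrier et al.; the assignment rule, the metaorder definition, and the names `trader_weights` and `synthetic_metaorder_lengths` are assumptions made for illustration, not the published algorithm.

```python
import numpy as np

def trader_weights(n_traders, kind, exponent=1.0):
    """Participation probabilities: equal for all traders, or power-law in trader rank."""
    if kind == "homogeneous":
        w = np.ones(n_traders)
    elif kind == "power_law":
        w = np.arange(1, n_traders + 1, dtype=float) ** (-exponent)
    else:
        raise ValueError(f"unknown distribution kind: {kind}")
    return w / w.sum()

def synthetic_metaorder_lengths(signs, weights, rng):
    """Assign trades to synthetic traders, then split each trader's trades into same-sign runs."""
    signs = np.asarray(signs)
    trader_ids = rng.choice(len(weights), size=len(signs), p=weights)
    lengths = []
    for trader in np.unique(trader_ids):
        own = signs[trader_ids == trader]            # this trader's trades, in time order
        breaks = np.flatnonzero(np.diff(own) != 0) + 1
        edges = np.concatenate(([0], breaks, [len(own)]))
        lengths.extend(np.diff(edges))               # run lengths = metaorder lengths
    return np.array(lengths)

rng = np.random.default_rng(7)
trade_signs = rng.choice([-1, 1], size=10000)        # placeholder for real public trade signs

homogeneous = synthetic_metaorder_lengths(trade_signs, trader_weights(50, "homogeneous"), rng)
power_law = synthetic_metaorder_lengths(trade_signs, trader_weights(50, "power_law"), rng)

print("metaorders (homogeneous traders):", len(homogeneous))
print("metaorders (power-law traders):  ", len(power_law))
```

The placeholder signs here are independent, so the two runs differ little; the comparison becomes informative only when the input is a real, persistent sign series.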
Agent-Based Modeling (ABM) and Reaction-Diffusion Simulation represent distinct but synergistic approaches to modeling complex systems. ABM focuses on simulating the actions and interactions of autonomous agents, with emergent system-level behavior arising from these micro-level interactions. In contrast, Reaction-Diffusion Simulation utilizes partial differential equations to model the spatial and temporal dynamics of interacting fields, representing concentrations of signaling molecules or information. While ABM excels at representing heterogeneous agent behavior and complex decision-making processes, Reaction-Diffusion Simulation is well-suited for capturing phenomena driven by diffusion and local interactions. Combining these frameworks allows for a more comprehensive understanding of emergent properties, as ABM can inform the parameters and boundary conditions of Reaction-Diffusion models, and the spatial patterns generated by the latter can, in turn, influence agent behavior within the ABM framework.

Validating Models and Quantifying Market Structure
The determination of appropriate distributions for metaorder lengths and trade volumes relies on statistical techniques such as the Clauset, Shalizi, and Newman (CSN) method. This approach involves evaluating the goodness-of-fit of various theoretical distributions – including power law, exponential, and log-normal – to the observed data. The CSN method utilizes maximum likelihood estimation to identify the distribution that best explains the data, while also providing statistical tests to compare the likelihoods and rigorously assess whether a power-law distribution is a significantly better fit than alternative distributions. This process involves calculating the Kolmogorov-Smirnov (KS) statistic and employing a likelihood ratio test to quantify the evidence supporting a power-law model; the p-value generated from these tests determines the statistical significance of the fit.
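The two core ingredients of this procedure, the maximum-likelihood exponent and the KS distance, can be sketched in a few lines. The example assumes continuous data and a known lower cutoff `x_min`, and it reports the tail exponent alpha defined by P(X > x) ~ x^(-alpha); the density exponent in the CSN paper's notation is this value plus one. The function names are hypothetical, and the bootstrap p-value and likelihood-ratio comparisons are omitted.

```python
import numpy as np

def fit_tail_exponent(x, x_min):
    """Continuous maximum-likelihood estimate of the tail exponent above x_min."""
    tail = np.asarray(x, dtype=float)
    tail = tail[tail >= x_min]
    alpha_hat = len(tail) / np.sum(np.log(tail / x_min))
    return alpha_hat, tail

def ks_distance(tail, x_min, alpha_hat):
    """Largest gap between the empirical CDF and the fitted power-law CDF."""
    s = np.sort(tail)
    n = len(s)
    model_cdf = 1.0 - (s / x_min) ** (-alpha_hat)
    ecdf_hi = np.arange(1, n + 1) / n     # empirical CDF just after each point
    ecdf_lo = np.arange(0, n) / n         # empirical CDF just before each point
    return max(np.max(np.abs(ecdf_hi - model_cdf)),
               np.max(np.abs(model_cdf - ecdf_lo)))

rng = np.random.default_rng(3)
true_alpha = 1.5
sample = rng.pareto(true_alpha, size=50000) + 1.0    # synthetic Pareto data, minimum 1

alpha_hat, tail = fit_tail_exponent(sample, x_min=1.0)
distance = ks_distance(tail, 1.0, alpha_hat)
print(f"estimated tail exponent: {alpha_hat:.3f}, KS distance: {distance:.4f}")
```

In the full method, `x_min` is itself chosen by minimizing the KS distance over candidate cutoffs. Integer-valued data such as metaorder lengths call for the discrete form of the estimator.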
The observation of power-law distributions in empirical market data provides support for the Lillo-Mike-Farmer (LMF) theory. Specifically, the prevalence of these distributions, where a relatively small number of events account for a disproportionately large share of the total activity, indicates that market behavior may be scale-free. Scale-free systems lack a characteristic scale, implying that similar statistical patterns appear whether the data are examined at small or large scales. This contrasts with systems governed by normal distributions, which cluster around a typical value and have rapidly decaying tails. The identification of power-law behavior in metrics such as metaorder lengths and trade volumes suggests that market dynamics are driven by mechanisms that do not rely on a central, typical value, but rather on infrequent, large-scale events.
The data utilized in this research originates from the Johannesburg Stock Exchange (JSE), accessed through the BMLL Data Lab Platform. This platform provides tick-by-tick order book data, encompassing all trading activity across multiple instruments and a multi-year historical period. Specifically, the dataset includes timestamps, price, size, and order type for every trade and limit order, enabling a granular analysis of market microstructure. The JSE was selected due to its established market depth and liquidity, providing a robust dataset for validating the proposed models. Data from the BMLL platform underwent standard pre-processing, including error checking and data cleaning, to ensure accuracy and consistency before being used in simulations and statistical analyses.
Synthetic data generated through simulations consistently reproduces the Square-Root Law, which quantifies the relationship between order size and resulting price impact. Specifically, the law states that price impact is proportional to the square root of the order size: larger orders induce larger price movements, but at a decreasing rate. This relationship, expressed as \text{Price Impact} \propto \sqrt{\text{Order Size}}, was observed across multiple simulation runs and parameter configurations. The consistency of this pattern within the synthetic data provides a validating benchmark for assessing the realism of the model and its alignment with observed market dynamics.
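One common way to check a square-root relationship is to regress log impact on log order size and see whether the slope is close to 0.5. The sketch below does this on made-up data that is constructed to follow the law with multiplicative noise; the prefactor, noise level, and size range are arbitrary assumptions, so the example shows the fitting step only and says nothing about the study's results.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical metaorder sizes and impacts; impact is built as c * sqrt(size) * noise.
order_size = rng.uniform(1e2, 1e5, size=5000)
impact = 0.01 * np.sqrt(order_size) * np.exp(rng.normal(0.0, 0.2, size=5000))

# A power law impact = c * size**delta is a straight line in log-log coordinates,
# so the slope of the fitted line estimates the exponent delta.
slope, intercept = np.polyfit(np.log(order_size), np.log(impact), 1)
print(f"fitted impact exponent: {slope:.3f} (square-root law predicts 0.5)")
```

On real or simulated metaorders, the same regression would be applied to measured impacts, typically after averaging within size bins to tame the noise.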
Analysis of empirical data confirms a key relationship predicted by the Lillo-Mike-Farmer (LMF) theory: the decay exponent of trade sign autocorrelations (γ) is directly linked to the power-law exponent of metaorder lengths (α) via the equation \gamma = \alpha - 1. This finding establishes a quantitative connection between the temporal correlations of trade signs and the statistical properties of metaorder lengths. Specifically, the observed values of γ and α consistently satisfy this relationship, providing strong validation for the LMF theory’s predictions regarding the underlying stochastic processes governing market microstructure. This linkage suggests a fundamental connection between the dynamics of order flow and the way large orders are split and executed.
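A heuristic argument shows where the LMF relation comes from. It is a sketch under simplifying assumptions rather than the full derivation: metaorders are taken to execute one after another with independent random signs, and α is taken to be the tail exponent of the length distribution, so that P(L > l) ~ l^(-α) with α > 1. Two trades separated by a lag τ then have correlated signs only if they belong to the same metaorder.

```latex
% Assumed convention: metaorder lengths have density exponent 1 + \alpha.
P(L) \sim L^{-(1+\alpha)}, \qquad \alpha > 1

% A metaorder of length L contains (L - \tau) pairs of trades at lag \tau,
% so the sign autocorrelation scales as
C(\tau) \;\propto\; \sum_{L > \tau} (L - \tau)\, P(L)
        \;\approx\; \int_{\tau}^{\infty} (L - \tau)\, L^{-(1+\alpha)}\, dL
        \;=\; \frac{\tau^{-(\alpha - 1)}}{\alpha(\alpha - 1)}

% Hence the decay exponent of the autocorrelation is
C(\tau) \sim \tau^{-\gamma}, \qquad \gamma = \alpha - 1

% Worked example: \alpha = 1.5 gives \gamma = 0.5, and the standard long-memory
% relation H = 1 - \gamma/2 then gives a Hurst exponent H = 0.75 > 0.5.
```

The value α = 1.5 is chosen only for illustration and is not a result reported in the study.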

The study meticulously dissects market order flow, revealing patterns of long-memory processes through metaorder analysis. It strips away superfluous complexity to illuminate the underlying power-law distribution governing metaorder lengths. This echoes a sentiment articulated by Marcus Aurelius: “Choose not to be troubled by what others do, but by what you have not done.” The research doesn’t dwell on speculative market noise, but focuses on what has been done: the demonstrable correlation between metaorder lengths and autocorrelation functions. By prioritizing observable data and mathematical rigor, the paper achieves a clarity that transcends the inherent complexity of financial markets, aligning with the principle of paring away the unnecessary to reveal fundamental truths.
What Remains?
The validation of Long-Memory processes through metaorder analysis, while satisfying in its internal consistency, merely reframes the essential question. A system that requires identifying ‘metaorders’ to explain its behavior has already conceded a fundamental failure of simplicity. The observed power-law distributions, even when reliably reproduced from public data, are descriptive, not generative. One does not explain turbulence by cataloging eddies. The persistent focus on correlation functions, however elegantly modelled, remains a symptom-chasing exercise.
Future work will undoubtedly refine the identification algorithms, seeking ever-smaller deviations from theoretical predictions. A more fruitful, though considerably more difficult, path lies in the search for underlying mechanisms. What, fundamentally, compels market order flow to exhibit these long-range dependencies? Is it an emergent property of collective behavior, or a consequence of unacknowledged, systemic biases? The elegance of LMF theory should not be mistaken for an explanation.
Ultimately, clarity is courtesy. A complete theory will not require the construction of increasingly complex mathematical scaffolding to describe observed phenomena. It will simply be. The pursuit of ever-more-detailed models, divorced from mechanistic understanding, risks becoming an exercise in aesthetic complexity, a baroque ornamentation upon an empty core.
Original article: https://arxiv.org/pdf/2602.19590.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/
2026-02-25 04:42