Seeing Double: AI Spots Hidden Galaxy Mergers at Cosmic Noon

Author: Denis Avetisyan


A new deep learning approach dramatically improves our ability to identify galaxy mergers, even faint and distant ones, revealing a more complete picture of galactic evolution.

The network accurately distinguishes merging galaxies based on merger mass ratio and stellar mass, focusing on the central galaxy during prediction, and reveals a clear correlation between these properties and both stellar mass and star formation rate as visualized through dimensionality reduction.

Convolutional neural networks trained on the IllustrisTNG simulation demonstrate superior performance in classifying mergers, including low-mass galaxies and those at high redshifts.

Identifying galaxy mergers – critical events in galactic evolution – remains challenging, particularly at high redshifts and for low-mass systems. This limitation motivates the work ‘Toward Complete Merger Identification at Cosmic Noon with Deep Learning’, which presents a deep learning approach trained on realistic mock observations from the IllustrisTNG50 simulation. The resulting model demonstrates, for the first time, the successful identification of even minor and low-mass mergers at $1 < z < 1.5$ with an overall accuracy of 73%. Can this methodology, with further refinement of feature extraction, ultimately unlock a more complete census of mergers and illuminate the processes driving galaxy assembly at cosmic noon?


The Echo of Collisions: Identifying Galactic Mergers

The evolution of galaxies is fundamentally shaped by interactions and, crucially, mergers with other galaxies. Determining which galaxies have undergone these transformative events, however, presents a significant challenge. Historically, astronomers have relied on visual inspection of telescope images to identify merging galaxies, a process that demands considerable time and expertise. This manual approach is not only slow, limiting the study of vast astronomical datasets, but also inherently subjective; different observers may arrive at different conclusions when assessing the same images. This subjectivity introduces uncertainty into the analysis and hinders the ability to establish robust statistical trends in galaxy evolution. Consequently, a need exists for more efficient and objective methods to reliably identify galaxy mergers and unlock a comprehensive understanding of how these cosmic structures assemble and change over time.

Automated galaxy merger identification faces significant hurdles when confronted with the intricacies of astronomical imagery. While techniques like non-parametric methods and machine learning algorithms – including Random Forest and Linear Discriminant Analysis – offer potential for efficiency, their performance is often compromised by the inherent complexity of real-world data. These methods frequently depend on extracting relatively simple features from galaxy images, which proves inadequate for capturing the subtle distortions and overlapping light distributions characteristic of merging systems. This limitation is particularly pronounced when analyzing galaxies at greater distances – higher redshifts – where images are fainter and more distorted, making accurate classification exceedingly difficult. Consequently, current automated approaches often struggle to distinguish true mergers from chance alignments or other complex galactic interactions, hindering efforts to comprehensively study galaxy evolution through mergers.

Current automated techniques for identifying galaxy mergers frequently falter due to an over-reliance on basic morphological features – characteristics like size, brightness, and smoothness – which prove insufficient when analyzing the subtle distortions caused by merging galaxies. This limitation is particularly pronounced when observing galaxies at high redshifts, where their light has been significantly stretched and dimmed by the expansion of the universe. The resulting faint and blurred images obscure the finer details crucial for accurate merger classification, causing these methods to misidentify merging systems or overlook them entirely. Consequently, the statistical understanding of galaxy assembly, which relies on a complete census of mergers, remains incomplete, necessitating the development of more sophisticated feature extraction and classification algorithms capable of discerning faint signals from cosmic noise and accurately characterizing these distant interactions.
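The stretching and dimming of light at high redshift can be put in numbers. The sketch below assumes two standard relations that the article does not spell out: observed wavelength scales as $(1+z)$, and bolometric surface brightness falls as $(1+z)^{-4}$ (Tolman dimming). The function names and the 500 nm rest wavelength are illustrative choices, not taken from the paper.

```python
import math

def observed_wavelength(rest_wavelength_nm: float, z: float) -> float:
    """Wavelength at which rest-frame light arrives for a source at redshift z."""
    return rest_wavelength_nm * (1.0 + z)

def surface_brightness_dimming_mag(z: float) -> float:
    """Bolometric surface-brightness dimming in magnitudes, from the (1+z)^-4 law."""
    return 2.5 * math.log10((1.0 + z) ** 4)

# The redshift range studied in the article is 1 < z < 1.5.
for z in (1.0, 1.5):
    wavelength = observed_wavelength(500.0, z)
    dimming = surface_brightness_dimming_mag(z)
    print(f"z={z}: 500 nm is observed at {wavelength:.0f} nm, dimming {dimming:.2f} mag")
```

Under these assumptions, faint extended features such as tidal tails lose roughly three to four magnitudes of surface brightness across the redshift range considered, which is why they are so easily lost in the noise.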

The current limitations in identifying galaxy mergers necessitate the development of more sophisticated analytical techniques. Existing methods, while useful, often falter when confronted with the intricacies of astronomical images, especially those capturing galaxies at vast distances – hindering a complete understanding of how galaxies grow and evolve. A truly robust approach demands not only increased accuracy in classifying mergers, but also the capacity to process the immense datasets generated by modern telescopes. Such a scalable system would allow astronomers to move beyond small-scale studies and construct a comprehensive picture of galaxy assembly, revealing the relative frequency of mergers across cosmic time and their impact on galactic properties like shape, size, and star formation rates. Ultimately, overcoming this analytical bottleneck promises to unlock a wealth of information about the universe’s history and the processes that have shaped the galaxies within it.

Simulating Creation: A Foundation of Labeled Data

The training dataset is built from mock galaxy images generated with the IllustrisTNG cosmological simulation. IllustrisTNG is a large-scale, high-resolution cosmological simulation that models the formation and evolution of galaxies and their constituent components. From the simulation’s outputs, the authors created a substantial corpus of synthetic galaxy observations whose merger histories are known, which provides labeled data for training machine learning algorithms. The simulation tracks stellar populations, gas distribution, and dark matter halos, all crucial factors in determining a galaxy’s observable characteristics. This approach yields a training set significantly larger and more controlled than is currently available from purely observational sources, enabling robust model development and validation.

The SKIRT radiative transfer code was implemented to increase the fidelity of simulated galaxy images by accurately modeling the interaction of light with dust and Active Galactic Nuclei (AGN). SKIRT calculates the propagation of photons through the simulated galaxy, accounting for absorption, scattering, and emission by dust grains. This process realistically simulates the obscuration of starlight by dust, particularly within galactic disks and near the AGN. By incorporating these effects, the generated images exhibit more accurate spectral energy distributions and morphological features, better reflecting the appearance of real galaxies as observed through telescopes. The code handles complex geometries and dust distributions, allowing for a detailed representation of radiative transfer processes within each simulated galaxy.

The simulated galaxy images were convolved with a Point Spread Function (PSF) using the Tiny Tim software package to accurately represent the blurring effect introduced by telescope optics. This process simulates how a telescope’s resolution limits the observed details of distant galaxies. Tiny Tim calculates the PSF based on parameters specific to the Hubble Space Telescope, including wavelength and detector characteristics. By applying this PSF to the mock images, the simulation data more closely matches the appearance of real astronomical observations, enabling more effective training of machine learning algorithms designed to analyze telescope data.
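To illustrate what PSF convolution does to an image, here is a minimal pure-Python sketch. It uses a small Gaussian kernel as a stand-in for a real PSF; the actual pipeline uses Tiny Tim models of the Hubble Space Telescope, which are not reproduced here. The function names, kernel size, and sigma are all hypothetical.

```python
import math

def gaussian_psf(size: int, sigma: float) -> list:
    """Normalized 2D Gaussian kernel (an assumed stand-in for a telescope PSF)."""
    half = size // 2
    kernel = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
               for x in range(size)] for y in range(size)]
    total = sum(map(sum, kernel))
    return [[value / total for value in row] for row in kernel]

def convolve(image: list, kernel: list) -> list:
    """Direct 2D convolution with zero padding; output has the image's shape."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    half = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(k):
                for i in range(k):
                    yy, xx = y + half - j, x + half - i
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[j][i]
            out[y][x] = acc
    return out

# A single bright pixel (an idealized point source) in a 9x9 frame.
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 1.0
blurred = convolve(image, gaussian_psf(size=5, sigma=1.0))
```

The point source spreads over neighboring pixels while its total flux is preserved, because the kernel is normalized to sum to one. A production pipeline would normally use an FFT-based convolution from a numerical library rather than nested loops.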

The final stage of training data generation involved rendering the simulated galaxy images through a set of filters mirroring those used in the CANDELS survey. This process created three-channel images – representing data from the blue, green, and red filters – mimicking the color information captured by astronomical telescopes. Using CANDELS filters ensured that the generated images had spectral characteristics closely aligned with real observed galaxy images, which supports more effective model training and generalization to observational data. The three-channel format also matches the input expected by common machine learning architectures designed for image analysis.

Decoding the Collisions: A Deep Learning Approach

The ResNet18 convolutional neural network architecture was selected for its computational efficiency and established performance in image classification tasks. To mitigate the need for extensive training from random initialization, the network’s weights were pre-initialized using the parameters from the Zoobot2.0 model, which was previously trained on a large dataset of galaxy images. This transfer learning approach significantly reduced training time and improved model convergence by leveraging existing feature detectors learned by Zoobot2.0, allowing the network to more rapidly adapt to the specific task of merger classification. The ResNet18 architecture consists of 18 layers, utilizing residual connections to facilitate gradient flow during training and enable the learning of deeper representations.

The ResNet18 network was trained using a dataset of synthetically generated mock images depicting galaxy merger events. These images were labeled to differentiate between major mergers – defined as events involving galaxies of comparable mass – and minor mergers, where a smaller galaxy merges with a significantly larger one. The training process utilized supervised learning, with the network adjusting its internal parameters to minimize the error between its predicted merger classification and the ground truth labels associated with each mock image. The resulting model aims to accurately categorize observed galaxy pairs based on the characteristics indicative of either a major or minor merger event, enabling statistical analysis of merger rates and properties.
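The major/minor distinction can be expressed as a simple rule on the stellar mass ratio. The sketch below is an illustration only: the 1:4 boundary between major and minor mergers is a commonly used convention assumed here, not a value stated in the article, and the 1:10 floor reflects the smallest mass ratio the article reports. The function names and label strings are hypothetical.

```python
def mass_ratio(mass_a: float, mass_b: float) -> float:
    """Mass ratio mu = smaller / larger, so that 0 < mu <= 1."""
    if mass_a <= 0 or mass_b <= 0:
        raise ValueError("stellar masses must be positive")
    smaller, larger = sorted((mass_a, mass_b))
    return smaller / larger

def merger_label(mass_a: float, mass_b: float,
                 major_threshold: float = 0.25,  # assumed 1:4 convention
                 minor_floor: float = 0.1) -> str:  # 1:10, as quoted in the article
    """Label a galaxy pair by mass ratio; the thresholds are assumptions."""
    mu = mass_ratio(mass_a, mass_b)
    if mu >= major_threshold:
        return "major"
    if mu >= minor_floor:
        return "minor"
    return "below_floor"

# Masses in solar masses; the order of the two arguments does not matter.
print(merger_label(1e10, 5e9))  # ratio 1:2
print(merger_label(2e9, 1e10))  # ratio 1:5
print(merger_label(1e10, 1e8))  # ratio 1:100
```

In a simulation, the masses come directly from the merger tree, which is what makes ground-truth labels of this kind available for every mock image.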

Gradient-weighted Class Activation Mapping (Grad-CAM) was implemented to provide visual explanations for the ResNet18 network’s merger classification decisions. This technique calculates the gradients of the target class with respect to the final convolutional layer’s feature maps, effectively identifying which regions of the input image most influence the network’s prediction. These gradients are then used to weight the feature maps, which are subsequently combined to produce a coarse localization map highlighting the pixels most relevant to the classification. The resulting heatmaps visually demonstrate which features – such as tidal tails, bridges, or disturbed morphologies – the network utilizes to distinguish between Major and Minor Merger events, offering insight into the model’s learned representations.
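The weighting step described above can be sketched in a few lines. Each feature map $A^k$ receives a weight $\alpha_k$ equal to the spatial average of the gradient of the class score with respect to that map, and the heatmap is $\mathrm{ReLU}(\sum_k \alpha_k A^k)$. This toy version takes the feature maps and gradients as plain nested lists; in a real network both would come from a forward and backward pass through an autodiff framework. The function name and the example numbers are invented for illustration.

```python
def grad_cam(feature_maps: list, gradients: list) -> list:
    """Coarse Grad-CAM heatmap from K feature maps and their gradients (each H x W)."""
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    # alpha_k: average the gradient of the class score over each map's pixels.
    alphas = [sum(map(sum, grad)) / (h * w) for grad in gradients]
    # Weighted sum of the maps, then ReLU to keep only positive evidence.
    return [[max(0.0, sum(a * fmap[y][x] for a, fmap in zip(alphas, feature_maps)))
             for x in range(w)] for y in range(h)]

# Two toy 2x2 feature maps: the first supports the class, the second opposes it.
feature_maps = [[[1.0, 0.0], [0.0, 2.0]],
                [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]],
             [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(feature_maps, gradients)
print(heatmap)
```

Only pixels where the supporting map is active remain non-zero; pixels dominated by the opposing map are clipped to zero by the ReLU. In practice the coarse map is then upsampled to the input image size and overlaid on the galaxy image.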

Dimensionality reduction using Uniform Manifold Approximation and Projection (UMAP) was applied to the high-dimensional latent space generated by the penultimate layer of the trained ResNet18 network. This process reduced the dimensionality to two dimensions for visualization purposes, allowing for the identification of emergent patterns in the data. Analysis of the resulting UMAP embedding revealed the formation of discrete clusters, with distinct groupings demonstrably correlated with the labeled merger types – Major Merger and Minor Merger – as well as intrinsic galaxy properties such as stellar mass and morphology. These clusters indicate the network effectively learned a representation where galaxies undergoing similar merger events or possessing comparable characteristics are positioned closer to each other in the latent space.

Echoes from the Past: Unveiling Merger Insights

A ResNet18 neural network trained on simulated galaxy merger data shows a noteworthy ability to identify merging galaxies at high redshifts. With an accuracy of approximately 73%, the network distinguishes merging systems from single galaxies, even amid the complexities of distant astronomical observations. This performance marks a substantial advance in automated merger identification and offers a powerful tool for analyzing the evolution of galaxies at cosmic noon. The network allows researchers to survey large datasets efficiently, uncovering merger events that would be difficult to detect with traditional methods and ultimately helping to refine cosmological models.
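A single accuracy figure hides how errors are split between missed mergers and false alarms. The sketch below shows how accuracy, precision (purity), and recall (completeness) follow from confusion-matrix counts. The counts are made up purely for illustration and are not results from the paper; the function name is hypothetical.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, precision, and recall from binary confusion-matrix counts.

    Assumes at least one predicted positive and one true positive class member,
    so that neither denominator is zero.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # fraction of all galaxies classified correctly
        "precision": tp / (tp + fp),     # purity: predicted mergers that are real
        "recall": tp / (tp + fn),        # completeness: real mergers that are found
    }

# Illustrative counts only: 50 true mergers and 50 non-mergers.
metrics = classification_metrics(tp=40, fp=15, tn=35, fn=10)
print(metrics)
```

For a merger census, completeness and purity matter as much as accuracy, since a merger fraction estimated from the classifier's output is biased by both missed mergers and contaminating non-mergers.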

This research demonstrates a significant advancement in identifying galactic mergers, particularly among distant, high-redshift galaxies. The trained convolutional neural network successfully detects mergers involving galaxies as small as $10^8$ solar masses, and even those where the merging galaxies have a mass ratio of 1:10 – meaning one galaxy is a tenth the mass of the other. This represents a substantial improvement over prior studies, which typically focused on more massive galaxies exceeding $10^9$ solar masses. By extending merger identification to these lower-mass regimes, the network opens new avenues for understanding how smaller galaxies contribute to the growth of larger structures, offering crucial insights into the processes of accretion and disruption that shape galactic populations across cosmic time.

The newly trained ResNet18 network’s ability to accurately identify mergers in high-redshift galaxies represents a significant step forward in observational cosmology, particularly when considered alongside existing research. Previous studies employing similar methodologies have largely focused on more massive galaxies – those with stellar masses exceeding $10^9$ solar masses – and at lower redshifts, specifically between 0.1 and 1. This work demonstrates a comparable level of performance in identifying mergers, even when applied to the more distant and typically fainter high-redshift population, suggesting the developed network effectively captures the morphological signatures of merging galaxies across a broader cosmic timescale and stellar mass range. This consistency validates the network’s approach and provides a crucial benchmark for future investigations into the role of mergers in galaxy evolution.

Investigating the specific star formation rate (sSFR) within merging galaxies offers a crucial window into how these dramatic cosmic collisions fuel or suppress stellar birth. Analyses reveal that mergers don’t always result in an immediate burst of star formation; instead, the sSFR exhibits a complex relationship with the merger stage and the masses of the colliding galaxies. Initially, the gravitational disruption caused by the merger can compress gas clouds, triggering intense star formation and raising the sSFR. However, feedback processes, such as supernovae and active galactic nuclei, can subsequently quench star formation, potentially lowering the sSFR. By quantifying these changes in sSFR throughout the merger process, researchers aim to unravel the interplay between mergers and the evolution of galaxies, providing valuable insights into the formation of the structures observed in the universe today.
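For reference, the sSFR is simply the star formation rate divided by the stellar mass, which gives a quantity in units of inverse years. The short sketch below computes it; the function names and the example values are illustrative and not taken from the paper.

```python
import math

def specific_sfr(sfr_msun_per_yr: float, stellar_mass_msun: float) -> float:
    """Specific star formation rate in 1/yr: SFR divided by stellar mass."""
    if stellar_mass_msun <= 0:
        raise ValueError("stellar mass must be positive")
    return sfr_msun_per_yr / stellar_mass_msun

def log_specific_sfr(sfr_msun_per_yr: float, stellar_mass_msun: float) -> float:
    """log10 of the sSFR, the form usually quoted; requires a positive SFR."""
    return math.log10(specific_sfr(sfr_msun_per_yr, stellar_mass_msun))

# Example: a galaxy of 1e9 solar masses forming one solar mass of stars per year.
ssfr = specific_sfr(1.0, 1e9)
print(ssfr, log_specific_sfr(1.0, 1e9))
```

Because the sSFR is normalized by mass, it allows low-mass and high-mass galaxies to be compared on the same footing, which matters when the sample extends down to $10^8$ solar masses.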

The pursuit of identifying galaxy mergers, as detailed in this work, reveals a fundamental challenge in theoretical cosmology: the limitations of any predictive model when confronted with extreme gravitational regimes. Current quantum gravity theories suggest that inside the event horizon spacetime ceases to have classical structure, echoing a sentiment captured by Richard Feynman: “The first principle is that you must not fool yourself – and you are the easiest person to fool.” The network’s ability to discern mergers, particularly at cosmic noon and among low-mass galaxies, pushes the boundaries of existing observational capabilities. However, the reliance on simulations like IllustrisTNG – while mathematically rigorous – highlights the inherent difficulty in definitively verifying such findings, a humbling reminder that even the most sophisticated models are subject to the constraints of our understanding and observational access.

What Lies Beyond the Horizon?

The demonstrated capacity to identify galaxy mergers – even those faint echoes at cosmic noon – using convolutional neural networks trained on simulations offers a momentary respite from the inherent limitations of observation. However, the fidelity of any such identification rests entirely upon the accuracy of the underlying simulation – the IllustrisTNG model, in this instance. To mistake the map for the territory is a perennial hazard, and the potential for systematic error embedded within the training data remains a significant, largely unquantifiable, concern. Any claim of identifying previously unseen mergers thus carries an implicit acknowledgment of reliance on a constructed reality.

Future work must address the transferability of these methods to actual observational data. The subtle distortions introduced by radiative transfer in the simulation, while skillfully modeled, are still approximations of a chaotic process. Furthermore, dimensionality reduction through techniques like UMAP, while computationally efficient, inevitably discards information. The crucial question is whether the discarded information contains genuine merger signatures.

Ultimately, the success of this approach, like all attempts to decipher the universe, is provisional. A more complete understanding requires not simply improved algorithms or larger simulations, but a willingness to confront the inherent unknowability at the heart of cosmological inquiry. The event horizon is not merely a boundary in spacetime; it is a mirror reflecting the limits of human comprehension.


Original article: https://arxiv.org/pdf/2511.15006.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/

2025-11-20 14:15