EXXA is a Google Summer of Code (GSoC) project that uses machine learning to characterize forming and existing exoplanets from both synthetic and observational multi-modal data.
Exoplanets are planets outside of our Solar System. To date, roughly 6,000 exoplanets have been confirmed using a variety of detection methods. Identifying and characterizing exoplanets informs our theories of planet formation and may allow us to test astrobiological hypotheses.
Population of exoplanets as of 2017 (NASA/Ames Research Center/Natalie Batalha/Wendy Stenzel).
A revolution is currently underway in the field of exoplanets. New observatories have created unprecedented opportunities to detect and study these bodies in ways that were previously impossible. Missions such as the Transiting Exoplanet Survey Satellite (TESS) have measured transit signatures of thousands of potential exoplanets. The James Webb Space Telescope (JWST) allows us to measure the composition of exoplanet atmospheres. The Atacama Large Millimeter/submillimeter Array (ALMA) observatory gives us a new view of protoplanetary disks, the sites of planet formation, and allows us to study the environments and results of ongoing planet formation. Together, these and other observatories have given us a wealth of data that will continue to provide discoveries for years to come.
Machine learning has proven to be a powerful tool for analyzing this trove of observational data. Different observatories produce different types of data, including spectra, light curves, and images. Each type of data presents a different set of tasks, challenges, and information, so a variety of machine learning techniques can be applied to a broad set of analysis objectives. Because the ground truth of observations is often unknown, researchers may rely on synthetic data, created through methods such as simulations, to train models on known parameters before deploying them on real datasets.
The purpose of EXXA is to use both synthetic and observational data to perform many different tasks related to exoplanets. EXXA focuses on two main areas: exoplanet atmospheres and protoplanetary disks. There are different tasks for each area. The objectives of the atmosphere projects are mainly to identify chemical species in order to understand properties such as the composition, weather, and potential habitability of planets. Protoplanetary disks are analyzed to identify planets, with intermediate goals including denoising the observations.
The projects have resulted in, e.g., publications and conference talks. Past GSoC projects have focused on
- Diffusion models to denoise disk observations (Faithful Chukwunwogor)
- Foundation models for general disk characterization (Tanmay Singhal)
Upcoming projects will expand on these results and may include additional capabilities, such as planet segmentation and simulation-based inference.
The mentors for these projects are
- Katia Matcheva (University of Alabama)
- Konstantin Matchev (University of Alabama)
- Sergei Gleyzer (University of Alabama)
- Jason Terry (Oxford University)
- Alex Roman (University of Alabama)
- Emilie Panek (University of Alabama)

