A machine-learning method for identifying multiwavelength counterparts of submillimeter galaxies

Submillimeter-luminous galaxies (SMGs) are massive, dust-enshrouded systems that are forming stars at rates 100 to 1,000 times higher than the Milky Way's current rate. These extreme star formation rates enable them to build up the stellar mass of a massive galaxy within just ~100 Myr, whereas most normal star-forming galaxies, such as the Milky Way, have grown over a much more extended period (a few billion years). Because of their potentially rapid formation, SMGs have been proposed as the progenitors of spheroidal galaxies in the local Universe. They are also thought to be linked to quasi-stellar object (QSO) activity, due to the similarity of their redshift distribution to that of luminous QSOs. These characteristics mean that SMGs may represent an important stage in the formation and evolution of massive galaxies, and hence are a key population for understanding how galaxies in the Universe form and evolve.

However, the main challenge for follow-up studies of sources selected from submillimeter surveys is the coarse angular resolution of the single-dish maps, which results in uncertain identifications of the counterparts at other observed frequencies. To address this, the CEA astronomers developed a machine-learning method to identify the likely multiwavelength counterparts to submillimeter sources. The method uses a training set of precisely identified SMGs from follow-up mapping of the SCUBA-2 Cosmology Legacy Survey's UKIDSS-UDS field with ALMA, the high-resolution (sub)millimetre instrument in Chile. The resulting radio+machine-learning method successfully recovers ~85 percent of the ALMA-identified SMGs that are detected in at least three bands from the ultraviolet to the radio, as shown in the figure below.

The robustness of the developed method is confirmed by several independent tests. In future, we will apply this method to samples drawn from panoramic single-dish submillimeter surveys which currently lack interferometric follow-up observations, to address science questions which can only be tackled with large, statistical samples of SMGs.

The self-test results of our radio+machine-learning method

This figure shows the self-test results of our radio+machine-learning method. The blue squares are the counterparts of SMGs classified by machine learning. On its own, the machine-learning classification recovers 75% of the ALMA-identified SMGs (red points). Including the radio identifications (green circles) raises the completeness of the method to 85%.
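
To make the completeness figures concrete, here is a minimal Python sketch of how such a recovery fraction could be computed: the share of ALMA-identified SMGs that appear in the union of the machine-learning and radio identifications. The function name, the source IDs and the counts are all hypothetical. The toy numbers were picked only so that the fractions come out at 75% and 85%; they are not the catalogue used in the paper, and this is not the authors' code.

```python
def completeness(alma_ids, *identified_sets):
    """Fraction of ALMA-identified SMGs recovered by the union of the
    given identification sets (hypothetical helper, for illustration)."""
    alma = set(alma_ids)
    if not alma:
        raise ValueError("need at least one ALMA-identified source")
    # Only identifications that match an ALMA source count as recovered.
    recovered = set().union(*identified_sets) & alma
    return len(recovered) / len(alma)


# Toy catalogue of 20 made-up source IDs (NOT real data).
alma_smgs = range(20)
# Machine-learning picks: 15 true counterparts plus one spurious ID (99).
ml_ids = set(range(15)) | {99}
# Radio picks: overlap with the ML picks and add two new sources (15, 16).
radio_ids = set(range(13, 17))

ml_only = completeness(alma_smgs, ml_ids)
combined = completeness(alma_smgs, ml_ids, radio_ids)

print(f"machine learning only: {ml_only:.0%}")   # 75%
print(f"radio + machine learning: {combined:.0%}")  # 85%
```

Because the two identification sets overlap, the combined completeness is that of their union rather than the sum of the individual fractions, which is why adding radio identifications raises the total by less than the radio recovery rate on its own.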

Link to the original research paper: https://arxiv.org/abs/1806.06859

Contacts from CEA, Durham:

Fangxia An

Stuart Stach

Ian Smail

Mark Swinbank

Julie Wardlow

Elizabeth Cooke

Bitten Gullberg