Probability Functions for Unbiased Statistical Estimations in multi-filter surveys

The greatest advances in the study of galaxy formation and evolution over the last decade have been made possible by systematic extragalactic surveys, both photometric and spectroscopic. However, even though the general trends in galaxy properties are qualitatively established, the particular physical processes causing these trends and their relative roles in galaxy formation are still under debate. To unveil these processes and their roles, we have to quantify in exquisite detail not only the distributions of galaxy properties, but also their intrinsic (physical) dispersions and possible correlations.

The most powerful survey of the next decade to derive precision galaxy distributions is the large-area multi-filter photometric survey J-PAS (Javalambre – Physics of the accelerating universe Astrophysical Survey). With observations in 56 narrow-band filters (~145 Å wide) over 8500 deg², J-PAS will provide R ~ 50 photo-spectra of about 200 million galaxies, leading to a redshift precision of ~1000 km/s and allowing emission-line and stellar-continuum measurements.

However, the statistical strength of J-PAS is also its main challenge: with statistical uncertainties no longer a problem, the systematics of the analysis techniques will dominate the final error budget of our measurements. Usual photometric techniques are prone to known biases, and the resolution (R ~ 50) is too low to successfully apply usual spectroscopic techniques, so new, tailored methodologies are mandatory to extract robust, unbiased, and accurate J-PAS galaxy distributions for the astrophysics of the next decade.

The PROFUSE project uses PRObability Functions for Unbiased Statistical Estimations in multi-filter surveys to address the J-PAS technical challenges. It is developing novel techniques that use posterior probability distribution functions (PDFs) to analyse multi-filter survey data. Although posterior PDFs are recognized as the right way to deal with photometric redshifts, and Bayesian inference is widely used to estimate galaxy properties, current distribution estimators assume galaxies with a fixed redshift (z), luminosity, stellar mass, etc. Given the probabilistic nature of photometric redshifts, however, any galaxy property becomes probabilistic, and thus the posterior PDFs of galaxy properties must be tracked throughout the whole analysis to ensure unbiased posterior galaxy distributions. The PROFUSE project will apply the PDF analysis techniques it develops to luminosity and mass functions, emission lines and stellar populations, the environment, and AGN activity, from the local Universe to the high-redshift regime at z > 2.5, using J-PAS, J-PLUS, and ALHAMBRA data.
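To illustrate why a fixed redshift per galaxy can mislead, the following minimal Python sketch compares two ways of counting galaxies in a redshift bin: counting point estimates, and summing the probability that each PDF places inside the bin. It is not the PROFUSE estimator. The toy PDFs, the use of the PDF mode as the point estimate, and all function names are assumptions made for illustration only.

```python
import math

def toy_pdf(z_grid, peaks):
    """Build a normalised toy PDF(z) from (mean, sigma, weight) Gaussian peaks.

    Real survey PDFs come from a photometric redshift code; these peaks
    are invented for illustration.
    """
    raw = [sum(w * math.exp(-0.5 * ((z - m) / s) ** 2) for m, s, w in peaks)
           for z in z_grid]
    total = sum(raw)
    return [v / total for v in raw]  # probability mass per grid point

def point_estimate_count(pdfs, z_grid, z_min, z_max):
    """Count galaxies whose single best redshift (here: the PDF mode) is in the bin."""
    count = 0
    for pdf in pdfs:
        z_best = z_grid[max(range(len(pdf)), key=pdf.__getitem__)]
        if z_min <= z_best < z_max:
            count += 1
    return count

def pdf_weighted_count(pdfs, z_grid, z_min, z_max):
    """Sum, over all galaxies, the probability mass that falls inside the bin."""
    return sum(p for pdf in pdfs
               for z, p in zip(z_grid, pdf) if z_min <= z < z_max)

z_grid = [i / 100 for i in range(101)]  # z = 0.00 ... 1.00

# Three hypothetical galaxies; the first has a secondary high-redshift peak.
pdfs = [
    toy_pdf(z_grid, [(0.18, 0.03, 1.0), (0.60, 0.05, 0.4)]),
    toy_pdf(z_grid, [(0.25, 0.05, 1.0)]),
    toy_pdf(z_grid, [(0.45, 0.04, 1.0)]),
]

n_point = point_estimate_count(pdfs, z_grid, 0.10, 0.30)
n_pdf = pdf_weighted_count(pdfs, z_grid, 0.10, 0.30)
print(n_point, round(n_pdf, 2))
```

In this toy case the point-estimate count assigns whole galaxies to the bin, while the PDF-weighted count also accounts for the probability that each galaxy actually lies outside it.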

Probability Distribution Functions (PDFs)

Nowadays, photometric redshift codes such as LePhare, EAZY, BPZ, or TPZ routinely provide reliable redshifts, luminosities, stellar masses, etc. Most of these codes compute the merit function χ² by comparing the observational data with a set of templates, either empirical or theoretical. The posterior probability in redshift (z) and spectral template (T) is then defined as PDF(z,T) ∝ P(I) exp(−χ²/2), where P(I) is the prior probability (Benítez00). Until now, photometric surveys have released to the community the best Bayesian photometric redshift (zb), estimated as the median of the PDF, and the uncertainty in this zb, estimated as the redshift range that encloses 68% of the PDF. However, this Gaussian approach is known to lead to biases (Sheth+Rossi10), and the full PDF information should be used instead. The ALHAMBRA and J-PAS surveys will release full redshift-template PDFs, with their associated luminosities and stellar masses. These data sets are the starting point to derive precision galaxy distributions with the PROFUSE techniques.
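The relation PDF(z,T) ∝ P(I) exp(−χ²/2) and the summary statistics described above can be sketched on a grid as follows. This is a minimal illustration and not the implementation of BPZ or any other code. The toy χ² surface, the flat prior, and the function names are assumptions, and the 68% range is taken here as the 16% to 84% quantile interval, which is one common convention.

```python
import math

def posterior_grid(chi2, prior):
    """Normalised PDF(z,T) from chi2[i][j] and prior[i][j] (i: redshift, j: template)."""
    # Subtracting the minimum chi2 avoids numerical underflow and
    # cancels out in the normalisation.
    chi2_min = min(min(row) for row in chi2)
    post = [[p * math.exp(-0.5 * (c - chi2_min)) for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(chi2, prior)]
    total = sum(sum(row) for row in post)
    return [[v / total for v in row] for row in post]

def marginal_z(post):
    """PDF(z): sum the posterior over all templates at each redshift."""
    return [sum(row) for row in post]

def quantile(z_grid, pdf_z, q):
    """Smallest grid redshift at which the cumulative probability reaches q."""
    cumulative = 0.0
    for z, p in zip(z_grid, pdf_z):
        cumulative += p
        if cumulative >= q:
            return z
    return z_grid[-1]

z_grid = [i / 100 for i in range(101)]  # z = 0.00 ... 1.00

# Hypothetical chi2 surface: one template fits best near z = 0.18,
# a second fits somewhat worse near z = 0.60 (z0, width, chi2 offset).
toy_templates = [(0.18, 0.03, 0.0), (0.60, 0.05, 2.0)]
chi2 = [[((z - z0) / s) ** 2 + offset for z0, s, offset in toy_templates]
        for z in z_grid]
prior = [[1.0] * len(toy_templates) for _ in z_grid]  # flat prior, an assumption

post = posterior_grid(chi2, prior)
pdf_z = marginal_z(post)

z_b = quantile(z_grid, pdf_z, 0.50)   # median of PDF(z)
z_lo = quantile(z_grid, pdf_z, 0.16)  # lower edge of the 68% range
z_hi = quantile(z_grid, pdf_z, 0.84)  # upper edge of the 68% range
print(z_b, z_lo, z_hi)
```

In this toy setup the secondary peak pulls the upper edge of the 68% range far from the median, something a single symmetric error bar around zb cannot represent.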

Left panel. Full posterior PDF(z,T) retrieved by BPZ for an ALHAMBRA source (red areas mark higher probabilities). The best Bayesian redshift (zb = 0.18) and spectral type (Tb = Elliptical) are marked with a white dot. Right panel. Redshift projection of PDF(z,T), black solid line. The Gaussian approach (red dashed line), zb = 0.18 ± 0.10, is a poor description of PDF(z). Statistical studies of early spectral types (elliptical and lenticular, red area) and late spectral types (spirals and starbursts, blue area) are possible without sample pre-selection because PDF(z) = PDF(z | T = E/S0) + PDF(z | T = S/SB).
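The decomposition in the caption can be sketched by reading each term as the posterior PDF(z,T) summed over the templates of one spectral class. The Python example below is a toy illustration under that reading. The template list, the peak parameters, and the helper names are hypothetical and do not come from BPZ or ALHAMBRA.

```python
import math

TEMPLATES = ["E", "S0", "S", "SB"]  # hypothetical template labels
EARLY = {"E", "S0"}                 # early types; the rest are late types

# Invented (mean redshift, width, weight) of the posterior peak per template.
PEAKS = {
    "E": (0.18, 0.03, 1.0),
    "S0": (0.20, 0.04, 0.5),
    "S": (0.55, 0.05, 0.4),
    "SB": (0.60, 0.05, 0.2),
}

z_grid = [i / 100 for i in range(101)]  # z = 0.00 ... 1.00

def toy_posterior(z_grid, templates, peaks):
    """Normalised toy PDF(z,T) as post[i][j] (i: redshift, j: template)."""
    raw = [[peaks[t][2] * math.exp(-0.5 * ((z - peaks[t][0]) / peaks[t][1]) ** 2)
            for t in templates] for z in z_grid]
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]

def project(post, templates, keep):
    """Redshift projection of PDF(z,T) restricted to the templates in `keep`."""
    cols = [j for j, t in enumerate(templates) if t in keep]
    return [sum(row[j] for j in cols) for row in post]

post = toy_posterior(z_grid, TEMPLATES, PEAKS)

pdf_z = [sum(row) for row in post]                          # all templates
pdf_early = project(post, TEMPLATES, EARLY)                 # E/S0 contribution
pdf_late = project(post, TEMPLATES, set(TEMPLATES) - EARLY) # S/SB contribution

# Total probability that this source is an early-type galaxy.
p_early = sum(pdf_early)
print(round(p_early, 3))
```

Because the two projections add up to PDF(z) at every redshift, each source contributes to both the early-type and late-type statistics with its own probability, so no hard pre-selection by spectral type is required.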