Using Skew Normal Distributions for Asymmetric Uncertainties¶
Many exoplanet parameters often have asymmetric errorbars, expressed as something like $$\mu^{+\sigma_{upper}}_{-\sigma_{lower}}$$ where $\mu$ is some estimate of central tendency, and the $\sigma$s are the uncertainties. In exoatlas
, we follow Pineda et al. (2021) to use skew normal distributions to approximate the probability distributions of table quantities with unequal upper and lower uncertainties.
Creating a Samples with a Skew Normal¶
For some quantity $\mu^{+\sigma_{upper}}_{-\sigma_{lower}}$, we can make samples from a skew normal that treats $\mu$ as the mode of the distribution and $\mu - \sigma_{lower}$ to $\mu + \sigma_{upper}$ as the central $68\%$ confidence interval. Behind the scenes, this interpolates from a table of skew normal coefficients depending on the ratio $\sigma_{upper}/\sigma_{lower}$, generates some samples, and renormalizes them to the requested $\mu$, $\sigma_{upper}$, and $\sigma_{lower}$.
from exoatlas.populations.pineda_skew import *
To see how this looks for different levels of asymmetry, you can run this test function, which shows the generated samples on a violin plot.
from exoatlas.tests import test_skew
test_skew()
Understanding the Skew Normal¶
The skew normal depends on a location $\mu$, a scale $\sigma$, and an asymmetry parameter $\alpha$, which can be positive or negative.
plot_skewnormal(mu=0, sigma=1, alpha=5)
Setting Up Interpolation Tables¶
Pineda et al. provide code to numerically solve for the parameters of a skew normal distribution, given a set of asymmetric error bars. Since this takes a little time and can occasionally be a little unstable, for exoatlas
we derive a table of coefficients and interpolate from it based on how asymmetric the errors are using $\sigma_{upper}/\sigma_{lower}$.
Here we derive those tables, assuming iterating a few times, using a smoothed version of the previous results as guesses to the subsequent iterations. We derive tables assuming $\mu$ represents either the mode
or the median
of the distribution, and that the range of $\mu - \sigma_{lower}$ to $\mu + \sigma_{upper}$ represents the central $68\%$ confidence interval. We save the resulting table into the code, so most folks will never actually need to run this code directly.
tables = make_skewnormal_parameters_to_interpolate()
saved parameter interpolating table to skewnormal_parameters_for_interpolating_mode.ecsv
saved parameter interpolating table to skewnormal_parameters_for_interpolating_median.ecsv
Practically, we use the mode (= "Peak") because it's impossible to achieve errorbars more than about 50% asymmetric if we treat $\mu$ as representing the median.