Uncertainties
Since most exoplanet properties are derived from measurements, most have uncertainties. The reliability of a visualization or calculation depends crucially on at least a qualitative understanding of the uncertainties associated with each quantity. To help with this, exoatlas provides an interface both to uncertainties reported in the original archive tables and to propagated uncertainties estimated for calculated quantities.
import exoatlas as ea
import astropy.units as u
import numpy as np
ea.version()
'0.6.6'
This page expands on the very brief discussion of uncertainties in [Populations](populations.ipynb), providing more detail and a little explanation of how uncertainty estimates are calculated.
pop = ea.TransitingExoplanets()
How do we retrieve uncertainties?
We will often want to know the uncertainty on a particular quantity. We can retrieve this either with the .get_uncertainty() method, or by appending _uncertainty to the name of a quantity. For core table quantities, uncertainties are extracted directly from the table.
sigma = pop.get_uncertainty("radius")
sigma
sigma = pop.radius_uncertainty()
sigma
Some uncertainties might be asymmetric, with different upper and lower uncertainties, as in $x^{+\sigma_{upper}}_{-\sigma_{lower}}$. We can extract these asymmetric uncertainties with .get_uncertainty_lowerupper() or by appending _uncertainty_lowerupper to the name of a quantity.
sigma_lower, sigma_upper = pop.get_uncertainty_lowerupper("stellar_teff")
sigma_lower, sigma_upper
(<Quantity [18.58195665, 76.62327465, 91.74371644, ..., 94.64370696, 85.48009621, 58.59171529] K>, <Quantity [ 18.78092998, 97.08052035, 92.01112702, ..., 121.34965881, 77.71241247, 65.29424924] K>)
sigma_lower, sigma_upper = pop.stellar_teff_uncertainty_lowerupper()
sigma_lower, sigma_upper
(<Quantity [ 18.24779791, 87.93927775, 86.64959372, ..., 118.48269131, 77.21096959, 54.62575022] K>, <Quantity [ 20.47451643, 77.6029923 , 82.2244292 , ..., 108.73596189, 86.24394156, 65.91795984] K>)
We can force asymmetric uncertainties to be symmetric, calculated as $\sigma = (\sigma_{lower} + \sigma_{upper})/2$, just by asking for a simple symmetric uncertainty.
sigma = pop.get_uncertainty("stellar_teff")
sigma
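As a quick sanity check, we could also rebuild that symmetric value ourselves from the asymmetric pair, using only calls shown above. (Note that some uncertainties involve re-sampling, so repeated calls may agree only approximately.)
sigma_lower, sigma_upper = pop.get_uncertainty_lowerupper("stellar_teff")
# the symmetric uncertainty is just the mean of the lower and upper values
0.5 * (sigma_lower + sigma_upper)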
We can also estimate uncertainties on derived quantities in the same way. Behind the scenes, uncertainties on derived quantities are estimated using astropy.uncertainty. Samples are created for each ingredient table column, using skew-normal distributions for asymmetric uncertainties as advocated by Pineda et al. (2021), and estimated errors are based on the central 68% confidence intervals of the calculated distributions.
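To see the idea behind this propagation, here is a minimal sketch using astropy.uncertainty directly. It uses symmetric normal distributions and made-up values purely for illustration; exoatlas itself draws skew-normal samples when uncertainties are asymmetric.
import astropy.uncertainty as unc

# made-up measurements, for illustration only
radius = unc.normal(1.2 * u.R_jup, std=0.1 * u.R_jup, n_samples=10000)
mass = unc.normal(0.8 * u.M_jup, std=0.05 * u.M_jup, n_samples=10000)

# arithmetic on distributions propagates the samples automatically
density = mass / radius**3

# estimate 1-sigma uncertainties from the central 68% confidence interval
lower, mid, upper = density.pdf_percentiles([15.865, 50, 84.135])
sigma_lower, sigma_upper = mid - lower, upper - mid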
pop.get_uncertainty("scale_height")
pop.scale_height_uncertainty()
We might commonly be interested in the fractional uncertainty on a quantity. We can either calculate this ourselves, or use the .get_fractional_uncertainty() wrapper.
pop.get_uncertainty("scale_height") / pop.get("scale_height")
pop.get_fractional_uncertainty("scale_height")
Keyword arguments supplied when calculating uncertainties on derived quantities are passed along to the function that actually does the calculation.
pop.teq_uncertainty(albedo=0.5)
pop.get_uncertainty("teq", albedo=0.5)
pop.teq_uncertainty_lowerupper(albedo=0.5)
(<Quantity [10.55413731, 16.9123918 , 11.98695726, ..., 7.58483732, 10.10879011, 6.24502639] K>, <Quantity [11.95132393, 16.46046787, 11.6453523 , ..., 8.1694019 , 12.13379926, 6.97416094] K>)
pop.get_uncertainty_lowerupper("teq", albedo=0.5)
(<Quantity [10.4075294 , 16.8588789 , 13.29849687, ..., 9.1999765 , 10.68371021, 6.52680909] K>, <Quantity [12.29728108, 14.59660064, 11.35355295, ..., 7.45432773, 11.58893597, 6.88670664] K>)
How do we get more precise propagated uncertainties?
Propagated uncertainties are calculated by generating many numerical samples of each quantity for each planet, calculating derived quantities from those samples, and then estimating confidence intervals from the results. To avoid memory issues on larger planet populations, the default number of samples to use for these distributions is $\sf N_{samples}=100$. That is not enough to achieve precise uncertainty estimates, so in practice we loop over $\sf N_{iterations}$ iterations of the uncertainty calculation and average the results together. We target a desired fractional uncertainty on the uncertainties $\sf f$ by noting that $\sf f \approx \sqrt{1/N_{total}}$, where $\sf N_{total} = N_{samples}\cdot N_{iterations}$ is effectively the total number of samples we generate. By default, we target $\sf f = 0.05$, so $\sf N_{iterations} = 4$ iterations are needed.
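As a quick check of that arithmetic (a sketch of the relation above, not exoatlas internals; the variable names are made up):
f = 0.05  # targeted fractional precision on the uncertainties
n_samples = 100  # samples per iteration (the default)
n_total = 1 / f**2  # from f ~ sqrt(1/N_total), so N_total = 400
n_iterations = int(np.ceil(n_total / n_samples))
n_iterations  # 4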
In this example, we'll look in detail at the uncertainties for a small subset population. We'll crudely estimate the fractional uncertainty on the uncertainties by doing two independent calculations and looking at their difference.
subset = pop[:5]
subset.targeted_fractional_uncertainty_precision
0.05
a = subset.get_uncertainty("teq")
b = subset.get_uncertainty("teq")
average_uncertainty = 0.5 * (a + b)
fractional_uncertainty_on_uncertainty = np.abs(a - b) / average_uncertainty
average_uncertainty, fractional_uncertainty_on_uncertainty
(<Quantity [13.24304255, 18.77018834, 13.1621287 , 12.00061892, 30.29487046] K>, <Quantity [0.0442395 , 0.15434107, 0.11908088, 0.08880208, 0.06353546]>)
If we want to improve the fractional uncertainty, we can update the secret variable .targeted_fractional_uncertainty_precision to target a lower value. Here, let's aim for 1% fractional precision. Calculating uncertainties will now take longer, because we need to perform more iterations.
subset.targeted_fractional_uncertainty_precision = 0.01
a = subset.get_uncertainty("teq")
b = subset.get_uncertainty("teq")
average_uncertainty = 0.5 * (a + b)
fractional_uncertainty_on_uncertainty = np.abs(a - b) / average_uncertainty
average_uncertainty, fractional_uncertainty_on_uncertainty
(<Quantity [13.26642567, 18.48460305, 14.36776958, 12.52524798, 28.68025099] K>, <Quantity [0.00612867, 0.00036632, 0.00540164, 0.02719213, 0.01886679]>)