
Hierarchical Bayesian Modeling - Pennsylvania State University



Transcription of Hierarchical Bayesian Modeling - Pennsylvania State University

1 Hierarchical Bayesian Modeling
Angie Wolfgang, NSF Postdoctoral Fellow, Penn State

Making scientific inferences about a population based on many individuals.

Astronomical Populations (Schawinski et al. 2014; Lissauer, Dawson, & Tremaine 2014): once we discover an object, we look for more, to characterize their properties and understand them as a population. Or we use many (often noisy) observations of a single object to gain insight into its physics.

Hierarchical modeling is a statistically rigorous way to make scientific inferences about a population (or a specific object) based on many individuals (or observations). Frequentist multi-level modeling techniques exist, but we will discuss the Bayesian approach.

Frequentist: variability of the sample. (If __ is the true value, what fraction of many hypothetical datasets would be as or more discrepant from __ as the observed one?)
Bayesian: uncertainty of the inference. (What's the probability that __ is the true value, given the current data?)

2 Understanding Bayes
x = data; θ = the parameters of a model that can produce the data.
p() = probability density distribution of; | = "conditional on," or "given."
p(θ) = prior probability. (How probable are the possible values of θ in nature?)
p(x|θ) = likelihood, or sampling distribution. (Ties your model to the data probabilistically: how likely is the data you observed, given specific θ values?)
p(θ|x) = posterior probability. (A new prior distribution, updated with information contained in the data: what is the probability of different θ values, given the data and your model?)

Bayes' Theorem (straight out of conditional probability):
p(θ|x) ∝ p(x|θ) p(θ)   [posterior ∝ likelihood × prior]

(We just learned how to evaluate p(θ|x) numerically to infer θ from x; but let's get a better intuition for the statistical model itself.)

Applying Bayes: p(θ|x) ∝ p(x|θ) p(θ)
Example (1-D): fitting an SED to photometry (Nelson et al. 2014).
x = 17 measurements of L; θ = age of the stellar population, star formation timescale τ, dust content AV, metallicity, redshift, choice of IMF, choice of dust reddening law.

3 Model: Stellar Population Synthesis. The model can be summarized as f(x|θ): it maps θ to x. This is NOT p(x|θ), because f(x|θ) is not a probability distribution!

If you use χ² for fitting, then you are implicitly assuming that
p(xi|θ) = N(μ, σ), with μ = f(xi|θ) and σ = the statistical measurement error;
that is, you are assuming Gaussian noise (if you could redo a specific measurement xi the same way many times, you'd find a Gaussian spread around the true value).

Example (2-D): fitting a PSF to an image.
x = matrix of pixel brightnesses; θ = μ, σ of a Gaussian (location and FWHM of the PSF).
f(x|θ) = 2-D Gaussian; p(x|θ) = N(μ, σ), with μ = f(x|θ) and σ = the noise (possibly spatially correlated). Both the likelihood and the model are Gaussian!
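The χ²-to-Gaussian correspondence above can be checked numerically: with independent Gaussian noise, −2 ln p(x|θ) equals χ²(θ) plus a θ-independent constant, so a grid-evaluated posterior with a flat prior peaks at the least-squares solution. A minimal sketch with an invented linear model f(x|θ) = θ·t and made-up data (not the slide's stellar population synthesis model):

```python
import math

# Hypothetical toy model: f_i(theta) = theta * t_i (a stand-in for an SED model).
t = [1.0, 2.0, 3.0, 4.0]
x = [1.1, 1.9, 3.2, 3.9]          # invented measurements
sigma = [0.2, 0.2, 0.2, 0.2]      # Gaussian measurement errors

def chi2(theta):
    return sum(((xi - theta * ti) / si) ** 2 for xi, ti, si in zip(x, t, sigma))

def log_likelihood(theta):
    # Product of independent Gaussians: sum of log N(x_i | f_i(theta), sigma_i)
    return sum(-0.5 * math.log(2 * math.pi * si ** 2)
               - 0.5 * ((xi - theta * ti) / si) ** 2
               for xi, ti, si in zip(x, t, sigma))

# -2 log L differs from chi^2 only by a theta-independent constant:
const = -2 * log_likelihood(0.7) - chi2(0.7)
assert abs((-2 * log_likelihood(1.3) - chi2(1.3)) - const) < 1e-9

# Posterior on a grid, p(theta|x) ∝ p(x|theta) p(theta), with a flat prior:
grid = [i / 1000 for i in range(500, 1500)]
post = [math.exp(log_likelihood(th)) for th in grid]
norm = sum(post)
post = [p / norm for p in post]
best = grid[post.index(max(post))]   # posterior mode ≈ least-squares estimate
```

Swapping `math.exp(log_likelihood(th))` for `math.exp(log_likelihood(th) + log_prior(th))` is all it takes to make the prior informative, which is where the next slide picks up.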

4 Ok, now we know one way to write p(x|θ). What about p(θ)?
1) If we have a previous measurement/inference of that object's metallicity, redshift, etc., use it with its error bars as p(θ). (Usually measured via χ², so p(θ) is Gaussian with μ = the measurement and σ = its error. BUT the full posterior from the previous analysis is better.)
2) Choose wide, uninformative distributions for all the parameters we don't know well.
3) Use distributions in nature from previous observations of similar objects.

Going Hierarchical
Option #3 for p(θ): use distributions in nature from previous observations of similar objects. Histograms of population properties, when normalized, can be interpreted as probability distributions for individual parameters:
p(θ) = n(θ|α) / ∫ n(θ|α) dθ = p(θ|α),
where n(θ|α) is the function with parameters α that was fit to the histogram (or even the histogram itself, if you want to deal with a piecewise function!).
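In the Gaussian case, the effect of using a normalized population distribution as p(θ|α) can be written in closed form: a Gaussian measurement combined with a Gaussian population prior yields a Gaussian posterior whose mean is the inverse-variance-weighted average of the two. A sketch with invented numbers (the standard normal-normal update, not any particular survey's n(θ|α)):

```python
# Gaussian measurement of an individual's parameter theta:
x, sigma_x = 2.0, 0.5            # invented: noisy measurement and its error
mu_pop, sigma_pop = 1.0, 0.3     # invented: fitted population distribution p(theta|alpha)

# Conjugate normal-normal update: posterior precision = sum of precisions,
# posterior mean = precision-weighted average of measurement and population mean.
prec = 1 / sigma_x**2 + 1 / sigma_pop**2
post_mean = (x / sigma_x**2 + mu_pop / sigma_pop**2) / prec
post_sigma = prec ** -0.5

# The population prior pulls the estimate from x toward mu_pop,
# and the posterior is tighter than either source alone:
assert mu_pop < post_mean < x
assert post_sigma < min(sigma_x, sigma_pop)
```

The measurement is pulled toward the population mean and the posterior narrows, which previews the shrinkage behavior discussed later in the deck.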

5 Ilbert et al. 2009: for example, redshift was part of the θ for SED fitting. One could use the red lines (the parametric form fit to the photometric-redshift histograms) as
p(z) = p(z|α) = n(z|α) / ∫ n(z|α) dz, with α = {a, b, c}.
But BE CAREFUL of detection bias, selection effects, upper limits, etc.!

The population helps make inference on an individual:
p(θ|x) ∝ p(x|θ) p(θ)  becomes  p(θ|x) ∝ p(x|θ) p(θ|α).

Going Hierarchical: but what if we want to use the individuals to infer things (the α's) about the population? (Almost there!) Abstracting again:
p(θ, α|x) ∝ p(x|θ, α) p(θ|α) p(α).
If you truly don't care about the parameters for the individual objects, you can marginalize over them:
p(α|x) ∝ [∫ p(x|θ, α) p(θ|α) dθ] p(α) = p(x|α) p(α).
Often, though, p(θ|α) contains some interesting physics, and getting values for θ given the data can help us understand it.

Graphically:
Regular Bayes: parameters → observables.
Hierarchical Bayes: population parameters → individual parameters → observables (the physics can enter at either level), with conditional independence between individuals.
Even for an individual object, the connection between parameters and observables can involve several layers.
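The factorization p(θ, α|x) ∝ p(x|θ, α) p(θ|α) p(α) translates directly into a log-posterior function that any MCMC sampler (the JAGS/Stan options mentioned later, or your own code) can evaluate. A sketch for a toy normal-normal HBM with an assumed weak prior on α; this is not the Wolfgang & Lopez internal-structure likelihood:

```python
import math

def log_posterior(alpha, thetas, xs, sigma_xs):
    """log p(theta, alpha | x) up to an additive constant, normal-normal HBM.

    alpha = (mu, log_sigma): population parameters (log keeps sigma positive);
    thetas: the individuals' true parameter values;
    xs, sigma_xs: their noisy measurements and Gaussian errors.
    """
    mu, log_sigma = alpha
    sigma = math.exp(log_sigma)
    lp = -0.5 * (mu / 10.0) ** 2              # p(alpha): assumed weak prior on mu
    for th, x, sx in zip(thetas, xs, sigma_xs):
        lp += -math.log(sigma) - 0.5 * ((th - mu) / sigma) ** 2   # p(theta_i | alpha)
        lp += -0.5 * ((x - th) / sx) ** 2                         # p(x_i | theta_i)
    return lp

# Conditional independence between individuals appears as the sum over i;
# thetas near both the data and the population mean score higher:
good = log_posterior((0.0, 0.0), [0.1, -0.2], [0.0, -0.3], [0.5, 0.5])
bad = log_posterior((0.0, 0.0), [5.0, 5.0], [0.0, -0.3], [0.5, 0.5])
assert good > bad
```

Sampling this joint density and then inspecting only the α draws performs the marginalization over θ shown above implicitly.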

6 (Example: measuring the mass of a planet.) Latent variables: Mpl → RVs → spectra.

HBM in Action: Model (exoplanet compositions: Wolfgang & Lopez, 2015). Wanted to understand BOTH:
- the compositions of individual super-Earths (fraction of mass in a gaseous envelope: fenv), via internal structure models;
- the distribution of this composition parameter over the Kepler population (μ, σ), via population-wide distributions tied to the data through the likelihood.

HBM in Action: Results (Wolfgang & Lopez, 2015). Posterior on the population parameters; marginal composition distribution. The width of this distribution had not been previously constrained. Also: posteriors on the composition parameter fenv for individual planets.

A Note About Shrinkage
Hierarchical models pool the information in the individual data, which shrinks individual estimates toward the population mean and lowers the overall RMS error. (A key feature of any multi-level modeling!)
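The claim that pooling lowers the overall RMS error is easy to demonstrate by simulation: draw a population, add measurement noise, and compare raw per-object estimates against estimates shrunk toward the population mean. A sketch that, for simplicity, assumes the population parameters are known exactly (a full HBM would infer μ and σ jointly with the individual θ's):

```python
import math
import random

random.seed(42)

# Simulate a population: true theta_i ~ N(mu, sigma_pop), measured with error sigma_x.
mu, sigma_pop, sigma_x, n = 0.0, 1.0, 2.0, 2000
truths = [random.gauss(mu, sigma_pop) for _ in range(n)]
data = [random.gauss(t, sigma_x) for t in truths]

# Individual-only estimate: the measurement itself.
# Hierarchical posterior mean (population parameters taken as known here)
# shrinks each x_i toward the population mean:
w = (1 / sigma_x**2) / (1 / sigma_x**2 + 1 / sigma_pop**2)
shrunk = [mu + w * (x - mu) for x in data]

def rms(est):
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(est, truths)) / n)

rms_raw, rms_hbm = rms(data), rms(shrunk)
assert rms_hbm < rms_raw   # pooling lowers the overall RMS error
```

Each shrunk estimate is biased for any one object but has lower error on average over the population, which is exactly the trade the shrinkage slides describe.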

7 A Note About Shrinkage (Wolfgang, Rogers, & Ford, 2016). Shrinkage in action: gray = data, red = posteriors. The uncertainty in x1 when analyzed in the hierarchical model is smaller than the uncertainty in x1 when analyzed by itself, as the estimate is pulled toward the mean of the distribution of the x's.

Practical Considerations
1) Pay attention to the structure of your model! Did you capture the important dependencies and correlations? Did you balance realism with a small number of population-level parameters?
2) Evaluating your model with the data (performing hierarchical MCMC): JAGS (can use the stand-alone binary or interface with R); Stan (interfaces with R, Python, Julia, MATLAB); or write your own hierarchical MCMC code.
3) Spend some time testing the robustness of your model: if you generate hypothetical datasets using your HBM and then run the MCMC on those datasets, how close do the inferences lie to the "truth"?

In Sum, Why HBM?
- Obtain simultaneous posteriors on individual and population parameters: self-consistent constraints on the physics.
- Readily quantify uncertainty in those parameters.
- Naturally deals with large measurement uncertainties and upper limits (censoring).
- Similarly, can account for selection effects *within* the model, simultaneously with the inference.
- Enables direct, probabilistic relationships between theory and observations.
- Provides a framework for model comparison.

8 Further Reading
Introductory/general: DeGroot & Schervish, Probability and Statistics (solid fundamentals); Gelman, Carlin, Stern, & Rubin, Bayesian Data Analysis (in-depth; advanced topics); Loredo 2013 (a few-page intro/overview of multi-level modeling in astronomy); Kelly 2007 (HBM for linear regression, also applied to quasars).
Some applications: Loredo & Wasserman, 1998 (multi-level model for the luminosity distribution of gamma-ray bursts); Mandel et al. 2009 (HBM for supernovae); Hogg et al. 2010 (HBM with importance sampling for exoplanet eccentricities); Andreon & Hurn, 2010 (HBM for galaxy clusters); Martinez 2015 (HBM for Milky Way satellites); Wolfgang, Rogers, & Ford 2016 (HBM for the exoplanet mass-radius relationship).

