
Statistical Models in R


Model Comparison Statistics
Normally, the models are nested in that the variables in M0 are a subset of those in M1. The statistic often involves the RSS values for both models, adjusted by the number of parameters used. In linear regression this becomes an anova test (comparing variances). More robust is a likelihood ratio test for nested ...
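As a rough illustration of the nested comparison described above, the following sketch fits two nested linear regressions and compares them with R's `anova` function. The simulated data and the names `x1`, `x2`, `m0`, `m1`, and `cmp` are assumptions made for this example only; they do not come from the lectures.

```r
# Simulated data; all names and values here are illustrative assumptions.
set.seed(3)
n  <- 40
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 + 0.5 * x2 + rnorm(n)

m0 <- lm(y ~ x1)        # smaller model M0
m1 <- lm(y ~ x1 + x2)   # larger model M1; its variables include those of M0

# F test comparing the nested models; the table reports each model's RSS
# (residual sum of squares) and residual degrees of freedom.
cmp <- anova(m0, m1)
print(cmp)

# The RSS values can also be computed directly from the residuals.
rss0 <- sum(resid(m0)^2)
rss1 <- sum(resid(m1)^2)
```

Because M0 is nested in M1, the RSS of M1 cannot exceed that of M0; the test asks whether the reduction is large relative to the extra parameter used.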



Statistical Models in R: Some Examples
Steven Buechler
Department of Mathematics
276B Hurley Hall; 1-6233
Fall, 2007

Outline
  Statistical Models
  Structure of Models in R
  Model Assessment (Part IA)
  Anova in R

Statistical Models: First Principles
In a couple of lectures the basic notion of a statistical model is described. Examples of anova and linear regression are given, including variable selection to find a simple but explanatory model. Emphasis is placed on R's framework for statistical models.

General Problem addressed by modelling
Given: a collection of variables, each variable being a vector of readings of a specific trait on the samples in an experiment.
Problem: In what way does a variable Y depend on other variables X1, ..., Xn in the study?
A statistical model defines a mathematical relationship between the Xi's and Y. The model is a representation of the real Y that aims to replace it as far as possible. At least the model should capture the dependence of Y on the Xi's.

The Types of Variables in a statistical model
The response variable is the one whose content we are trying to model with other variables, called the explanatory variables. In any given model there is one response variable (Y above) and there may be many explanatory variables (like X1, ..., Xn).

Identify and Characterize Variables: the first step in modelling
  Which variable is the response variable;
  Which variables are the explanatory variables;
  Are the explanatory variables continuous, categorical, or a mixture of both;
  What is the nature of the response variable: is it a continuous measurement, a count, a proportion, a category, or a time-at-death?
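The questions above can be checked directly in R before any model is fitted. The sketch below uses a small made-up data frame; the column names `y`, `x`, and `group`, and the rule that numeric columns count as continuous and factors as categorical, are assumptions for illustration rather than anything prescribed in the lectures.

```r
# Hypothetical data frame; names and values are illustrative only.
set.seed(1)
dat <- data.frame(
  y     = rnorm(12),                               # continuous response
  x     = runif(12),                               # continuous explanatory variable
  group = factor(rep(c("a", "b", "c"), each = 4))  # categorical explanatory variable
)

# Compact overview of the type of every column.
str(dat)

# Label each column (assumed convention: factor = categorical,
# numeric = continuous, anything else = other).
var_kind <- sapply(dat, function(v) {
  if (is.factor(v)) {
    "categorical"
  } else if (is.numeric(v)) {
    "continuous"
  } else {
    "other"
  }
})
print(var_kind)
```

A count or a proportion is also stored as a numeric column, so this check only separates continuous-looking columns from factors; deciding the nature of the response still needs knowledge of how the data were collected.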

Types of Variables Determine Type of Model

The explanatory variables:
  All explanatory variables continuous                    Regression
  All explanatory variables categorical                   Analysis of variance (Anova)
  Explanatory variables both continuous and categorical   Analysis of covariance (Ancova)

The response variable (what kind of data is it?):
  Continuous      Normal regression, Anova, Ancova
  Proportion      Logistic regression
  Count           Log linear models
  Binary          Binary logistic analysis
  Time-at-death   Survival analysis

Model Formulas: Which variables are involved?

A fundamental aspect of models is the use of model formulas to specify the variables involved in the model and the possible interactions between explanatory variables included in the model. The model formula is input into a function that performs a linear regression or anova, for example. While a model formula bears some resemblance to a mathematical formula, the symbols in the equation mean different things than in algebra.

Common Features of model formulas
Model formulas have a format like
> Y ~ X1 + X2 + Z * W
where Y is the response variable, ~ means "is modeled as a function of", and the right hand side is an expression in the explanatory variables.

First Examples of Model Formulas
Given continuous variables x and y, the relationship of a linear regression of y on x is described as
> y ~ x
The actual linear regression is executed by
> fit <- lm(y ~ x)
If y is continuous and z is categorical we use the same model formula
> y ~ z
to express that we'll model y as a function of z; however, in this case the model will be an anova, executed as
> fit <- aov(y ~ z)
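To make the two calls above concrete, here is a minimal sketch on simulated data. The data, the sample sizes, and the names `fit_lm`, `fit_aov`, and `y2` are assumptions chosen for this illustration; only `lm(y ~ x)` and `aov(y ~ z)` come from the slides.

```r
# Simulated data; every value below is an illustrative assumption.
set.seed(42)
n <- 30

# Continuous x and y: linear regression of y on x.
x <- runif(n, min = 0, max = 10)
y <- 2 + 0.5 * x + rnorm(n)
fit_lm <- lm(y ~ x)
print(coef(fit_lm))        # estimated intercept and slope

# Continuous response and categorical z: same formula shape, but an anova.
z  <- factor(rep(c("low", "mid", "high"), each = 10))
y2 <- c(rnorm(10, mean = 5), rnorm(10, mean = 6), rnorm(10, mean = 8))
fit_aov <- aov(y2 ~ z)
print(summary(fit_aov))    # anova table for the factor z
```

The formula has the same shape in both calls; what changes is the type of the explanatory variable and the fitting function chosen to match it.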

Multiple Explanatory Variables
Frequently, there are multiple explanatory variables involved in a model. The + symbol denotes inclusion of additional explanatory variables. The formula
> y ~ x1 + x2 + x3
denotes that y is modeled as a function of x1, x2, x3. If all of these are continuous,
> fit <- lm(y ~ x1 + x2 + x3)
executes a multiple linear regression of y on x1, x2, x3.

Other Operators in Model Formulas
In complicated relationships we may need to include interaction terms as variables in the model. This is common when a model involves multiple categorical explanatory variables. A factorial anova may involve calculating means for the levels of variable A restricted to a level of B. The formula
> y ~ A * B
describes this form of model.

Just the Basics
Here, just the basic structure of modeling in R is given, using anova and linear regression as examples. See the Crawley book listed in the syllabus for a careful introduction to models. Beyond giving examples of models of these simple forms, tools for assessing the quality of the models, and for comparing models with different variables, will be covered.

Approximate Y
The goal of a model is to approximate a vector Y with values calculated from the explanatory variables.

Suppose the Y values are $(y_1, \ldots, y_n)$. The values calculated in the model are called the fitted values and denoted $(\hat{y}_1, \ldots, \hat{y}_n)$. (In general, a hat on a quantity means one approximated in a model or through sampling.) The goodness of fit is measured with the residuals, $(r_1, \ldots, r_n)$, where $r_i = y_i - \hat{y}_i$.

Measure of Residuals
To obtain a number that measures the overall size of the residuals we use the residual sum of squares, defined as
$$\mathrm{RSS} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2.$$
As RSS decreases, $\hat{y}$ becomes a better approximation to $Y$.

Residuals: Only Half of the Story
A good model should have predictive value in other data sets and contain only as many explanatory variables as needed for a reasonable fit. To minimize RSS we can set $\hat{y}_i = y_i$, for $1 \le i \le n$.
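The following sketch computes residuals and RSS by hand for a simple regression, then shows how RSS can be driven to zero by a model with one parameter per observation. The simulated data and the names `fit`, `fit_sat`, `rss`, and `rss_sat` are assumptions for this example; treating each distinct x value as its own factor level is just one convenient way to reproduce the fitted values equal to the observed ones, not a method taken from the lectures.

```r
# Simulated data; values are illustrative assumptions.
set.seed(7)
n <- 8
x <- seq(1, 8, length.out = n)
y <- 1 + 0.8 * x + rnorm(n)

# Simple linear regression: two parameters.
fit <- lm(y ~ x)
r   <- y - fitted(fit)   # residuals r_i = y_i - yhat_i
rss <- sum(r^2)          # residual sum of squares
print(rss)

# Saturated model: one parameter per observation, so the fitted values
# equal the observed values and the RSS is zero up to rounding error.
fit_sat <- lm(y ~ factor(x))
rss_sat <- sum(resid(fit_sat)^2)
print(rss_sat)
```

The saturated fit has the smallest possible RSS, yet it uses as many parameters as there are data points, which is why RSS alone cannot be the measure of a good model.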

