Transcription of Multinomial Logistic Regression Models
Stat 544, Lecture 19

Multinomial Logistic Regression Models

Polytomous responses. Logistic regression can be extended to handle responses that are polytomous, that is, responses with more than two categories. (Note: The word polychotomous is sometimes used, but this word does not exist!) When analyzing a polytomous response, it's important to note whether the response is ordinal (consisting of ordered categories) or nominal (consisting of unordered categories). Some types of models are appropriate only for ordinal responses; other models may be used whether the response is ordinal or nominal. If the response is ordinal, we do not necessarily have to take the ordering into account, but it often helps if we do. Using the natural ordering can lead to a simpler, more parsimonious model and increase power to detect relationships with other variables.

If the response variable is polytomous and all the potential predictors are discrete as well, we could describe the multiway contingency table by a loglinear model.
But fitting a loglinear model has two disadvantages:

- It has many more parameters, and many of them are not of interest. The loglinear model describes the joint distribution of all the variables, whereas the logistic model describes only the conditional distribution of the response given the predictors.
- The loglinear model is more complicated to interpret. In the loglinear model, the effect of a predictor $X$ on the response $Y$ is described by the $XY$ association. In a logit model, however, the effect of $X$ on $Y$ is a main effect.

If you are analyzing a set of categorical variables, and one of them is clearly a response while the others are predictors, I recommend that you use logistic rather than loglinear models.

Grouped versus ungrouped. Consider a medical study to investigate the long-term effects of radiation exposure on mortality.
The response variable is

$$
Y = \begin{cases}
1 & \text{if alive,} \\
2 & \text{if dead from cause other than cancer,} \\
3 & \text{if dead from cancer other than leukemia,} \\
4 & \text{if dead from leukemia.}
\end{cases}
$$

The main predictor of interest is level of exposure (low, medium, high). The data could arrive in ungrouped form, with one record per subject:

    low   4
    med   1
    med   2
    high  ...

Or it could arrive in grouped form:

    Exposure   Y=1   Y=2   Y=3   Y=4
    low         22     7     5     0
    medium      18     6     7     3
    high        14    12     9     9

In ungrouped form, the response occupies a single column of the dataset, but in grouped form the response occupies $r$ columns. Most computer programs for polytomous logistic regression can handle grouped or ungrouped data.

Whether the data are grouped or ungrouped, we will imagine the response to be multinomial.
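The grouped and ungrouped layouts carry the same information, and moving between them is a matter of tallying. The following is a minimal sketch in Python, assuming a small made-up set of records; the names `ungrouped`, `group_counts`, and `ungroup_counts` are hypothetical and the numbers are not from the study.

```python
from collections import Counter

# Hypothetical ungrouped records: (exposure level, response code 1..4).
# These values are invented for illustration only.
ungrouped = [
    ("low", 4), ("med", 1), ("med", 2), ("high", 1),
    ("low", 1), ("high", 3), ("med", 1), ("high", 1),
]

def group_counts(records, r):
    """Collapse one-record-per-subject data into one row of r counts per exposure level."""
    tallies = Counter(records)
    levels = sorted({x for x, _ in records})
    return {x: [tallies[(x, j)] for j in range(1, r + 1)] for x in levels}

def ungroup_counts(table):
    """Expand grouped counts back into one (level, response) record per subject."""
    return [(x, j)
            for x, row in table.items()
            for j, count in enumerate(row, start=1)
            for _ in range(count)]

grouped = group_counts(ungrouped, r=4)
for level, row in grouped.items():
    # sum(row) is the multinomial index n_i for that row of the grouped dataset
    print(level, row, "n_i =", sum(row))
```

Grouping loses only the order of the records, so expanding the grouped table again returns the same set of subjects.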
That is, the response for row $i$,

$$y_i = (y_{i1}, y_{i2}, \ldots, y_{ir})^T,$$

is assumed to have a multinomial distribution with index $n_i = \sum_{j=1}^{r} y_{ij}$ and parameter

$$\pi_i = (\pi_{i1}, \pi_{i2}, \ldots, \pi_{ir})^T.$$

If the data are grouped, then $n_i$ is the total number of trials in the $i$th row of the dataset, and $y_{ij}$ is the number of trials in which outcome $j$ occurred. If the data are ungrouped, then $y_i$ has a 1 in the position corresponding to the outcome that occurred and 0's elsewhere, and $n_i = 1$. Note, however, that if the data are ungrouped, we do not have to actually create a dataset with columns of 0's and 1's; a single column containing the response level $1, 2, \ldots, r$ is sufficient.

Describing polytomous responses by a sequence of binary models. In some cases, it makes sense to factor the response into a sequence of binary choices and model them with a sequence of ordinary logistic models. For example, consider the study of the effects of radiation exposure on mortality.
The four-level response can be modeled in three stages:

    Population
      Stage 1: Alive  vs.  Dead
        Stage 2 (Dead): Non-cancer  vs.  Cancer
          Stage 3 (Cancer): Other cancer  vs.  Leukemia

The stage 1 model, which is fit to all subjects, describes the log-odds of death. The stage 2 model, which is fit only to the subjects that die, describes the log-odds of death due to cancer versus death from other causes. The stage 3 model, which is fit only to the subjects who die of cancer, describes the log-odds of death due to leukemia versus death due to other cancers.

Because the multinomial distribution can be factored into a sequence of conditional binomials, we can fit these three logistic models separately. The overall likelihood function factors into three independent likelihoods. This approach is attractive when the response can be naturally arranged as a sequence of binary choices. But in situations where arranging such a sequence is unnatural, we should probably fit a single multinomial model to the entire response.

Baseline-category logit model. Suppose that $y_i = (y_{i1}, y_{i2}, \ldots, y_{ir})^T$ has a multinomial distribution with index $n_i = \sum_{j=1}^{r} y_{ij}$ and parameter $\pi_i = (\pi_{i1}, \pi_{i2}, \ldots, \pi_{ir})^T$. When the response categories $1, 2, \ldots, r$ are unordered, the most popular way to relate $\pi_i$ to covariates is through a set of $r - 1$ baseline-category logits. Taking $j^*$ as the baseline category, the model is

$$\log\left(\frac{\pi_{ij}}{\pi_{ij^*}}\right) = x_i^T \beta_j, \qquad j \neq j^*.$$

If $x_i$ has length $p$, then this model has $(r - 1) \times p$ free parameters, which we can arrange as a matrix or a vector. For example, if the last category is the baseline ($j^* = r$), the coefficients are

$$\beta = [\beta_1, \beta_2, \ldots, \beta_{r-1}]$$

or

$$\mathrm{vec}(\beta) = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_{r-1} \end{bmatrix}.$$

Comments on this model:

- The $k$th element of $\beta_j$ can be interpreted as the increase in log-odds of falling into category $j$ versus category $j^*$ resulting from a one-unit increase in the $k$th covariate, holding the other covariates constant.
- Removing the $k$th covariate from the model is equivalent to simultaneously setting $r - 1$ coefficients to zero.
- Any of the categories can be chosen to be the baseline. The model will fit equally well, achieving the same likelihood and producing the same fitted values. Only the values and interpretation of the coefficients will change.

To calculate $\pi_i$ from $\beta$, the back-transformation is

$$\pi_{ij} = \frac{\exp(x_i^T \beta_j)}{1 + \sum_{k \neq j^*} \exp(x_i^T \beta_k)}$$

for the non-baseline categories $j \neq j^*$, and the baseline-category probability is

$$\pi_{ij^*} = \frac{1}{1 + \sum_{k \neq j^*} \exp(x_i^T \beta_k)}.$$

Model fitting. This model is not difficult to fit by Newton-Raphson or Fisher scoring. PROC LOGISTIC can do this.

Goodness of fit. If the estimated expected counts $\hat{\mu}_{ij} = n_i \hat{\pi}_{ij}$ are large enough, we can test the fit of our model versus a saturated model that estimates $\pi_i$ independently for $i = 1, \ldots, N$. The deviance for comparing this model to a saturated one is

$$G^2 = 2 \sum_{i=1}^{N} \sum_{j=1}^{r} y_{ij} \log \frac{y_{ij}}{\hat{\mu}_{ij}}.$$

The saturated model has $N(r - 1)$ free parameters and the current model has $p(r - 1)$, where $p$ is the length of $x_i$, so the degrees of freedom are

$$df = (N - p)(r - 1).$$

The corresponding Pearson statistic is

$$X^2 = \sum_{i=1}^{N} \sum_{j=1}^{r} r_{ij}^2,$$

where

$$r_{ij} = \frac{y_{ij} - \hat{\mu}_{ij}}{\sqrt{\hat{\mu}_{ij}}}$$

is the Pearson residual. If the model is true, both are approximately distributed as $\chi^2_{df}$ provided that

- no more than 20% of the $\hat{\mu}_{ij}$'s are below 5.0, and
- none are below 1.0.

In practice this is often not satisfied, so there may be no way to assess the overall fit of the model. However, we may still apply a $\chi^2$ approximation to $\Delta G^2$ and $\Delta X^2$ to compare nested models, provided that $(N - p)(r - 1)$ is large relative to $\Delta df$.

Overdispersion. Overdispersion means that the actual covariance matrix of $y_i$ exceeds that specified by the multinomial model,

$$V(y_i) = n_i \left[ \mathrm{Diag}(\pi_i) - \pi_i \pi_i^T \right].$$

It is reasonable to think that overdispersion is present if

- the data are grouped ($n_i$'s are greater than 1),
- $x_i$ already contains all covariates worth considering, and
- the overall $X^2$ is substantially larger than its degrees of freedom $(N - p)(r - 1)$.

In this situation, it may be worthwhile to introduce a scale parameter $\sigma^2$, so that

$$V(y_i) = n_i \sigma^2 \left[ \mathrm{Diag}(\pi_i) - \pi_i \pi_i^T \right].$$

The usual estimate for $\sigma^2$ is

$$\hat{\sigma}^2 = \frac{X^2}{(N - p)(r - 1)},$$

which is approximately unbiased if $(N - p)(r - 1)$ is large. Introducing a scale parameter does not alter the estimate of $\beta$ (which then becomes a quasilikelihood estimate), but it does alter our estimate of the variability of $\hat{\beta}$. If we estimate a scale parameter, we should

- multiply the estimated ML covariance matrix for $\hat{\beta}$ by $\hat{\sigma}^2$ (SAS does this automatically);
- divide the usual Pearson residuals by $\hat{\sigma}$; and
- divide the usual $X^2$, $G^2$, $\Delta X^2$ and $\Delta G^2$ statistics by $\hat{\sigma}^2$ (SAS reports these as scaled statistics).
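The fit statistics and the scale estimate described above reduce to a few sums over the cells of the grouped table. The following is a minimal sketch in Python, assuming the fitted probabilities are already available and strictly positive; the function name `fit_statistics`, the made-up counts, and the made-up fitted probabilities are hypothetical, and the cutoff of 5 for small expected counts is the conventional rule of thumb, taken here as an assumption.

```python
import math

def fit_statistics(y, pi_hat, p):
    """Deviance G2, Pearson X2, df, and Pearson-based scale estimate.

    y      : N rows of r observed counts (grouped data)
    pi_hat : N rows of r fitted probabilities, all > 0 (assumed given)
    p      : length of the covariate vector x_i
    """
    N, r = len(y), len(y[0])
    g2 = x2 = 0.0
    mu_all = []
    for y_i, pi_i in zip(y, pi_hat):
        n_i = sum(y_i)                         # multinomial index for row i
        for y_ij, pi_ij in zip(y_i, pi_i):
            mu_ij = n_i * pi_ij                # estimated expected count
            mu_all.append(mu_ij)
            if y_ij > 0:                       # 0 * log(0) is taken as 0
                g2 += 2.0 * y_ij * math.log(y_ij / mu_ij)
            x2 += (y_ij - mu_ij) ** 2 / mu_ij  # squared Pearson residual
    df = (N - p) * (r - 1)
    return {
        "G2": g2,
        "X2": x2,
        "df": df,
        "scale": x2 / df,                      # sigma-hat squared
        "min_mu": min(mu_all),
        "share_mu_below_5": sum(m < 5 for m in mu_all) / len(mu_all),
    }

# Hypothetical grouped counts and fitted probabilities (N = 3 rows, r = 3 categories).
y = [[20, 10, 10], [15, 15, 10], [8, 12, 20]]
pi_hat = [[0.50, 0.28, 0.22], [0.36, 0.33, 0.31], [0.22, 0.32, 0.46]]

stats = fit_statistics(y, pi_hat, p=1)
for name, value in stats.items():
    print(name, round(value, 4))
```

Dividing `X2` and `G2` by `stats["scale"]` would give the scaled statistics, and multiplying an estimated covariance matrix by the same quantity would give the adjusted one.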
These adjustments will have little practical effect unless the estimated scale parameter is substantially greater than 1.0 (say, 1.2 or higher).

Example. The table below, reported by Delany and Moore (1987), comes from a study of the primary food choices of alligators in four Florida lakes. Researchers classified the stomach contents of 219 captured alligators into five categories: Fish (the most common primary food choice), Invertebrate (snails, insects, crayfish, etc.), Reptile (turtles, alligators), Bird, and Other (amphibians, plants, household pets, stones, and other debris).

Let's describe these data by a baseline-category model, with Primary Food Choice as the outcome and Lake, Sex, and Size as covariates.

                                   Primary Food Choice
    Lake      Sex  Size     Fish  Invertebrate  Reptile  Bird  Other
    ...
    ...       ...  small     ...             3        2     2      3
                   large       3             0        1     2      3
    Oklawaha  M    small       2             2        0     0      1
                   large      13             7        6     0      0
              F    small       3             9        1     0      2
                   large       0             1        0     1      0
    Trafford  M    small       3             7        1     0      1
                   large       8             6        6     3      5
              F    small       2             4        1     1      4
                   large       0             1        0     0      0
    George    M    small      13            10        0     2      2
                   large       9             0        0     1      2
              F    small       3             9        1     0      1
                   large       8             1        0     0      1

Because the usual primary food choice of alligators appears to be fish, we'll use fish as the baseline category.
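With fish as the baseline, the fitted probability of each food category follows from the back-transformation given earlier. The following is a minimal sketch in Python of that calculation; the function name `category_probabilities`, the two-element covariate vector, and every coefficient value are hypothetical and are not estimates from the alligator data.

```python
import math

CATEGORIES = ["Fish", "Invertebrate", "Reptile", "Bird", "Other"]
BASELINE = "Fish"

def category_probabilities(x, beta):
    """Back-transform baseline-category logits into probabilities.

    x    : covariate vector x_i (first entry 1.0 for an intercept)
    beta : dict mapping each non-baseline category j to its coefficient vector beta_j
    """
    # Linear predictor x_i^T beta_j for every non-baseline category
    eta = {j: sum(b * v for b, v in zip(beta[j], x)) for j in beta}
    denom = 1.0 + sum(math.exp(e) for e in eta.values())
    probs = {j: math.exp(e) / denom for j, e in eta.items()}
    probs[BASELINE] = 1.0 / denom              # baseline-category probability
    return probs

# Hypothetical coefficients (intercept, indicator for a large alligator).
beta = {
    "Invertebrate": [0.5, -1.0],
    "Reptile": [-2.0, 0.3],
    "Bird": [-2.5, 0.6],
    "Other": [-1.0, 0.2],
}

x_small = [1.0, 0.0]                           # intercept only: a small alligator
probs = category_probabilities(x_small, beta)
for category in CATEGORIES:
    print(category, round(probs[category], 3))
```

Choosing a different baseline would change the coefficient values but not the probabilities, since each log ratio of a category's probability to the fish probability recovers the corresponding linear predictor.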