
Chapter 4 Introduction to Categorical Data Analysis Procedures


Chapter Contents

Overview
Sampling Frameworks and Distribution Assumptions
    Simple Random Sampling: One Population
    Stratified Simple Random Sampling: Multiple Populations
    Observational Data: Analyzing the Entire Population
    Randomized Experiments
    Relaxation of Sampling Assumptions
Comparison of FREQ and CATMOD Procedures
Comparison of CATMOD, GENMOD, LOGISTIC, and PROBIT Procedures
    Logistic Regression
    Parameterization
References

Overview

Several procedures in SAS/STAT software can be used for the analysis of categorical data:

CATMOD fits linear models to functions of categorical data, facilitating such analyses as regression, analysis of variance, linear modeling, log-linear modeling, logistic regression, and repeated measures analysis.

Maximum likelihood estimation is used for the analysis of logits and generalized logits, and weighted least squares analysis is used for fitting models to other response functions. Iterative proportional fitting (IPF), which avoids the need for parameter estimation, is available for fitting hierarchical log-linear models when there is a single population.

CORRESP performs simple and multiple correspondence analyses, using a contingency table, Burt table, binary table, or raw categorical data as input. For more on PROC CORRESP, see Chapter 5, "Introduction to Multivariate Procedures," and Chapter 24, "The CORRESP Procedure."

FREQ builds frequency tables or contingency tables and can produce numerous statistics. For one-way frequency tables, it can perform tests for equal proportions, specified proportions, or the binomial proportion. For contingency tables, it can compute various tests and measures of association and agreement, including chi-square statistics, odds ratios, correlation statistics, Fisher's exact test for any size two-way table, kappa, and trend tests. In addition, it performs stratified analysis, computing Cochran-Mantel-Haenszel statistics and estimates of the common relative risk. Exact p-values and confidence intervals are available for various test statistics and measures.

GENMOD fits generalized linear models with maximum-likelihood methods. This family includes logistic, probit, and complementary log-log regression models for binomial data, Poisson and negative binomial regression models for count data, and multinomial models for ordinal response data.

It performs likelihood ratio and Wald tests for type I, type III, and user-defined contrasts. It analyzes repeated measures data with generalized estimating equation (GEE) methods.

LOGISTIC fits linear logistic regression models for discrete response data with maximum-likelihood methods. It provides four variable selection methods and computes regression diagnostics. It can also perform stratified conditional logistic regression analysis for binary response data and exact conditional regression analysis for binary and nominal response data.

The logit link function in the logistic regression models can be replaced by the probit function or the complementary log-log function.

PROBIT fits models with probit, logit, or complementary log-log links for quantal assay or other discrete event data. It is mainly designed for dose-response analysis with a natural response rate. It computes the fiducial limits for the dose variable and provides various graphical displays for the analysis.

Other procedures that perform analyses for categorical data are the TRANSREG and PRINQUAL procedures. PROC PRINQUAL is summarized in Chapter 5, "Introduction to Multivariate Procedures," and PROC TRANSREG is summarized in Chapter 2, "Introduction to Regression Procedures."

A categorical variable is defined as one that can assume only a limited number of discrete values. The measurement scale for such a variable is unrestricted. It can be nominal, which means that the observed levels are not ordered. It can be ordinal, which means that the observed levels are ordered in some way. Or it can be interval, which means that the observed levels are ordered and numeric and that any interval of one unit on the scale of measurement represents the same amount, regardless of its location on the scale. One example of a categorical variable is litter size; another is the number of times a subject has been married.

A variable that lies on a nominal scale is sometimes called a qualitative or classification variable.

Categorical data result from observations on multiple subjects where one or more categorical variables are observed for each subject. If there is only one categorical variable, then the data are generally represented by a frequency table, which lists each observed value of the variable and its frequency of occurrence.

If there are two or more categorical variables, then a subject's profile is defined as the subject's observed values for each of the variables. Such categorical data can be represented by a frequency table that lists each observed profile and its frequency of occurrence.

If there are exactly two categorical variables, then the data are often represented by a two-dimensional contingency table, which has one row for each level of variable 1 and one column for each level of variable 2.
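As an illustration only, the frequency table and the two-dimensional contingency table described above can be sketched in Python. This is not SAS code and is not part of the SAS/STAT documentation; the data values and the names `subjects`, `freq_var1`, `profile_freq`, and `table` are hypothetical.

```python
from collections import Counter

# Hypothetical raw data: one profile (variable 1, variable 2) per subject.
subjects = [
    ("small", "A"), ("large", "B"), ("small", "A"),
    ("large", "A"), ("small", "B"), ("large", "B"),
]

# One categorical variable: a frequency table lists each observed
# value and its frequency of occurrence.
freq_var1 = Counter(v1 for v1, _ in subjects)

# Two variables: a frequency table over the observed profiles.
profile_freq = Counter(subjects)

# Two-dimensional contingency table: one row per level of variable 1,
# one column per level of variable 2.
rows = sorted({v1 for v1, _ in subjects})   # levels of variable 1
cols = sorted({v2 for _, v2 in subjects})   # levels of variable 2
table = [[profile_freq[(r, c)] for c in cols] for r in rows]

for r, counts in zip(rows, table):
    print(r, counts)
```

Each entry of `table` holds the frequency of one profile, and the entries sum to the number of subjects.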

The intersections of rows and columns, called cells, correspond to variable profiles, and each cell contains the frequency of occurrence of the corresponding profile.

If there are more than two categorical variables, then the data can be represented by a multidimensional contingency table. There are two commonly used methods for displaying such tables, and both require that the variables be divided into two sets.

In the first method, one set contains a row variable and a column variable for a two-dimensional contingency table, and the second set contains all of the other variables. The variables in the second set are used to form a set of profiles.

Thus, the data are represented as a series of two-dimensional contingency tables, one for each profile. This is the data representation used by PROC FREQ. For example, if you request tables for RACE*SEX*AGE*INCOME, the FREQ procedure represents the data as a series of contingency tables: the row variable is AGE, the column variable is INCOME, and the combinations of levels of RACE and SEX form a set of profiles.

In the second method, one set contains the independent variables, and the other set contains the dependent variables. Profiles based on the independent variables are called population profiles, whereas those based on the dependent variables are called response profiles.
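The first method (a series of two-dimensional tables, one per profile of the remaining variables) can be sketched in Python for the RACE*SEX*AGE*INCOME example. This is a hedged illustration of the data layout only, not of how PROC FREQ is implemented; the records and the function name `stratified_tables` are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (race, sex, age, income) for each subject.
records = [
    ("r1", "F", "young", "low"),
    ("r1", "F", "young", "high"),
    ("r1", "F", "old", "low"),
    ("r1", "M", "old", "high"),
    ("r2", "F", "young", "low"),
    ("r2", "F", "young", "low"),
]

def stratified_tables(records):
    """Return one AGE-by-INCOME table of counts per (RACE, SEX) profile.

    The result maps (race, sex) -> {(age, income): count}.
    """
    tables = defaultdict(lambda: defaultdict(int))
    for race, sex, age, income in records:
        tables[(race, sex)][(age, income)] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {profile: dict(cells) for profile, cells in tables.items()}

tables = stratified_tables(records)
for profile, cells in sorted(tables.items()):
    print(profile, cells)
```

Here AGE plays the role of the row variable and INCOME the column variable, while each distinct (RACE, SEX) combination selects one table in the series.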

A two-dimensional contingency table is then formed, with one row for each population profile and one column for each response profile. Since any subject can have only one population profile and one response profile, the contingency table is uniquely defined. This is the data representation used by PROC CATMOD.

Sampling Frameworks and Distribution Assumptions

This section discusses the sampling frameworks and distribution assumptions for the CATMOD and FREQ procedures.

Simple Random Sampling: One Population

Suppose you take a simple random sample of 100 people and ask each person the following question: Of the three colors red, blue, and green, which is your favorite?

