
Item Response Theory: What It Is and How You Can Use the ...


Paper SAS364-2014 Item Response Theory: What It Is and How You Can Use the IRT Procedure to Apply It Xinming An and Yiu-Fai Yung, SAS Institute Inc.


Item response theory (IRT) is concerned with accurate test scoring and development of test items. You design test items to measure various kinds of abilities (such as math ability), traits (such as extroversion), or behavioral characteristics (such as purchasing tendency). Responses to test items can be binary (such as correct or incorrect responses in ability tests) or ordinal (such as degree of agreement on Likert scales). Traditionally, IRT models have been used to analyze these types of data in psychological assessments and educational testing.

With the use of IRT models, you can not only improve scoring accuracy but also economize test administration by adaptively using only the discriminative items. These features might explain why in recent years IRT models have become increasingly popular in many other fields, such as medical research, health sciences, quality-of-life research, and even marketing research. This paper describes a variety of IRT models, such as the Rasch model, two-parameter model, and graded response model, and demonstrates their application by using real-data examples. It also shows how to use the IRT procedure, which is new in SAS/STAT, to calibrate items, interpret item characteristics, and score respondents. Finally, the paper explains how the application of IRT models can help improve test scoring and develop better tests.

You will see the value in applying item response theory, possibly in your own organization!

INTRODUCTION

Item response theory (IRT) was first proposed in the field of psychometrics for the purpose of ability assessment. It is widely used in education to calibrate and evaluate items in tests, questionnaires, and other instruments and to score subjects on their abilities, attitudes, or other latent traits. During the last several decades, educational assessment has used more and more IRT-based techniques to develop tests. Today, all major educational tests, such as the Scholastic Aptitude Test (SAT) and Graduate Record Examination (GRE), are developed by using item response theory, because the methodology can significantly improve measurement accuracy and reliability while providing potentially significant reductions in assessment time and effort, especially via computerized adaptive testing.

In recent years, IRT-based models have also become increasingly popular in health outcomes, quality-of-life research, and clinical research (Hays, Morales, and Reise 2000; Edelen and Reeve 2007; Holman, Glas, and de Haan 2003; Reise and Waller 2009). For simplicity, models that are developed based on item response theory are referred to simply as IRT models throughout the paper.

This paper introduces the basic concepts of IRT models and their applications. The next two sections explain the formulations of the Rasch model and the two-parameter model. Emphases are on the conceptual interpretations of the model parameters.

Extensions of the basic IRT models are then described, and some mathematical details of the IRT models are presented. Next, two data examples show the applications of the IRT models by using the IRT procedure. Compared with classical test theory (CTT), item response theory provides several advantages. These advantages are discussed before the paper concludes with a summary.

WHAT IS THE RASCH MODEL?

The Rasch model is one of the most widely used IRT models in various IRT applications. Suppose you have $J$ binary items, $X_1, \ldots, X_J$, where 1 indicates a correct response and 0 an incorrect response. In the Rasch model, the probability of a correct response is given by

$$\Pr(X_j = 1 \mid \theta_i) = \frac{e^{\theta_i - \alpha_j}}{1 + e^{\theta_i - \alpha_j}}$$

where $\theta_i$ is the ability (latent trait) of subject $i$ and $\alpha_j$ is the difficulty parameter of item $j$.

The probability of a correct response is determined by the item's difficulty and the subject's ability. This probability can be illustrated by the curve in Figure 1, which is called the item characteristic curve (ICC) in the field of IRT. From this curve you can observe that the probability is a monotonically increasing function of ability. This means that as the subject's ability increases, the probability of a correct response increases; this is what you would expect in practice.

Figure 1 Item Characteristic Curve

As the name suggests, the item difficulty parameter measures the difficulty of answering the item correctly. The preceding equation suggests that the probability of a correct response is 0.5 for any subject whose ability is equal to the value of the difficulty parameter.
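The behavior of the Rasch item characteristic curve can be checked with a short Python sketch. This is only an illustration, not the IRT procedure's implementation: the function name `rasch_probability`, its argument names, and the sample ability grid are assumptions made for this example. It evaluates the logistic form exp(ability - difficulty) / (1 + exp(ability - difficulty)), so the probability is 0.5 when ability equals difficulty and rises as ability rises.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch model."""
    # Algebraically the same as exp(x) / (1 + exp(x)), x = ability - difficulty.
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical item with difficulty 0, evaluated on a small ability grid.
abilities = [-3.0, -1.0, 0.0, 1.0, 3.0]
curve = [rasch_probability(a, 0.0) for a in abilities]

for a, p in zip(abilities, curve):
    print(f"ability={a:+.1f}  P(correct)={p:.3f}")
```

Changing the `difficulty` argument shifts the whole curve along the ability axis without changing its shape.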

Figure 2 shows the ICCs for three items, with difficulty parameters of -2, 0, and 2. By comparing these three ICCs, you can see that the location of the ICC is determined by the difficulty parameter. To get a 0.5 probability of a correct response for these three items, the subject must have an ability of -2, 0, and 2, respectively.

Figure 2 Item Characteristic Curves

WHAT IS THE TWO-PARAMETER MODEL?

In the Rasch model, the ICCs of all the items are assumed to have the same shape. In practice, however, this assumption might not be reasonable. To avoid this assumption, another parameter called the discrimination (slope) parameter is introduced.

The resulting model is called the two-parameter model. In the two-parameter model, the probability of a correct response is given by

$$\Pr(X_j = 1 \mid \theta_i) = \frac{e^{\lambda_j(\theta_i - \alpha_j)}}{1 + e^{\lambda_j(\theta_i - \alpha_j)}}$$

where $\lambda_j$ is the discrimination parameter for item $j$ and, as in the Rasch model, $\theta_i$ is the ability of subject $i$ and $\alpha_j$ is the difficulty of item $j$. The discrimination parameter is a measure of the differential capability of an item. A high discrimination parameter value suggests an item that has a high ability to differentiate subjects. In practice, a high discrimination parameter value means that the probability of a correct response increases more rapidly as the ability (latent trait) increases. Item characteristic curves of three items, item1, item2, and item3, with different discrimination parameter values are shown in Figure 3.

Figure 3 Item Characteristic Curves

The difficulty parameter values for these three items are all 0.

The discrimination parameter values are […], 1, and 2, respectively. In Figure 3, you can observe that as the discrimination parameter value increases, the ICC becomes steeper around 0. As the ability value changes from […] to […], the probability of a correct response changes from […] to […] for item3, a much larger change than for item1. For that reason, item3 can differentiate subjects whose ability value is around 0 more efficiently than item1 can.

EXTENSIONS OF THE BASIC IRT MODELS

Early IRT models, such as the Rasch model and the two-parameter model, concentrate mainly on analyzing dichotomous responses that have a single latent trait. The preceding sections describe the characteristics of these two models.
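As a minimal sketch of those characteristics, the Python snippet below compares two items that share a difficulty of 0 but have discrimination values of 1 and 2, the values given for item2 and item3. The function names, the parameterization discrimination * (ability - difficulty), and the ability interval from -0.5 to 0.5 are assumptions made for illustration; with a difficulty of 0, the result does not depend on how the difficulty enters the exponent.

```python
import math


def two_parameter_probability(ability, difficulty, discrimination):
    """Probability of a correct response under the two-parameter model."""
    z = discrimination * (ability - difficulty)
    return 1.0 / (1.0 + math.exp(-z))


def slope_at_difficulty(discrimination):
    """Slope of the curve at the point where ability equals difficulty.

    The derivative of the logistic curve is discrimination * P * (1 - P),
    and P = 0.5 at that point, so the slope is discrimination / 4.
    """
    return discrimination * 0.5 * 0.5


# Difficulty 0 for both items; the ability interval is an illustrative choice.
for name, discrimination in [("item2", 1.0), ("item3", 2.0)]:
    low = two_parameter_probability(-0.5, 0.0, discrimination)
    high = two_parameter_probability(0.5, 0.0, discrimination)
    print(f"{name}: P(-0.5)={low:.3f}  P(0.5)={high:.3f}  change={high - low:.3f}")
```

Setting `discrimination` to 1 for every item recovers the Rasch curve, which is why the Rasch model can be read as a special case in which all items share one slope.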

Various extensions of these basic IRT models have been developed for more flexible modeling in different situations. The following list presents some extended (or generalized) IRT models and their capabilities:

- graded response models (GRM), which analyze ordinal responses and rating scales
- three- and four-parameter models, which analyze test items that have guessing and ceiling parameters in the response curves
- multidimensional IRT models, which analyze test items that can be explained by more than one latent trait or factor
- multiple-group IRT models, which analyze test items in independent groups to study differential item functioning or invariance
- confirmatory IRT models, which analyze test items that have hypothesized relationships with the latent factors

These generalizations or extensions of IRT models are not mutually exclusive.
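To make the guessing and ceiling parameters concrete, here is a hedged Python sketch of one common textbook form of the four-parameter curve, in which the logistic curve is rescaled to lie between a lower and an upper asymptote. The function name, the argument names, and the sample values are hypothetical, and this section of the paper does not show its own parameterization, so treat the formula as an assumption rather than as the paper's definition.

```python
import math


def four_parameter_probability(ability, difficulty, discrimination,
                               guessing=0.0, ceiling=1.0):
    """Correct-response probability with lower and upper asymptotes."""
    logistic = 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))
    # guessing is the floor of the curve; ceiling is its upper limit.
    return guessing + (ceiling - guessing) * logistic


# Hypothetical multiple-choice item with a guessing floor of 0.25.
three_pl = four_parameter_probability(-6.0, 0.0, 1.0, guessing=0.25)
# With the default asymptotes the function reduces to the two-parameter model.
two_pl = four_parameter_probability(0.0, 0.0, 1.0)
print(f"3PL at low ability: {three_pl:.3f}; 2PL at ability 0: {two_pl:.3f}")
```

Leaving `ceiling` at 1 gives a three-parameter curve, so a single function covers the two-, three-, and four-parameter cases in this sketch.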

