THE USE OF PROPENSITY SCORE MATCHING IN THE …

Working Paper Number 4

THE USE OF PROPENSITY SCORE MATCHING IN THE EVALUATION OF ACTIVE LABOUR MARKET POLICIES

A study carried out on behalf of the Department for Work and Pensions

By Alex Bryson, Richard Dorsett and Susan Purdon
Policy Studies Institute and National Centre for Social Research

Crown copyright 2002. Published with permission of the Department for Work and Pensions on behalf of the Controller of Her Majesty's Stationery Office.

The text in this report (excluding the Royal Arms and Departmental logos) may be reproduced free of charge in any format or medium provided that it is reproduced accurately and not used in a misleading context. The material must be acknowledged as Crown copyright and the title of the report specified. The DWP would appreciate receiving copies of any publication that includes material taken from this report.

Queries relating to the content of this report and copies of publications that include material from this report should be sent to: Paul Noakes, Social Research Branch, Room 4-26 Adelphi, 1-11 John Adam Street, London WC2N 6HT

For information about Crown copyright you should visit the Her Majesty's Stationery Office (HMSO) website.

Published 2002
ISBN 1 84388 043 1
ISSN 1476 3583

Acknowledgements

We would like to thank Liz Rayner, Mike Daly and other analysts in the Department for Work and Pensions for their advice, assistance and support in writing this paper. We are also particularly grateful to Professor Jeffrey Smith at the University of Maryland for extensive and insightful comments on previous versions of this paper.

The authors

Alex Bryson is a Principal Research Fellow at the Policy Studies Institute.

He has recently been involved in the evaluations of the New Deal for Young People and ONE and is currently evaluating the impact of tax credits on …

Richard Dorsett is also a Principal Research Fellow at the Policy Studies Institute. He has recently been involved in evaluations of the New Deal for Young People, Joint Claims for JSA and New Deal for Partners, and is currently evaluating Work-Based Learning for …

Susan Purdon is the Director of the Survey Methods Centre at the National Centre for Social Research. She is currently involved in the evaluation of the National New Deal for Lone Parents and feasibility work for the evaluation of a job retention and rehabilitation …

Acronyms used in this paper

All acronyms are explained in the text.

ATE: average treatment effect
CAB: Citizens Advice Bureaux
CIA: conditional independence assumption
CSR: common support requirement
DiD: difference-in-differences
EGW: extended Gateway
EMP: employment subsidy under the New Deal for Young People
ETF: Environment Task Force
FTET: full-time education and training option under the New Deal for Young People
IV: instrumental variable
JSA: Jobseeker's Allowance
LATE: local average treatment effect
NDLP: New Deal for Lone Parents
NDLTU: New Deal for the Long Term Unemployed
NDYP: New Deal for Young People
PSM: propensity score matching
TT: treatment on the treated
VS: voluntary sector option under the New Deal for Young People

Contents

1 Introduction
2 The Evaluation Problem: what it is and how to tackle it
    Why there is a problem
    Solutions to the problem?
    The experimental …
    Non-experimental approaches
3 Data Requirements for Propensity Score Matching
4 Which Estimator Works?
    Advantages and disadvantages of PSM relative to other …
    Questions that PSM evaluations can answer and those they cannot
    Programme implementation models
5 Implementing Matching Estimators
    Estimating programme …
    Performing the …
    Assessing the performance of the …
    Considerations when using survey data
6 Practical considerations in using Propensity Score Matching
    … to rule out …
    Designing for a PSM …
    Examples of programme evaluations using …
    The evaluation of …
    The evaluation of …
    The evaluation of NDLP
7 Summary and …

1 Introduction

The purpose of labour market policy evaluation is to assess whether changes in policy, and the introduction of new programmes, influence outcomes such as employment and earnings for those subject to the policy change.

As one analyst has recently noted: "The task of evaluation research lies in devising methods to reliably estimate [the impact of policy change], so that informed decisions about programme expansion and termination can be made" (Smith, 2000: 1).

Although UK governments have always intervened in labour markets in pursuit of desirable policy objectives, the evaluation of avowedly active labour market programmes began in earnest with the Restart programme in the late 1980s. Restart required unemployed claimants to attend regular work-focused interviews with a personal adviser, and might be viewed as the precursor to the work-focused interviews which now feature in the New Deal programmes, ONE and now Jobcentre Plus. The evaluation of the scheme involved randomly assigning a small group of claimants out of the programme, and comparing their subsequent labour market experiences with those of Restart participants. Although this experimental evaluation method, known as random assignment, is generally viewed as the most reliable method for estimating programme effects, it has its drawbacks, one of which is the ethical issue in denying claimants assistance from which they might benefit.

Another drawback, namely the problems in using survey data from randomly assigned individuals when there has been substantial post-assignment attrition before the survey, resulted in the evaluators using non-experimental methods to analyse the data (White and Lakey, 1992). Since then, very few British evaluations of active labour market programmes have used random assignment; most have relied instead on non-experimental data. The adequacy of these techniques was called into question in the United States in the 1980s (for example, LaLonde, 1986). The salience of the random assignment approach in the United States allowed analysts to compare results from experimental and non-experimental techniques, using the experimental data to test the bias in non-experimental results. In general, results have not been particularly favourable to non-experimental approaches. However, the research identified circumstances in which particular non-experimental approaches perform well.

Much of the evaluation literature in the United States and Britain now focuses on the value of deploying various non-experimental approaches, and the data requirements that must be satisfied in order to countenance their usage.

One of these techniques, known as propensity score matching, is the subject of this paper. Although the technique was developed in the 1980s (Rosenbaum and Rubin, 1983) and has its roots in a conceptual framework which dates back even further (Rubin, 1974), its use in labour market policy evaluation only became established in the late 1990s. It gained particular prominence following the work of Dehejia and Wahba (1998, 1999) who, in reanalysing a sub-set of the data used in LaLonde's (1986) seminal work which had established the superiority of experimental estimators, indicated that propensity score matching performed extremely well. This work has subsequently been criticised in studies which show that propensity score matching, like other non-experimental techniques, depends critically on maintained assumptions about the nature of the process by which participants select into a programme, and on the data available to the analyst (Smith and Todd, 2000; Heckman et al., 1998; Heckman and Todd, 2000; Agodini and Dynarski, 2001). Nevertheless, the technique continues to attract attention as a useful evaluation tool in the absence of random assignment. As we shall see, the method has an intuitive appeal arising from the way it mimics random assignment through the construction of a control group post hoc. Results are readily understood, but less so the assumptions underpinning the validity of the approach and the circumstances in which those assumptions are met.

The aim of this report is to provide a largely intuitive understanding of the relevance of propensity score matching to evaluation research. As such, it can be seen to be nested within the broader paper by Purdon (2002), which provides an overview of the range of established evaluation techniques. Some overlap with this earlier report is inevitable, however, and in order to achieve a well-rounded report, the assumptions underlying the other techniques will also be presented.

This is justified since important extensions to matching include combining it with other techniques. Where appropriate, the issues discussed in this report will be illustrated using the results of recent evaluations, with a particular focus on evaluations in the UK.

Several surveys of evaluation techniques already exist. These include Heckman et al. (1999), Blundell and Costa Dias (2000), Smith (2000) and Vella (1998). These sources are all used extensively in the remainder of this report. However, with the exception of Smith (2000), they are all heavy reading for those not familiar with econometrics. The contribution of this synthesis is to translate the results into a more generally understandable report. This is very timely since the huge academic effort currently being directed to the development of evaluation methodologies, and the particular popularity of matching, means that the literature is extremely …

The format for the report is as follows.

Section Two describes the evaluation problem and the array of techniques analysts use to tackle it, including matching. Section Three identifies the data requirements for propensity score matching. Section Four, entitled "Which estimator works?", outlines the advantages and disadvantages of propensity score matching relative to other evaluation techniques. It also identifies the questions that PSM evaluations can answer and those they cannot answer. Section Five describes the stages that the analyst goes through in implementing matching estimators, explaining the difficult choices the evaluator has to make in the course of the evaluation.

Section Six gives practical guidance to commissioners of evaluation research as to the circumstances when PSM may be used, and the situations in which it may be appropriate to rule out PSM as an option. To date, few labour market policy evaluations in the UK have been specifically designed for PSM, and it is more common for PSM to be used as a method of secondary analysis.

One key exception is the evaluation of the national extension of the New Deal for Lone Parents (NDLP), where the wish to apply PSM methods was the driving force behind the evaluation design. The implications of designing for PSM are set out in Section … The design for the evaluation of NDLP is described in Section …, alongside examples from two other UK programme evaluations. Finally, Section Seven summarises key points from each …

2 The Evaluation Problem: what it is and how to tackle it

Why there is a problem

To illustrate ideas, imagine we are interested in measuring the effect of a voluntary training programme on the chances of finding work. At the individual level, we observe the labour market outcomes of those who receive the training and we observe the labour market outcomes of those who do not receive the training. To truly know the effect of the training on a participating individual, we must compare the observed outcome with the outcome that would have resulted had that person not participated in the training programme.
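
The comparison just described can be made concrete with a small sketch. It is an illustration only: the eight individuals, their outcomes and all names in the code (`Person`, `y0`, `y1`, `trained`) are hypothetical and are not taken from any evaluation discussed in this report. Each person is given two potential outcomes, one with training and one without, which is something an analyst never has in practice. The sketch then contrasts the true average effect on participants with the naive difference in observed outcomes between participants and non-participants.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Person:
    y0: int        # outcome without training (1 = finds work, 0 = does not)
    y1: int        # outcome with training
    trained: bool  # whether the person volunteered for the programme

    @property
    def observed(self) -> int:
        # Only one of the two potential outcomes is ever observed.
        return self.y1 if self.trained else self.y0


# Hypothetical population: volunteers have better job prospects to begin with.
people = [
    Person(1, 1, True), Person(1, 1, True),
    Person(0, 1, True), Person(1, 1, True),
    Person(0, 0, False), Person(0, 1, False),
    Person(0, 0, False), Person(1, 1, False),
]

participants = [p for p in people if p.trained]
non_participants = [p for p in people if not p.trained]

# True average effect on participants (needs the unobservable y0).
effect_on_treated = mean(p.y1 - p.y0 for p in participants)

# What can actually be computed from observed outcomes alone.
naive_difference = (mean(p.observed for p in participants)
                    - mean(p.observed for p in non_participants))

# Gap in no-training outcomes between the two groups.
selection_bias = (mean(p.y0 for p in participants)
                  - mean(p.y0 for p in non_participants))

print(effect_on_treated, naive_difference, selection_bias)  # 0.25 0.75 0.5
```

In this made-up population the naive comparison (0.75) overstates the true effect on participants (0.25) by exactly the difference in the two groups' no-training outcomes (0.5). Because participation is voluntary, those who take part may differ systematically from those who do not, so the outcomes of non-participants are not automatically a valid stand-in for the missing counterfactual.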
