
STAT 8200 — Design and Analysis of Experiments for ...



STAT 8200 Design and Analysis of Experiments for Research Workers, Lecture Notes

Basics of Experimental Design

Terminology

Response (Outcome, Dependent) Variable: (y) The variable whose distribution is of interest.
• Could be quantitative (size, weight, etc.) or qualitative (pass/fail, quality rated on 5 point scale). I'll assume the former (easier to analyze).
• Typically interested in the mean of y and how it depends on other variables.
• E.g., differences in mean response between varieties.

Explanatory (Predictor, Independent) Variables: (x's) Variables that explain (predict) variability in the response variable.
• E.g., variety, rainfall, predation, soil type, subject age.

Factor: A set of related treatments or classifications used as an explanatory variable.
• Often qualitative (e.g., variety), but can be quantitative (0, 100, or 200 units fertilizer).

Treatment or Treatment Combination: A particular combination of the levels of all of the treatment factors.

Nuisance Variables: Other variables that influence the response variable but are not of interest.

• E.g., rainfall, level of predation, subject age.
• Systematic bias occurs when treatments differ with respect to a nuisance variable. If so, it becomes a confounding variable.

Experimental Units: The units or objects that are independently assigned to a specific experimental condition.
• E.g., a plot assigned to receive a particular variety, a subject assigned a particular treatment.

Measurement Units: The units or objects on which distinct measurements of the response are made.
• Not necessarily the same as the experimental units. The distinction is very important!
• E.g., a plant or fruit within a plot.

Experimental Error: Variation among experimental units that have received the same experimental conditions.
• The standard against which differences between treatments are to be judged.
• Treatment differences must be large relative to the variability we would expect in the absence of a treatment effect (experimental error) to infer that the difference is real (statistical significance).
• If two varieties have mean yields that differ by d units, there is no way to judge how large d is unless we can estimate the experimental error (requires replication).

Elements of Experimental Design:

(1) Randomization: Allocate the experimental units to treatments at random (by chance).
• Makes the treatment groups probabilistically alike on all nuisance factors, thereby avoiding systematic bias.

Example 1: Two varieties (A, B) are to be compared with respect to crop yield. Suppose a crop row consisting of 100 plants is divided into plots of 10 plants. The two varieties are assigned to plots systematically, with variety A in every other plot:

A B A B A B A B A B

Suppose there is a fertility gradient along this row. Then even if the varieties are equivalent, we will observe better yield in one of the varieties.

Example 2: A greenhouse has 12 benches on which plants of two varieties are to be grown. Suppose we assign as follows:

A A B B
A A B B
A A B B

Whereas if we use a Completely Randomized Design (CRD):

B A A A
B A B A
B A B B

• Randomization will tend to neutralize all nuisance variables.
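A completely randomized assignment like the one in Example 2 can be produced in a few lines of Python. This is a minimal sketch and not part of the original notes: the function name `completely_randomized_design`, the bench numbering 1 to 12, and the seed value are illustrative assumptions.

```python
import random


def completely_randomized_design(units, treatments, seed=None):
    """Assign treatments to units at random with equal replication.

    Hypothetical helper for illustration; requires the number of units
    to be a multiple of the number of treatments.
    """
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    reps = len(units) // len(treatments)
    # One label per unit: each treatment repeated `reps` times.
    labels = [t for t in treatments for _ in range(reps)]
    rng = random.Random(seed)
    rng.shuffle(labels)  # every ordering of the labels is equally likely
    return dict(zip(units, labels))


benches = list(range(1, 13))  # 12 greenhouse benches (assumed numbering)
layout = completely_randomized_design(benches, ["A", "B"], seed=8200)
print("".join(layout[b] for b in benches))
```

Fixing the seed only makes the layout reproducible for record keeping; any seed yields a valid randomization with six benches per variety.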

• Randomization also induces statistical independence among experimental units.

(2) Replication: Repeating the experimental run (one entire set of experimental conditions) using additional similar, independent experimental units.
• Allows estimation of the experimental error, without which treatment differences CANNOT be inferred.
• Increases the precision/power of the experiment.

BEWARE OF PSEUDO-REPLICATION!

Example 3: Suppose we randomize plots in a crop row to two treatments as so:

A B B B A B A A A B

And we measure the size of all 10 plants in each plot.
• We have 50 measurements per treatment.
• We have 5 plots per treatment.

Q: What's the sample size per treatment that determines our power to statistically distinguish varieties A and B?
A: 5 per treatment, not 50. The experimental unit here is the plot, not the plant.

Notes:
• Taking multiple measurements per experimental unit is called sub-sampling or pseudo-replication.
• It is a useful means to reduce measurement error in characterizing the response at the experimental unit level.
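The distinction between measurements and experimental units in Example 3 can be made concrete with a short Python sketch. The plant sizes below are simulated, made-up numbers used only for illustration, and the helper name `plot_level_data` is an assumption. The sketch reduces the 10 plant measurements in each plot to one plot mean, so each treatment ends up with 5 values (one per plot) rather than 50.

```python
import random
from statistics import mean

layout = "ABBBABAAAB"  # plot-level assignment from Example 3

rng = random.Random(0)
# Simulated plant sizes, 10 plants per plot (made-up values, illustration only).
plants = {plot: [rng.gauss(50, 5) for _ in range(10)] for plot in range(len(layout))}


def plot_level_data(layout, plants):
    """Return treatment -> list of plot means (one value per experimental unit)."""
    by_treatment = {}
    for plot, trt in enumerate(layout):
        by_treatment.setdefault(trt, []).append(mean(plants[plot]))
    return by_treatment


data = plot_level_data(layout, plants)
for trt, plot_means in sorted(data.items()):
    # The sample size that matters is the number of plots per treatment.
    print(trt, "n =", len(plot_means))
```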

• If estimating the measurement error is not of interest, the easiest analysis is to average the subsamples in each experimental unit and analyze these averages as "the data".
• How to allocate resources between experimental units and measurement units is complicated, but there is generally more bang for adding experimental units than for adding measurement units. There is little gain in going beyond 2 or 3 measurement units per experimental unit.

Replication: What determines the number of replicates?
• Available resources. Limitations on cost, labor, time, experimental material available, etc.
• Sources of variability in the system and their magnitudes.
• Size of the difference to be detected.
• Required significance level (α = 0.05?).
• Number of treatments.

Effect of the number of replicates per treatment on the smallest difference in treatment means that can be detected at α = .05 in a simple one-way design:

[Figure: difference between means necessary for significance at level .05 (vertical axis: difference in means) plotted against n per treatment (horizontal axis).]

Replication: For a given desired probability of detecting a treatment effect of interest, and under specific assumptions regarding the factors listed above, one can compute the sample size necessary.
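As a rough illustration of such a sample size calculation, the sketch below uses the standard normal-approximation formula for comparing two treatment means with equal replication, n = 2σ²(z₁₋α/₂ + z_power)² / δ². This formula and the function name `n_per_treatment` are not taken from the notes; σ (the standard deviation among experimental units) and δ (the difference to be detected) are inputs the experimenter must supply. The approximation tends to give slightly smaller n than t-based calculations when n is small.

```python
from math import ceil
from statistics import NormalDist


def n_per_treatment(delta, sigma, alpha=0.05, power=0.80):
    """Approximate replicates per treatment for a two-sided, two-group comparison.

    delta: difference in treatment means to be detected (assumed input)
    sigma: standard deviation among experimental units (assumed input)
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * (sigma * (z_alpha + z_power) / delta) ** 2)


# Smaller differences, relative to sigma, require more replication.
for delta in (0.5, 1.0, 2.0):
    print(delta, n_per_treatment(delta, sigma=1.0))
```

With δ equal to one standard deviation, the approximation gives 16 experimental units per treatment at α = 0.05 and 80% power; halving δ roughly quadruples the requirement.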

• These calculations are most easily done with the help of sample size/power analysis software (e.g., SAS Analyst, NQuery Advisor, online calculators like Russ Lenth's: rlenth/Power/).
• They also require detailed assumptions, which are best made on the basis of past or preliminary data. This is the hard part.

(3) Blocking: To detect treatment differences, we'd like the experimental units to be as similar (homogeneous) as possible except for the treatment received.
• We must balance this against practical considerations and the goal of generalizability. For these reasons, the experimental units must be somewhat heterogeneous.
• The idea of blocking is to divide experimental units into homogeneous subgroups (or blocks) within which all treatments are observed. Then treatment comparisons can be made between similar units in the same block.
• Reduces experimental error and increases the precision (power, sensitivity) of an experiment.

Example: In our greenhouse example, our completely randomized assignment of varieties A, B happened to assign variety B to all three benches on the west end of the greenhouse. Suppose the heater is located on the west end so that there is a temperature gradient from west to east. This identifiable source of heterogeneity among the experimental units is a natural choice of blocking variable.

Solution: re-arrange the benches as so and assign both treatments randomly within each column:

A B B A B B
B A A B A A

• The design above is called a Randomized Complete Block Design (RCBD).

Blocking is often done within a given site. E.g., a large field is blocked to control for potential heterogeneity within the site.
• This is useful, but only slightly, as it results in only small gains relative to a CRD.
• However, if a known source of variability exists where there is likely to be a gradient in the characteristics of the plots, then blocking within a site is definitely worthwhile.
• In the presence of a gradient, plots should be oriented as follows*: [Figure: orientation of plots relative to the gradient.] And, if blocking is done: [Figure: orientation of blocks relative to the gradient.]
• Placement of blocks should take into account physical features of the site: [Figure: placement of blocks around physical features of the site.]

* Plots from Petersen (1994).

Blocking: There are a number of factors that often create heterogeneity in the experimental units that can, and typically should, form the basis of blocks:
• Region.

• Time (season, year, day).
• Facility (e.g., if multiple greenhouses are to be used, if multiple labs take measurements, or if patients are recruited from multiple clinics).
• Personnel to conduct the experiment.

Block effects are typically assumed not to interact with treatment effects.
• E.g., while we allow that a region (block) effect might raise the yield for all varieties, we assume that differences in mean yield between varieties are the same in each region.
• If each treatment occurs just once in each block, this assumption MUST be made in order to analyze the data, and the analysis will likely be wrong if this assumption is violated.
• If treatment effects are expected to differ across blocks, use at least 2 reps of every treatment within each block, and consider whether the blocking factor is more appropriately considered to be a treatment factor.

(4) Use of Factorial Treatment Structures: If more than one treatment factor is to be studied, it is generally better to study them in combination, rather than separately.

Factorial or crossed treatment factors:

                 Fertilized? (B)
Variety (A)      No (B1)      Yes (B2)
A1               A1,B1        A1,B2
A2               A2,B1        A2,B2

• Increases generalizability, efficiency.
• Allows interactions to be detected.

Factorial Treatment Structure and Interactions:
• An interaction occurs when the effect of one treatment factor differs depending on the level at which the other treatment factor(s) are set.

[Figure: four panels plotting response against B (unfertilized, fertilized) for Variety A1 and Variety A2. Panel titles: no interaction between A and B; quantitative synergistic interaction; quantitative inhibitory interaction; qualitative interaction between A and B.]

(5) Balance: A balanced experiment is one in which the replication is the same under each level of each experimental factor.
• E.g., an experiment with three treatment groups each with 10 experimental units is balanced; an experiment with three treatment groups of sizes 2, 18, and 10 is unbalanced.
• Sometimes, the response in a certain treatment or comparisons with a certain treatment are of particular interest.

If so, then extra replication in that treatment may be justified. Otherwise, it is desirable to achieve as much balance as possible subject to practical constraints.
• Increases power of the experiment.
• Simplifies statistical analysis.

(6) Limiting Scope/Use of Sequential Experimentation: Large experiments with many factors and many levels of the factors are hard to perform, hard to analyze, and hard to interpret.
• If the effects of several factors are of interest, it is best to do several small experiments and build up to an understanding of the entire system.
• Can either use several factors each at a small number (e.g., 2) of levels, or can do sequential experiments each examining a subset of factors (2 or 3).

(7) Adjustment for Covariates: Nuisance variables other than the blocking factors that affect the response are often measured and compensated for in the analysis.
• E.g., measure and adjust for rainfall differences in the statistical analysis (can't block by rainfall).
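One common way to carry out such an adjustment is an analysis of covariance with a single slope shared by all treatments. The Python sketch below is an illustration under that common-slope assumption, not the method prescribed in the notes. The function name `adjusted_means` and the toy rainfall/yield numbers are made up: the data were constructed so that yield rises by exactly 2 units per unit of rainfall and variety B truly exceeds variety A by 3 units.

```python
from statistics import mean


def adjusted_means(groups):
    """Covariate-adjusted treatment means under a common within-group slope.

    groups maps treatment -> list of (covariate, response) pairs.
    Returns (pooled_slope, {treatment: adjusted_mean}).
    """
    grand_x = mean(x for pairs in groups.values() for x, _ in pairs)
    sxy = sxx = 0.0
    centers = {}
    for trt, pairs in groups.items():
        xbar = mean(x for x, _ in pairs)
        ybar = mean(y for _, y in pairs)
        centers[trt] = (xbar, ybar)
        # Pool the within-treatment cross-products and sums of squares.
        sxy += sum((x - xbar) * (y - ybar) for x, y in pairs)
        sxx += sum((x - xbar) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    # Shift each treatment mean to the overall average covariate value.
    adjusted = {t: ybar - slope * (xbar - grand_x) for t, (xbar, ybar) in centers.items()}
    return slope, adjusted


# Made-up data: (rainfall, yield). Variety B's plots happened to get more rain.
toy = {
    "A": [(1.0, 12.0), (2.0, 14.0), (3.0, 16.0)],
    "B": [(3.0, 19.0), (4.0, 21.0), (5.0, 23.0)],
}
slope, adj = adjusted_means(toy)
print("raw difference:", mean(y for _, y in toy["B"]) - mean(y for _, y in toy["A"]))
print("adjusted difference:", adj["B"] - adj["A"])
```

In this constructed example the raw difference in mean yield is 7, but 4 of those units come from the extra rainfall on variety B's plots; the adjusted difference recovers the built-in treatment effect of 3.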

