Chapter 2 Graphical methods for presenting data

Introduction

We have looked at ways of collecting data and then collating them into tables. Frequency tables are useful methods of presenting data; they do, however, have their limitations. With large amounts of data, graphical presentation methods are often clearer to understand. Here, we look at methods for producing graphical representations of data of the types we have seen.

Stem and leaf plots

Stem and leaf plots are a quick and easy way of representing data graphically. They can be used with both discrete and continuous data. The method for creating a stem and leaf plot is similar to that for creating a grouped frequency table. The first stage, as with grouped frequency tables, is to decide on a reasonable number of intervals which span the range of the data. The interval widths for a stem and leaf plot must be equal. Because of the way the plot works, it is best to use sensible values for the interval width, such as 5, 10, 100 or 1000; if a dataset consists of many small values, this interval width could also be 1 or even smaller. Once we have decided on our intervals we can construct the stem and leaf plot.

This is perhaps best described by the following data: 11, 12, 9, 15, 21, 25, 19, 8. The first step is to decide on interval widths; one obvious choice would be to go up in 10s. This would give a stem unit of 10 and a leaf unit of 1. The stem and leaf plot is constructed as follows:

    Stem | Leaf
       0 | 8 9
       1 | 1 2 5 9
       2 | 1 5

    n = 8, stem unit = 10, leaf unit = 1

You can clearly see where the data have been put. The stem units are to the left of the vertical line, while the leaves are to the right. So, for example, our first observation, 11, is made up of a stem unit of one 10 and a leaf unit of one 1. It is important to give an equal amount of space to each leaf value; by doing so, we can get a clear picture of any patterns in the data (it's almost like a bar chart on its side, but it still shows the raw observations!). Before producing a stem and leaf plot, it will probably help to first write down the data in ascending numerical order.

Example 1: Percentage returns on a share

As you might imagine, the interval width does not have to be 10.

The following numbers show the percentage returns on an ordinary share for 23 consecutive months: [...]

Here, the largest value is [...] and the smallest [...], and we have lots of decimal values in between. Thus, it seems sensible here to have a stem unit of 1 and a leaf unit of 0.1. A stem and leaf diagram for this set of returns then might look like:

    Stem | Leaf
      -2 | 1 3
      -1 | 2 3 2
      -0 | 2 5 6 1 4
       0 | 1 9 5 1 3 5 9
       1 | 0 5 2 7
       2 | 4 4

    n = 23, stem unit = 1, leaf unit = 0.1

Example 2: Unemployment rates in the [...]

That should all seem fine so far. So what can go wrong? Consider the following data, which are the percentage unemployment rates for 10 states:

    17 18 15 14 12 19 20 21 24 15

If you were to choose 10 as the interval width (that is, go up in 10s), the stem and leaf plot would look like:

    Stem | Leaf
       1 | 2 4 5 5 7 8 9
       2 | 0 1 4

    n = 10, stem unit = 10, leaf unit = 1

The interval width is too large, resulting in only two intervals for our data.

With so few intervals it is difficult to identify any patterns in the data. We can get a better idea about what is going on if we choose a smaller interval width, say 5. Doing so gives the following stem and leaf plot:

    Stem | Leaf
       1 | 2 4
       1 | 5 5 7 8 9
       2 | 0 1 4

    n = 10, stem unit = 10, leaf unit = 1

Note that there are now two 1s in the stem: one for observations between 10 and 14 (inclusive) and another for observations between 15 and 19 (inclusive). Thus, the stem unit is still 10, but the interval width is now only 5. Changing the interval width like this produces a plot which starts to show some sort of pattern in the data; indeed, this is the intention of such graphical presentations. We could, however, go to the other extreme and have too many intervals. If this were the case, any pattern would again be lost because lots of intervals would contain no observations at all. So choose your interval width carefully!
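The effect of the interval width can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the course: the function name `stem_and_leaf` and its `interval_width` parameter are hypothetical, and the sketch assumes non-negative whole numbers with a stem unit of 10 and a leaf unit of 1. It uses the unemployment data from Example 2.

```python
from collections import defaultdict


def stem_and_leaf(data, interval_width=10):
    """Return the rows of a stem and leaf plot as a list of strings.

    Assumes non-negative whole numbers, a stem unit of 10, a leaf unit
    of 1, and an interval width that divides 10 exactly (10, 5 or 2).
    """
    rows = defaultdict(list)
    for value in sorted(data):
        # Lower end of the interval that this observation falls into.
        interval_start = (value // interval_width) * interval_width
        rows[interval_start].append(value % 10)  # the leaf is the units digit

    lines = []
    start, last = min(rows), max(rows)
    while start <= last:
        # Empty intervals are kept so that gaps in the data stay visible.
        leaves = " ".join(str(leaf) for leaf in rows.get(start, []))
        lines.append(f"{start // 10} | {leaves}".rstrip())
        start += interval_width
    return lines


# Percentage unemployment rates from Example 2.
unemployment = [17, 18, 15, 14, 12, 19, 20, 21, 24, 15]
for width in (10, 5):
    print(f"interval width = {width}")
    print("\n".join(stem_and_leaf(unemployment, width)))
```

With an interval width of 10 the sketch produces two rows; with a width of 5 the stem 1 appears twice, once for 10 to 14 and once for 15 to 19, matching the plots above.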

Example 3: Call centre data

Let's work through the following example. The observations in the table below are the recorded times it takes to get through to an operator at a telephone call centre (in seconds).

    54 56 50 67 55 38 49 45 39 50
    45 51 47 53 29 42 44 61 51 50
    30 39 65 54 44 54 72 65 58 62

    Stem | Leaf

    n = ____, stem unit = ____, leaf unit = ____

Example 4: Production line data

If there is more than one significant figure in the data, the extra digits are cut (or truncated), not rounded, to the nearest value; that is to say, 2.97 would become 2.9, not 3.0. To illustrate this, consider the following data on lengths of items on a production line (in cm): [...]

The stem and leaf plot for this is as follows:

    Stem | Leaf
       1 | 9
       2 | 0 2 3
       2 | 5 6 9
       3 | 0 4
       3 | 8

    n = 10, stem unit = 1 cm, leaf unit = 0.1 cm

Here the interval width is 0.5 cm. This allows for greater clarity in the plot. Why do you think we cut the extra digits?

Example 5: Student marks

The stem and leaf plot below represents the marks on a test for 50 students.

    Stem | Leaf
   [...] | [...] 7 7 9 9
       2 | 0 0 1 1 1 2 2 4 4 4 5 7 7 8 8 8
       3 | 2 3 3 3 4 5 5 6 7 7 8 8 9 9 9
       4 | 0 0 1 2 2 3 3 4 4 5
       5 | 0 0 0

    n = 50, stem unit = 10, leaf unit = 1

It's easy to see some of the advantages of graphically presenting data.

For example, here you can clearly see that the data are centred around a value in the low 30s and fall away on either side. From stem and leaf plots we can quickly and easily tell if the distribution of the data is symmetric or asymmetric. We can see whether there are any outliers, that is, observations which are either much larger or much smaller than is typical of the data. We could perhaps even tell whether the data are multi-modal, that is to say, whether there are two or more peaks on the graph with a gap between them. If so, this could suggest that the sample contains data from two or more populations.

Using Minitab

With the small data sets we have seen so far, it is obviously relatively easy to create stem and leaf plots by hand. With larger data sets this would be more problematic and certainly more time consuming. Fortunately, there are computer packages that will create these plots for us. Minitab is one such package, and can be found on most university [...]. Guidance on using Minitab will be provided in the computer practical sessions in week [...].

Bar charts

Bar charts are a commonly used and clear way of presenting categorical data or any ungrouped discrete frequency data. For example, recall the example on students' modes of transport:

    Student  Mode     Student  Mode     Student  Mode
    1        Car      11       Walk     21       Walk
    2        Walk     12       Walk     22       Metro
    3        Car      13       Metro    23       Car
    4        Walk     14       Bus      24       Car
    5        Bus      15       Train    25       Car
    6        Metro    16       Bike     26       Bus
    7        Car      17       Bus      27       Car
    8        Bike     18       Bike     28       Walk
    9        Walk     19       Bike     29       Car
    10       Car      20       Metro    30       Car

The first logical step is to put these into a frequency table, giving

    Mode     Frequency
    Car      10
    Walk     7
    Bike     4
    Bus      4
    Metro    4
    Train    1
    Total    30
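As a quick check, the tally can be reproduced with Python's standard `collections.Counter`. This is a minimal sketch; the variable names `modes` and `frequency` are illustrative only, and the list is typed in from the table of 30 students above.

```python
from collections import Counter

# Mode of transport for students 1 to 30, in the order given in the table.
modes = [
    "Car", "Walk", "Car", "Walk", "Bus", "Metro", "Car", "Bike", "Walk", "Car",
    "Walk", "Walk", "Metro", "Bus", "Train", "Bike", "Bus", "Bike", "Bike", "Metro",
    "Walk", "Metro", "Car", "Car", "Car", "Bus", "Car", "Walk", "Car", "Car",
]

frequency = Counter(modes)  # maps each mode to the number of students using it

for mode, count in frequency.most_common():
    print(f"{mode:<6} {count}")
print(f"{'Total':<6} {sum(frequency.values())}")
```

The counts agree with the frequency table; the order in which modes with equal counts are listed may differ.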

We can then present this information as a bar chart, by following the five step process shown below:

1. First decide what goes on each axis of the chart. By convention the variable being measured goes on the horizontal (x-axis) and the frequency goes on the vertical (y-axis).
2. Next decide on a numeric scale for the frequency axis. This axis represents the frequency in each category by its height. It must start at zero and include the largest frequency. It is common to extend the axis slightly above the largest value so you are not drawing to the edge of the chart.
3. Having decided on a range for the frequency axis we need to decide on a suitable number scale to label this axis. This should have sensible values, for example, 0, 1, 2, ..., or 0, 10, 20, ..., or other such values as make sense given the data.
4. Draw the axes and label them.
5. Draw a bar for each category. When drawing the bars it is essential to ensure the following:
   - the width of each bar is the same;
   - the bars are separated from each other by equally sized gaps.

This gives the following bar chart:

[Bar chart: modes of transport (Car, Walk, Bike, Bus, Metro, Train) on the horizontal axis, Frequency on the vertical axis]

This bar chart clearly shows that the most popular mode of transport is the car and that the metro, bus and cycling are all equally popular (in our small sample).
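The steps above can be sketched in Python using only text output. This is a rough illustration, not a prescribed method: the helper names `frequency_axis` and `text_bar_chart` and the tick spacing `step=2` are assumptions made here, and the bars are drawn sideways (one text row per category) rather than vertically as in the convention described above.

```python
def frequency_axis(frequencies, step=2):
    """Tick values for the frequency axis: start at zero, use a sensible
    step, and extend just beyond the largest frequency."""
    largest = max(frequencies.values())
    top = (largest // step + 1) * step  # first multiple of step above the largest value
    return list(range(0, top + 1, step))


def text_bar_chart(frequencies, symbol="#"):
    """One row per category; every bar is built from the same symbol so
    that equal frequencies give bars of equal length."""
    label_width = max(len(label) for label in frequencies)
    return [
        f"{label:<{label_width}} | {symbol * count}"
        for label, count in frequencies.items()
    ]


# Frequencies from the modes of transport table.
mode_frequencies = {"Car": 10, "Walk": 7, "Bike": 4, "Bus": 4, "Metro": 4, "Train": 1}

print("axis ticks:", frequency_axis(mode_frequencies))
print("\n".join(text_bar_chart(mode_frequencies)))
```

Because each category occupies exactly one row, the equal-width and equal-gap requirements are met automatically in this text form; a drawn chart needs them checked by hand.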

Bar charts provide a simple method of quickly spotting simple patterns of popularity within a discrete data set.

Histograms

Bar charts have their limitations; for example, they cannot be used to present continuous data. When dealing with continuous random variables a different kind of graph is required. This is called a histogram. At first sight these look similar to bar charts. There are, however, two critical differences:

- the horizontal (x-axis) is a continuous scale. As a result of this there are no gaps between the bars (unless there are no observations within a class interval);
- the height of the rectangle is only proportional to the frequency if the class intervals are all equal. With histograms it is the area of the rectangle that is proportional to the frequency.

Here we will only consider histograms with equal class intervals. Those with uneven class intervals require more care.

Producing a histogram is much like producing a bar chart and in many respects can be considered to be the next stage after producing a grouped frequency table.

In reality, it is often best to produce a frequency table first, which collects all the data together in an ordered format. Once we have the frequency table, the process is very similar to drawing a bar chart:

1. Find the maximum frequency and draw the vertical (y-axis) from zero to this value, including a sensible numeric scale.
2. The range of the horizontal (x-axis) needs to include not only the full range of observations but also the full range of the class intervals from the frequency table.
3. Draw a bar for each group in your frequency table. These should be the same width and touch each other (unless there are no data in one particular class).

The frequency table for the data on service times for a telephone call centre (Section [...]) was

    Service time          Frequency
    175 ≤ time < 180      1
    180 ≤ time < 185      3
    185 ≤ time < 190      3
    190 ≤ time < 195      6
    195 ≤ time < 200      10
    200 ≤ time < 205      12
    205 ≤ time < 210      8
    210 ≤ time < 215      3
    215 ≤ time < 220      3
    220 ≤ time < 225      1
    Total                 50

The histogram for these data is:

[Histogram: Time (s) on the horizontal axis, from 175 to 225 in steps of 5; Frequency on the vertical axis]

Histograms are useful tools in data analysis.

They are easy to produce in Minitab for large data sets and provide a clear visual representation of the data. Using histograms, it is easy to spot the modal or most popular class in the data, the one with the highest peak. It is also easy to spot simple patterns in the data. Is the frequency distribution symmetric, like the histogram produced above, or is it skewed to one side like the left-hand histogram in the following graphic?

Histograms also allow us to make early judgements as to whether all our data come from the same population. Consider the right-hand histogram in the graphic above. It clearly contains two separate modes (peaks), each of which has its own symmetric pattern of data. This clearly suggests that the data come from two separate populations, one centred around 85 with a narrow spread and one centred around 100 with a wider spread. In real situations it is unlikely that the difference would be as dramatic, unless you had a poor sampling method.
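Finding the modal class from a grouped frequency table is easy to sketch in Python. The example below is illustrative only: it uses the service time table for the call centre data, and the names `modal_class` and `text_histogram` are hypothetical helpers introduced here. It assumes equal class intervals, so the length of each bar can stand in for its area.

```python
# Lower class boundaries and frequencies from the service time table;
# each class runs from its lower bound (inclusive) to lower bound + 5 (exclusive).
lower_bounds = [175, 180, 185, 190, 195, 200, 205, 210, 215, 220]
frequencies = [1, 3, 3, 6, 10, 12, 8, 3, 3, 1]
CLASS_WIDTH = 5


def modal_class(lower_bounds, frequencies, class_width):
    """Return (lower, upper) for the class with the highest frequency."""
    peak = max(range(len(frequencies)), key=lambda i: frequencies[i])
    return lower_bounds[peak], lower_bounds[peak] + class_width


def text_histogram(lower_bounds, frequencies, class_width):
    """One row per class, with no gaps between classes. With equal class
    widths the bar length is proportional to the frequency."""
    return [
        f"{low}-{low + class_width} | {'#' * count}"
        for low, count in zip(lower_bounds, frequencies)
    ]


print("modal class:", modal_class(lower_bounds, frequencies, CLASS_WIDTH))
print("\n".join(text_histogram(lower_bounds, frequencies, CLASS_WIDTH)))
```

For these data the modal class is 200 ≤ time < 205, and the rows rise to that class and fall away on either side, which is the roughly symmetric shape described above.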

