OPTIONAL MINITAB PROJECT/HOMEWORK

I. Probability calculations using MINITAB.

Background and Button-Pushing

Recall that the cumulative distribution function, or cdf, for a random variable X is
defined to be $F(x) = P(X \le x)$. There are two common sorts of calculations one
makes: given a value x one finds the corresponding value of F; or, given a particular
value of F, that is, given a particular probability, one finds the corresponding value of x.
(In terms of functions, in the latter case one is evaluating the inverse cumulative distribution
function.)

For the case of a standard normal random variable, there is a MINITAB function that provides
an alternative to using tables to do calculations. To get to this function one uses the tab
CALC (and in the two successive drop-down boxes) PROBABILITY DISTRIBUTIONS, and NORMAL.
(A mean of zero and a standard deviation of one are the default settings for NORMAL.)
Then one chooses ‘cumulative probability’ or ‘inverse cumulative probability’ as appropriate.
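For readers who want a cross-check outside MINITAB, the two kinds of calculation can be sketched in Python with the standard library's `statistics.NormalDist`. This is only an illustration of the cdf and inverse cdf ideas, not a description of what MINITAB does internally; the function names are invented for this sketch.

```python
from statistics import NormalDist

# Standard normal: mean 0 and standard deviation 1, matching the default
# settings described for NORMAL.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def cumulative_probability(x):
    """Return F(x) = P(X <= x) for a standard normal X."""
    return standard_normal.cdf(x)


def inverse_cumulative_probability(p):
    """Return the x with P(X <= x) = p, for 0 < p < 1."""
    return standard_normal.inv_cdf(p)


if __name__ == "__main__":
    # Given a value x, find the probability; given a probability, find x.
    print(cumulative_probability(1.5))
    print(inverse_cumulative_probability(0.80))
```

Applying one function and then the other should return the starting value, which is a quick way to check that the two directions have not been mixed up.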

Required Calculations

Head column one in the MINITAB worksheet PROB; head column three in the MINITAB
worksheet ZVALUE. In column one enter the values 0.02, 0.20, 0.40, 0.50, 0.60, 0.80
and 0.98. In column three enter the values -3.5, -2.5, -1.5, 0.5, 1.5, 2.5 and 3.5.
Use MINITAB to find the ‘z-values’ corresponding to the probabilities in column one,
and store the results in column two. Use MINITAB to find the probabilities
corresponding to the ‘z-values’ in column three, and store the results in column four.
(Friendly advice: I’d use a good old-fashioned table to check my work if I had to do
this exercise!)

The T Random Variable

One may do the same sorts of calculations for a T random variable as for a standard
normal random variable. One uses the tabs CALC (and in the two successive
drop-down boxes) PROBABILITY DISTRIBUTIONS, and t. Notice that, of
course, in this case one must specify the number of degrees of freedom.

This is a handy function: typical t-tables don’t tabulate many values, which makes
some calculations using t-tables awkward.
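To see what sits behind such a function, here is a rough pure-Python sketch of the t cdf and its inverse. It integrates the t density numerically with Simpson's rule and inverts the cdf by bisection. These choices are made only for this illustration and are not claimed to be MINITAB's method. All function names are invented here, and the sketch is meant for modest degrees of freedom (the gamma function overflows for very large ones). In practice one would use a library routine such as `scipy.stats.t`.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x): 0.5 (by symmetry) plus Simpson's rule on [0, x]."""
    if x == 0:
        return 0.5
    h = x / steps  # negative when x < 0, so the integral is negative too
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_inverse_cdf(p, df, lo=-100.0, hi=100.0):
    """The x with P(T <= x) = p, by bisection (the cdf is increasing)."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    # The degrees of freedom must be supplied, as in the MINITAB dialog.
    print(t_cdf(1.5, 9))
    print(t_inverse_cdf(0.98, 9))
```

Unlike a printed t-table, nothing restricts the probability or the degrees of freedom to a handful of tabulated values.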

II. Calculating a Confidence Interval

Background and Review

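Recall that if X is a normal random variable with mean $\mu$ and known standard
deviation $\sigma$, and if $\bar{x}$ is the sample mean based on a simple random sample
of size n, then a two-sided confidence interval for $\mu$ of size $1 - \alpha$ is:

$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

where $z_{\alpha/2}$ is the standard normal value with upper-tail probability $\alpha/2$.

Recall what this means: the random interval

$$\left(\bar{X} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \ \bar{X} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right)$$

covers $\mu$ with probability $1 - \alpha$.

Recall also that when one says that the standard deviation is ‘known’ this usually
just means that one has a large sample size. For the same set of hypotheses, but
for $\sigma$ unknown, i.e. when one has a small sample size, so one can’t safely assume
that s is close to $\sigma$, then a two-sided confidence interval for $\mu$ of size $1 - \alpha$ is:

$$\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$

where s is the sample standard deviation and $t_{\alpha/2,\,n-1}$ is the value of a T random
variable with n - 1 degrees of freedom that has upper-tail probability $\alpha/2$.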
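The small-sample interval, sample mean plus or minus the t value times s divided by the square root of n, can be sketched as a short Python function. This is an illustration under stated assumptions. The function name is invented. The three-point sample is hypothetical and chosen only to keep the arithmetic visible. The t value 4.303 is the usual table entry for 2 degrees of freedom and upper-tail probability 0.025.

```python
import math
import statistics


def t_confidence_interval(sample, t_value):
    """Return (lower, upper) = xbar -/+ t_value * s / sqrt(n).

    t_value is supplied by the caller (from a table or software) and must
    correspond to n - 1 degrees of freedom and the desired confidence level.
    """
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    half_width = t_value * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical illustration: three made-up observations, 95% two-sided interval.
illustrative_sample = [4.0, 5.0, 6.0]
illustrative_t = 4.303  # assumed table value: 2 degrees of freedom, upper tail 0.025
lower, upper = t_confidence_interval(illustrative_sample, illustrative_t)

if __name__ == "__main__":
    print(lower, upper)
```

Here the sample mean is 5 and s is 1, so the half-width is simply 4.303 divided by the square root of 3.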

Required calculation

The height of male US undergraduates is normally distributed. A random
sample of 10 male undergraduates produced the following measurements (in inches):

73.25, 69.5, 69.5, 68, 68, 70.5, 68, 69, 71, 68.

Calculate a 96% two-sided confidence interval for the mean height (in inches) of
US male undergraduates.

Remarks: I have a neighbor who gets into her car every morning and backs
her automobile out to her mailbox to pick up the newspaper from the paper box, and
then drives back into the garage with the paper ..... Puzzling, since she appears in
quite adequate physical condition to make the arduous round-trip odyssey of
about, oh, maybe, .... 60 feet!!! Please don’t approach this problem with
her mentality: don’t try to get MINITAB to do the entire problem for you!!!
Rather, use MINITAB to do the nasty bits (computing the sample mean, the
sample standard deviation, and calculating the necessary t-value) and then do the
remaining 10 cents worth of calculation yourself.

III. Understanding what a Confidence Interval Means

Whether $\sigma$ is known or unknown, the meaning of a confidence interval is still the
same. To say one has, for instance, a 96% confidence interval for $\mu$ means that one
has computed one particular member of a family of random intervals that cover
$\mu$ with probability 0.96. The following computer exercise illustrates this idea.

Preparation.

First generate 300 simple random samples of size ten corresponding to observations
of a random variable with mean 70 and standard deviation of 2.4 (i.e. these are simulated
observations of male undergraduate height in inches). Use the tabs CALC, RANDOM
DATA, NORMAL. Ask for 300 rows of data, so that each row of the worksheet is one
sample of size 10. Enter c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 in the ‘store columns’
box, and remember to specify the mean to be 70 and the standard deviation to be 2.4.

Head column 11 xbar and column 12 s. In column 11 store the
sample mean of each sample of size 10 by using CALC, ROW STATISTICS, etc.
Similarly, in column 12 store the standard deviation of each sample of size 10.

In column 13 store the lower endpoint of a 96% two-sided confidence interval for
$\mu$ based on each sample of size 10, which is

$$\bar{x} - t_{0.02,\,9}\,\frac{s}{\sqrt{10}}$$

using the CALCULATOR tab .... (note you previously calculated the necessary t value,
the one with 9 degrees of freedom and upper-tail probability 0.02).
In column 14 store the upper endpoint of a 96% two-sided confidence interval for
$\mu$ based on each sample of size 10, which is

$$\bar{x} + t_{0.02,\,9}\,\frac{s}{\sqrt{10}}$$

(You may want to head columns 13 and 14 ‘lower’ and ‘upper’, or some such titles...)

Analysis

In this artificial, but hopefully instructive exercise, we know the value of the
population mean: it’s 70, because we chose it that way! So, in the long run,
96% of all confidence intervals for $\mu$ will cover the value 70. A set of
300 random samples isn’t as good an approximation to the ‘long run’ as
3,000,000 would be, but, nevertheless, the proportion of these 300 intervals
that contain 70 shouldn’t be too far from 0.96. Compute it ....!!!!

A technical note: checking whether a number x is in an interval [a, b] is
the same as checking whether $(x - a)(x - b)$ is negative.... (To see this
sketch a graph of $y = (x - a)(x - b)$ ...; the product is zero only when x sits exactly
on an endpoint.) There is a function in the MINITAB
calculator function menu called ‘Signs’ that returns -1 if the argument of the function
is negative, and 1 if the argument of the function is positive. Using these ideas, one
can build an indicator variable, maybe headed ‘sign’?, in column 15 that is -1 if the
interval for the corresponding simple random sample contains 70, and 1 if the interval
doesn’t contain 70.
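The sign trick in the technical note can be sketched in a few lines of Python. The helper names are invented for this illustration, and the `sign` helper is a stand-in written here rather than MINITAB's own function. It returns 0 when the argument is exactly zero, which happens only when x equals an endpoint.

```python
def sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def interval_indicator(x, a, b):
    """-1 if x lies strictly between a and b, 1 if outside, 0 at an endpoint."""
    return sign((x - a) * (x - b))


if __name__ == "__main__":
    # Hypothetical endpoints, chosen only to show both outcomes.
    print(interval_indicator(70, 68.2, 70.8))  # 70 is inside, so the product is negative
    print(interval_indicator(70, 70.5, 72.0))  # 70 is outside, so the product is positive
```

The product is negative exactly when one factor is positive and the other negative, that is, when x is above one endpoint and below the other.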

Then one can count the number of intervals which contain 70 by either doing a
sort, or else by forming a subset, if one prefers that approach.

(By way of example, for my data, and, of course, your data will probably be different,
287 of the 300 intervals contained the population mean, 70, for a sample proportion of
287/300 or about 95.7% ....)
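The whole experiment can also be mimicked outside MINITAB. The Python sketch below draws 300 samples of size 10 from a normal distribution with mean 70 and standard deviation 2.4, builds the 96% t interval for each, and reports the proportion that cover 70. It rests on several assumptions: the function name and its parameters are invented here, the t value 2.398 is the usual table entry for 9 degrees of freedom and upper-tail probability 0.02, and Python's random number generator is not MINITAB's. The counts will therefore not match any particular MINITAB run.

```python
import math
import random
import statistics


def coverage_proportion(num_samples=300, n=10, mu=70.0, sigma=2.4,
                        t_value=2.398, seed=None):
    """Simulate num_samples samples of size n; return the share of t intervals covering mu."""
    rng = random.Random(seed)  # seed=None gives a different run each time
    covered = 0
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.mean(sample)
        s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
        half_width = t_value * s / math.sqrt(n)
        lower, upper = xbar - half_width, xbar + half_width
        # mu is inside the interval exactly when this product is negative.
        if (mu - lower) * (mu - upper) < 0:
            covered += 1
    return covered / num_samples


if __name__ == "__main__":
    print(coverage_proportion())                   # 300 intervals, as in the exercise
    print(coverage_proportion(num_samples=30000))  # closer to the long run
```

Raising `num_samples` shows the proportion settling near 0.96, which is the long-run statement the confidence level makes.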