The Decision Sciences Journal of Innovative Education

 

CORRELATION COEFFICIENT RULE OF THUMB

Timothy C. Krehbiel

 Professor of Decision Sciences and MIS

Richard T. Farmer School of Business

Miami University

Oxford, OH  45056  USA

krehbitc@muohio.edu

513-529-4837

 

Originally submitted to The Decision Sciences Journal of Innovative Education, June 3, 2003.  Revision #1 submitted September 12, 2003.

Timothy C. Krehbiel is Professor of Decision Sciences and MIS in the Richard T. Farmer School of Business.  His research interests include quality improvement, total quality environmental management, and statistics education.  His research has appeared in numerous journals, including Teaching Statistics, Journal of Education for Business, Communications in Statistics, Quality Management Journal, and International Journal of Production Research.  He is co-author of four books:  Sustainability Perspectives for Resources and Business, Basic Business Statistics, Business Statistics:  A First Course, and Statistics for Managers Using Microsoft Excel.  In 1996, he was one of five Miami University professors to win the Instructional Innovation Award from the Decision Sciences Institute.  In 2000, he received the Richard T. Farmer School of Business Teaching Effectiveness Award.

Introduction

Many introductory business statistics textbooks introduce the sample correlation coefficient in the descriptive statistics chapter.  Because this material precedes the introduction of statistical inference and significance, little can be said about interpreting the coefficient.  These texts point out that a value near zero implies little or no linear relationship between the two variables, and a value near positive (negative) one implies a strong positive (negative) linear relationship between the two variables.  Although uncommon, at least one text (Newbold, Carlson, and Thorne, 2003, page 63) introduces the following rule of thumb to help students decide whether the observed value of the correlation coefficient is significant:

Rule of Thumb #1:        If $|r| \ge 2/\sqrt{n}$, then a linear relationship exists.

This paper provides statistical justification for the rule’s use.  Moreover, a slight change to the rule of thumb provides an interesting insight concerning a commonly applied approximation:  When performing t-tests or constructing t-intervals, use t=2 instead of an exact value from the t-distribution.  The paper concludes that Rule of Thumb #1 allows educators to introduce correlation analysis prior to a formal introduction to statistical inference and thus provides an effective bridge between descriptive and inferential statistics.  A practical business example is included. 

True Alpha Level for the Rule of Thumb

How close a correlation value needs to be to ±1 to establish statistical significance depends on the sample size.  To test whether a linear relationship exists, the following hypotheses are used:

$H_0: \rho = 0$    versus    $H_1: \rho \neq 0$

with the rejection rule:

Reject $H_0$ if $|t| > t_{\alpha/2,\,n-2}$; otherwise, do not reject $H_0$,

where $t = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}$ and df = n - 2.

To investigate the appropriateness of Rule of Thumb #1, the first three columns of Table 1 list the r- and t-values needed to be deemed significant, along with the true alpha level, for various sample sizes.  For example, when n = 5, the rule requires $|r| \ge 0.894$.  This correlation coefficient would produce a significant result when $|t| \ge 3.464$.  Using critical t-values of ±3.464 is equivalent to using an alpha of 0.041, i.e., the area in the tails beyond ±3.464 of the t-distribution with 3 degrees of freedom is 0.041.  Thus for n = 5, the rule is equivalent to a formal t-test with df = 3 and $\alpha = 0.041$.  Using $\alpha = 0.05$ as a general criterion for significance, Rule of Thumb #1 is slightly conservative, i.e., it provides added protection against making a Type I error.  It is important to note here that values other than $\alpha = 0.05$ are sometimes more appropriate.  However, as Table 1 illustrates, the rule is constructed to approximate the usage of $\alpha = 0.05$.  Therefore, it is recommended that formal hypothesis testing methodology be applied for cases where different alpha values are warranted.
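The n = 5 figures can be checked by hand.  As a sketch, assuming Rule of Thumb #1 takes the form $|r| \ge 2/\sqrt{n}$ (the form consistent with the r-values in Table 1), substituting the cutoff $r = 2/\sqrt{n}$ into the test statistic gives a closed form for the implied critical t-value:

```latex
\[
t \;=\; \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
  \;=\; \frac{(2/\sqrt{n})\,\sqrt{n-2}}{\sqrt{(n-4)/n}}
  \;=\; 2\sqrt{\frac{n-2}{n-4}}, \qquad n \ge 5,
\]
\[
n = 5:\quad t = 2\sqrt{3} \approx 3.464 .
\]
```

As n grows, the ratio (n - 2)/(n - 4) approaches 1, so the implied critical value falls toward 2 from above, which agrees with the t column for this rule in Table 1.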

Moving down through Table 1, Rule of Thumb #1 when $n \ge 6$ is equivalent to a t-test with $0.046 < \alpha < 0.050$.  Thus, the rule is near the generally accepted value of statistical significance, $\alpha = 0.05$, and is never too liberal (i.e., $\alpha > 0.05$ and an inflated Type I error rate).  Therefore, Rule of Thumb #1 is appropriate for students to use when discussing the correlation coefficient prior to the formal discussion of statistical significance.
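The true alpha levels discussed above can be recomputed with a few lines of code.  The following is a minimal sketch, not the method used to build Table 1: it assumes the rule has the form $|r| \ge 2/\sqrt{n}$, uses only the Python standard library, and approximates the tail area of the t-distribution with Simpson's rule.  The function names (`t_pdf`, `two_sided_tail`, `rule1_alpha`) are hypothetical and chosen for illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_tail(t_crit, df, steps=1000):
    """Area beyond +/- t_crit, using Simpson's rule on [0, t_crit].

    steps must be even for Simpson's rule.
    """
    h = t_crit / steps
    total = t_pdf(0.0, df) + t_pdf(t_crit, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # The density is symmetric, so the central area is twice the integral.
    return 1.0 - 2.0 * (h / 3.0) * total


def rule1_alpha(n):
    """True alpha when 'significant' means |r| >= 2/sqrt(n) (assumed form).

    Requires n >= 5 so that the cutoff r is strictly below 1.
    """
    r = 2.0 / math.sqrt(n)
    t = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
    return two_sided_tail(t, n - 2)


for n in (5, 6, 10, 30, 100, 300):
    print(n, round(rule1_alpha(n), 3))
```

For n = 5 this sketch gives an alpha of about 0.041, matching the worked example in the text.  A statistics library with a t-distribution survival function would be the more usual choice in practice; the numerical integration here only keeps the example dependency-free.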

 

Two Slight Modifications to the Rule of Thumb

Since Rule of Thumb #1 can be viewed as slightly conservative, the following modification is investigated:

Rule of Thumb #2:  If $|r| \ge 2/\sqrt{n+1}$, then a linear relationship exists.

Columns 4 through 6 of Table 1 give the resulting r- and t-values along with the true alphas for Rule of Thumb #2.  For n < 35 the rule is too liberal, but for n > 35 it is slightly better (closer to $\alpha = 0.05$) than the original rule of thumb.  For 25 < n < 80 this revised rule performs quite well.  However, Rule of Thumb #1 is preferable since it is never too liberal and is slightly easier to use.

The last three columns of Table 1 investigate the following:

Rule of Thumb #3:  If $|r| \ge 2/\sqrt{n+2}$, then a linear relationship exists.

An interesting characteristic of Rule of Thumb #3 is that the critical t-value for the corresponding t-test is always ±2.  In other words, Rule of Thumb #3 is equivalent to performing the formal t-test and always using a critical t-value of 2 regardless of the sample size.  For n < 60 the rule is too liberal (especially for n < 20), and for n > 60 the rule is too conservative.
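The constant critical value can be verified algebraically.  As a sketch, assuming Rule of Thumb #3 has the cutoff $r = 2/\sqrt{n+2}$ (the form consistent with the last three columns of Table 1), substituting into the test statistic $t = r\sqrt{n-2}/\sqrt{1-r^{2}}$ gives:

```latex
\[
r^{2} = \frac{4}{n+2}, \qquad 1 - r^{2} = \frac{n-2}{n+2},
\]
\[
t \;=\; \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}}
  \;=\; \frac{2}{\sqrt{n+2}}\cdot\sqrt{n-2}\cdot\sqrt{\frac{n+2}{n-2}}
  \;=\; 2 .
\]
```

The sample size cancels completely, so only the degrees of freedom of the reference t-distribution change with n, and with them the true alpha.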

When performing t-tests on correlations, means, regression parameters, etc., or constructing a wide range of confidence intervals using a t-statistic, it is often convenient to use t = 2 as a good approximation for significance.  For example, Scheaffer, Mendenhall, and Ott (1996) use this convention throughout their survey sampling text (see page 31 for justification).  Table 1 shows that this general rule of thumb is fairly sensitive to sample size (or, more generally, the degrees of freedom), but note that the true alpha is within ±0.001 of $\alpha = 0.05$ for 48 < df < 88.

            In summary, Rule of Thumb #3 is inferior to Rule of Thumb #1 or #2 when evaluating a correlation coefficient, but is equivalent to the widely recognized rule of thumb that uses t = 2 for t-tests and t-intervals. 

 

Example

            Berenson, Levine, and Krehbiel (2004, page 123) present a sample of 24 automobile batteries along with their respective price and number of cold-cranking amps (the higher the amps, the more powerful the battery).  Is there a significant correlation between the price of a battery and its power?

Solution  The correlation coefficient is r = 0.484.  Rule of Thumb #1 states that there is a significant linear relationship if $|r| \ge 2/\sqrt{24} = 0.408$.  Since 0.484 > 0.408, there is a significant linear relationship between the price and power of automobile batteries.  There is a tendency for the higher priced batteries to provide a larger number of cold-cranking amps and vice versa.
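The same check is easy to script.  The sketch below assumes Rule of Thumb #1 has the form $|r| \ge 2/\sqrt{n}$ and uses the values reported in the example (r = 0.484, n = 24); the function names are hypothetical and chosen for illustration.

```python
import math


def rule_of_thumb_1(r, n):
    """Return (is_significant, cutoff) for the assumed rule |r| >= 2/sqrt(n)."""
    cutoff = 2.0 / math.sqrt(n)
    return abs(r) >= cutoff, cutoff


def t_statistic(r, n):
    """t-statistic for testing a zero population correlation, df = n - 2."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Values from the battery example: r = 0.484 with n = 24 batteries.
significant, cutoff = rule_of_thumb_1(0.484, 24)
print(significant, round(cutoff, 3), round(t_statistic(0.484, 24), 3))
```

The computed t-statistic (about 2.59) exceeds the value of 2.098 that Table 1 lists for this rule at n = 24, so the rule of thumb and the implied t-test lead to the same conclusion.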

 

Conclusion

When the sample correlation coefficient is introduced before statistical inference in a statistics course, it is a good idea to discuss the relationship between the observed value of the coefficient, sample size, and statistical significance.  The rule of thumb, $|r| \ge 2/\sqrt{n}$, is an appropriate mechanism for doing so.  This paper provides justification for its statistical merit.  Although slightly conservative, the rule of thumb produces a true alpha that never ventures too far from the generally accepted significance level of 0.05.  Moreover, this paper illustrates the effect of sample size on using t = 2 as a general rule for significance when performing t-tests or constructing t-intervals.

 

REFERENCES

Berenson, M.L., Levine, D.M., & Krehbiel, T.C. (2004).  Basic Business Statistics, 9e, Upper Saddle River, NJ:  Prentice Hall.

Newbold, P., Carlson, W. L., & Thorne, B. M. (2003).  Statistics for Business and Economics, 5e, Upper Saddle River, NJ:  Prentice Hall.

Scheaffer, R. L., Mendenhall, W. III, & Ott, L. (1996).  Elementary Survey Sampling, 5e, Boston:  Duxbury Press.


Table 1.  True Alpha Levels for 3 Rules of Thumb

Rule 1: $|r| \ge 2/\sqrt{n}$     Rule 2: $|r| \ge 2/\sqrt{n+1}$     Rule 3: $|r| \ge 2/\sqrt{n+2}$

              Rule of Thumb #1        Rule of Thumb #2        Rule of Thumb #3
   n    df       r       t   alpha       r       t   alpha       r       t   alpha
   5     3    .894   3.464    .041    .816   2.449    .092    .756       2    .139
   6     4    .816   2.828    .047    .756   2.309    .082    .707       2    .116
   7     5    .756   2.582    .049    .707   2.236    .076    .667       2    .102
   8     6    .707   2.449    .050    .667   2.191    .071    .632       2    .092
   9     7    .667   2.366    .050    .632   2.160    .068    .603       2    .086
  10     8    .632   2.309    .050    .603   2.138    .065    .577       2    .081
  11     9    .603   2.268    .050    .577   2.121    .063    .555       2    .077
  12    10    .577   2.236    .049    .555   2.108    .061    .535       2    .073
  13    11    .555   2.211    .049    .535   2.098    .060    .516       2    .071
  14    12    .535   2.191    .049    .516   2.089    .059    .500       2    .069
  15    13    .516   2.174    .049    .500   2.082    .058    .485       2    .067
  16    14    .500   2.160    .049    .485   2.075    .057    .471       2    .065
  17    15    .485   2.148    .048    .471   2.070    .056    .459       2    .064
  18    16    .471   2.138    .048    .459   2.066    .055    .447       2    .063
  19    17    .459   2.129    .048    .447   2.062    .055    .436       2    .062
  20    18    .447   2.121    .048    .436   2.058    .054    .426       2    .061
  21    19    .436   2.114    .048    .426   2.055    .054    .417       2    .060
  22    20    .426   2.108    .048    .417   2.052    .053    .408       2    .059
  23    21    .417   2.103    .048    .408   2.049    .053    .400       2    .059
  24    22    .408   2.098    .048    .400   2.047    .053    .392       2    .058
  25    23    .400   2.093    .048    .392   2.045    .052    .385       2    .057
  26    24    .392   2.089    .047    .385   2.043    .052    .378       2    .057
  27    25    .385   2.085    .047    .378   2.041    .052    .371       2    .056
  28    26    .378   2.082    .047    .371   2.040    .052    .365       2    .056
  29    27    .371   2.078    .047    .365   2.038    .051    .359       2    .056
  30    28    .365   2.075    .047    .359   2.037    .051    .354       2    .055
  31    29    .359   2.073    .047    .354   2.035    .051    .348       2    .055
  32    30    .354   2.070    .047    .348   2.034    .051    .343       2    .055
  33    31    .348   2.068    .047    .343   2.033    .051    .338       2    .054
  34    32    .343   2.066    .047    .338   2.032    .051    .333       2    .054
  35    33    .338   2.064    .047    .333   2.031    .050    .329       2    .054
  36    34    .333   2.062    .047    .329   2.030    .050    .324       2    .054
  37    35    .329   2.060    .047    .324   2.029    .050    .320       2    .053
  38    36    .324   2.058    .047    .320   2.028    .050    .316       2    .053
  39    37    .320   2.056    .047    .316   2.028    .050    .312       2    .053
  40    38    .316   2.055    .047    .312   2.027    .050    .309       2    .053
  50    48    .283   2.043    .047    .280   2.021    .049    .277       2    .051
  60    58    .258   2.035    .046    .256   2.017    .048    .254       2    .050
  70    68    .239   2.030    .046    .237   2.015    .048    .236       2    .049
  80    78    .224   2.026    .046    .222   2.013    .048    .221       2    .049
  90    88    .211   2.023    .046    .210   2.011    .047    .209       2    .049
 100    98    .200   2.021    .046    .199   2.010    .047    .198       2    .048
 200   198    .141   2.010    .046    .141   2.005    .046    .141       2    .047
 300   298    .115   2.007    .046    .115   2.003    .046    .115       2    .046