CORRELATION COEFFICIENT RULE OF THUMB
Timothy C. Krehbiel
Professor of Decision Sciences and MIS
Richard T. Farmer School of Business
Miami University
Oxford, OH 45056 USA
krehbitc@muohio.edu
513-529-4837
Originally submitted to
The Decision Science Journal of Innovative Education, June 3,
2003. Revision #1 submitted September 12, 2003.
Timothy C. Krehbiel is
Professor of Decision Sciences and MIS in the Richard T. Farmer School
of Business. His research interests include quality improvement, total
quality environmental management, and statistics education. His
research has appeared in numerous journals including Teaching
Statistics, Journal of Education for Business, Communications in
Statistics, Quality Management Journal, and International
Journal of Production Research. He is co-author of four books:
Sustainability Perspectives for Resources and Business, Basic
Business Statistics, Business Statistics: A First Course,
and Statistics for Managers Using Microsoft Excel. In 1996, he
was one of five Miami University professors to win the Instructional
Innovation Award from the Decision Sciences Institute. In 2000 he
received the Richard T. Farmer School of Business Teaching Effectiveness
Award.
Introduction
Many introductory business statistics textbooks introduce the sample
correlation coefficient in the descriptive statistics chapter. Because
this material precedes the introduction of statistical inference and
significance, little can be said about interpreting the coefficient.
These texts point out that a value around zero implies little or no
linear relationship between the two variables, and a value near positive
(negative) one implies a strong positive (negative) linear relationship
between the two variables. Although uncommon, at least one text (see
Newbold, Carlson and Thorne (2003), page 63) introduces the following
rule of thumb to help students decide if the observed value of the
correlation coefficient is significant:
Rule of Thumb #1: If $|r| \ge 2/\sqrt{n}$, then a linear relationship exists.
This paper provides
statistical justification for the rule’s use. Moreover, a slight change
to the rule of thumb provides an interesting insight concerning a
commonly applied approximation: When performing t-tests or
constructing t-intervals, use t=2 instead of an exact
value from the t-distribution. The paper concludes that Rule of
Thumb #1 allows educators to introduce correlation analysis prior to a
formal introduction to statistical inference and thus provides an
effective bridge between descriptive and inferential statistics. A
practical business example is included.
True Alpha Level for the Rule of Thumb
How close a correlation value needs to be to ±1 to prove statistical
significance depends on the sample size. To test whether a linear
relationship exists, the following hypotheses are used:
$H_0: \rho = 0$ versus $H_1: \rho \ne 0$
And the rejection rule:
Reject $H_0$ if $|t| > t_{\alpha/2,\,n-2}$; otherwise, do not reject $H_0$.
Where $t = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}$ and df = n - 2.
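The test statistic can be computed directly from r and n. The following is a minimal sketch, assuming the usual form t = r·sqrt(n − 2)/sqrt(1 − r²) with n − 2 degrees of freedom; the function name `t_statistic` and the input values are illustrative and do not come from the paper.

```python
import math


def t_statistic(r, n):
    """t-statistic for testing zero correlation, with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need n > 2 so that df = n - 2 is positive")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Illustrative values only (not taken from the paper's data).
print(round(t_statistic(0.5, 27), 3))  # prints 2.887
```

The statistic grows with both |r| and n, which is why a smaller correlation can still be significant in a larger sample.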
To investigate the appropriateness of Rule of Thumb #1, the first three
columns of Table 1 list the $r$ and t-values needed to be deemed
significant, along with the true alpha level for various sample sizes.
For example, when n = 5, the rule requires $|r| \ge 0.894$. This
correlation coefficient would produce a significant result when
$|t| \ge 3.464$. Using the critical t-values of ±3.464 is equivalent to
using an alpha of 0.041, i.e., the area in the tails beyond ±3.464 of
the t-distribution with 3 degrees of freedom is 0.041. Thus for n = 5,
the rule is equivalent to a formal t-test with df = 3 and
$\alpha = 0.041$. Using $\alpha = 0.05$ as a general criterion for
significance, Rule of Thumb #1 is slightly conservative, i.e., it
provides added protection from making a Type I error. It is important to
note here that values other than $\alpha = 0.05$ are sometimes more
appropriate. However, as Table 1 illustrates, the rule is constructed to
approximate the usage of $\alpha = 0.05$. Therefore, it is recommended
that formal hypothesis testing methodology be applied for cases where
different alpha values are warranted.
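As a worked check on these figures, assume Rule of Thumb #1 takes the form |r| ≥ 2/√n, which is the form consistent with the first column of Table 1. Substituting the boundary value r = 2/√n into the test statistic t = r√(n − 2)/√(1 − r²) gives the critical t-value implied by the rule:

```latex
\[
t \;=\; \frac{\frac{2}{\sqrt{n}}\,\sqrt{n-2}}{\sqrt{1-\frac{4}{n}}}
  \;=\; 2\sqrt{\frac{n-2}{n-4}}, \qquad n \ge 5 .
\]
\[
n = 5:\quad r = \frac{2}{\sqrt{5}} \approx 0.894, \qquad
t = 2\sqrt{3} \approx 3.464 .
\]
```

The factor √((n − 2)/(n − 4)) exceeds 1 and shrinks toward 1 as n grows, so the implied critical value approaches 2 from above, as the t column for this rule in Table 1 shows.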
Moving down through Table 1, Rule of Thumb #1 when $n \ge 6$ is
equivalent to a t-test with 0.046 < $\alpha$ < 0.050. Thus, the rule is
near the generally accepted value of statistical significance,
$\alpha = 0.05$, and is never too liberal (i.e., $\alpha$ > 0.05 and an
inflated Type I error rate). Therefore, Rule of Thumb #1 is appropriate
for students to use when discussing the correlation coefficient prior to
the formal discussion of statistical significance.
Two Slight Modifications to the Rule of Thumb
Since Rule of Thumb #1 can be viewed as slightly conservative, the
following is investigated:
Rule of Thumb #2: If $|r| \ge 2/\sqrt{n+1}$, then a linear relationship exists.
Columns 4-6 of Table 1 give the resulting $r$ and t-values along with
the true alphas for Rule of Thumb #2. For n < 35 the rule is too
liberal, but for n > 35 it is slightly better (closer to
$\alpha = 0.05$) than the original rule of thumb. For 25 < n < 80 this
revised rule performs quite well. However, Rule of Thumb #1 is
preferable since it is never too liberal and slightly easier to use.
The last three columns of Table 1 investigate the following:
Rule of Thumb #3: If $|r| \ge 2/\sqrt{n+2}$, then a linear relationship exists.
An interesting
characteristic of Rule of Thumb #3 is that the critical t-value for the
corresponding t-test is always ±2. In other words, Rule of Thumb
#3 is equivalent to performing the formal t-test and always using a
critical t-value of 2 regardless of the sample size. For n < 60 the
rule is too liberal (especially for n < 20), and for n > 60 the rule is
too conservative.
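The constant critical value can be verified algebraically. The derivation below is a short sketch, assuming Rule of Thumb #3 has the boundary r = 2/√(n + 2) (the form consistent with the last r column of Table 1) and the usual test statistic t = r√(n − 2)/√(1 − r²):

```latex
\[
r^2 = \frac{4}{n+2}
\quad\Longrightarrow\quad
1 - r^2 = \frac{n-2}{n+2},
\]
\[
t \;=\; \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}
  \;=\; \frac{2}{\sqrt{n+2}}\cdot\sqrt{n-2}\cdot\frac{\sqrt{n+2}}{\sqrt{n-2}}
  \;=\; 2 .
\]
```

The sample size cancels completely, so only the degrees of freedom of the reference t-distribution change from row to row.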
When performing t-tests on correlations, means, regression parameters,
etc., or constructing a wide range of confidence intervals using a
t-statistic, it is often convenient to use t = 2 as a good approximation
for significance. For example, Scheaffer, Mendenhall, and Ott (1996) use
this convention throughout their survey sampling text (see page 31 for
justification). Table 1 provides insight into how this general rule of
thumb is fairly sensitive to sample size (or more generally, the degrees
of freedom), but note that the true alpha is within ±0.001 of
$\alpha = 0.05$ for 48 < df < 88.
In summary,
Rule of Thumb #3 is inferior to Rule of Thumb #1 or #2 when evaluating a
correlation coefficient, but is equivalent to the widely recognized rule
of thumb that uses t = 2 for t-tests and t-intervals.
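The true alpha levels discussed in this section can be approximated numerically. The sketch below is one way to do so, not necessarily how Table 1 was produced. It assumes the three rules have the form |r| ≥ 2/√(n + k) with k = 0, 1, 2, which is consistent with the r columns of Table 1. It finds the tail area of the t-distribution by Simpson integration of the density, using only the Python standard library. The names `t_pdf`, `two_tailed_area`, `true_alpha`, and the `offset` parameter are illustrative.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_area(t, df, steps=20000):
    """Area beyond +/- t, using composite Simpson integration on [0, t].

    steps must be even for Simpson's rule.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3  # area between -t and +t, by symmetry
    return 1 - central


def true_alpha(n, offset=0):
    """True alpha of the rule |r| >= 2/sqrt(n + offset) for sample size n.

    offset = 0, 1, 2 corresponds to Rules of Thumb #1, #2, #3 (assumed forms).
    """
    r = 2 / math.sqrt(n + offset)
    t = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)
    return two_tailed_area(t, n - 2)


# Print the three true alphas for a few sample sizes.
for n in (5, 10, 24, 60, 100):
    print(n, [round(true_alpha(n, k), 3) for k in (0, 1, 2)])
```

The printed values should agree with the alpha columns of Table 1 up to rounding in the third decimal place. A statistics library with a t-distribution survival function would replace `two_tailed_area` in practice.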
Example
Berenson,
Levine, and Krehbiel (2004, page 123) present a sample of 24 automobile
batteries along with their respective price and number of cold-cranking
amps (the higher the amps, the more powerful the battery). Is there a
significant correlation between the price of a battery and its power?
Solution: The correlation coefficient is $r = 0.484$. Rule of Thumb #1
states that there is a significant linear relationship if
$|r| \ge 2/\sqrt{24} = 0.408$. Since 0.484 > 0.408, there is a
significant linear relationship between
the price and power of automobile batteries. There is a tendency for
the higher priced batteries to provide a larger number of cold-cranking
amps and vice-versa.
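The check in this example can be written in a few lines. This is a minimal sketch, taking Rule of Thumb #1 to be |r| ≥ 2/√n (the form consistent with Table 1, where n = 24 gives a cutoff of .408). The function name `rule_of_thumb_1` is illustrative.

```python
import math


def rule_of_thumb_1(r, n):
    """Return True when |r| meets the assumed cutoff 2/sqrt(n)."""
    return abs(r) >= 2 / math.sqrt(n)


n = 24      # number of batteries in the sample (from the example)
r = 0.484   # sample correlation reported in the example

cutoff = 2 / math.sqrt(n)
print(f"cutoff = {cutoff:.3f}, |r| = {abs(r):.3f}, "
      f"significant: {rule_of_thumb_1(r, n)}")
# prints: cutoff = 0.408, |r| = 0.484, significant: True
```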
Conclusion
If introducing the sample correlation coefficient before statistical
inference in a statistics course, it is a good idea to discuss the
relationship between the observed value of the coefficient, sample size,
and statistical significance. The rule of thumb,
$|r| \ge 2/\sqrt{n}$, is an appropriate mechanism for doing so. This
paper provides justification for its statistical merit. Although
slightly conservative, the rule of thumb produces a true alpha that
never ventures too far from the generally acceptable significance value
of 0.05. Moreover, this paper illustrated the effect of sample size on
using t = 2 as a general rule for significance when performing t-tests
or constructing t-intervals.
REFERENCES
Berenson, M.L., Levine, D.M., & Krehbiel, T.C. (2004). Basic
Business Statistics, 9e, Upper Saddle River, NJ: Prentice Hall.
Newbold, P., Carlson, W. L., & Thorne, B. M. (2003). Statistics for
Business and Economics, 5e, Upper Saddle River, NJ: Prentice Hall.
Scheaffer, R. L., Mendenhall, W. III, & Ott, L. (1996). Elementary
Survey Sampling, 5e, Boston: Duxbury Press.
Table 1. True Alpha Levels for 3 Rules of Thumb
| | | 1. $\lvert r \rvert \ge 2/\sqrt{n}$ | | | 2. $\lvert r \rvert \ge 2/\sqrt{n+1}$ | | | 3. $\lvert r \rvert \ge 2/\sqrt{n+2}$ | | |
| n | df | r | t | $\alpha$ | r | t | $\alpha$ | r | t | $\alpha$ |
| 5 | 3 | .894 | 3.464 | .041 | .816 | 2.449 | .092 | .756 | 2 | .139 |
| 6 | 4 | .816 | 2.828 | .047 | .756 | 2.309 | .082 | .707 | 2 | .116 |
| 7 | 5 | .756 | 2.582 | .049 | .707 | 2.236 | .076 | .667 | 2 | .102 |
| 8 | 6 | .707 | 2.449 | .050 | .667 | 2.191 | .071 | .632 | 2 | .092 |
| 9 | 7 | .667 | 2.366 | .050 | .632 | 2.160 | .068 | .603 | 2 | .086 |
| 10 | 8 | .632 | 2.309 | .050 | .603 | 2.138 | .065 | .577 | 2 | .081 |
| 11 | 9 | .603 | 2.268 | .050 | .577 | 2.121 | .063 | .555 | 2 | .077 |
| 12 | 10 | .577 | 2.236 | .049 | .555 | 2.108 | .061 | .535 | 2 | .073 |
| 13 | 11 | .555 | 2.211 | .049 | .535 | 2.098 | .060 | .516 | 2 | .071 |
| 14 | 12 | .535 | 2.191 | .049 | .516 | 2.089 | .059 | .500 | 2 | .069 |
| 15 | 13 | .516 | 2.174 | .049 | .500 | 2.082 | .058 | .485 | 2 | .067 |
| 16 | 14 | .500 | 2.160 | .049 | .485 | 2.075 | .057 | .471 | 2 | .065 |
| 17 | 15 | .485 | 2.148 | .048 | .471 | 2.070 | .056 | .459 | 2 | .064 |
| 18 | 16 | .471 | 2.138 | .048 | .459 | 2.066 | .055 | .447 | 2 | .063 |
| 19 | 17 | .459 | 2.129 | .048 | .447 | 2.062 | .055 | .436 | 2 | .062 |
| 20 | 18 | .447 | 2.121 | .048 | .436 | 2.058 | .054 | .426 | 2 | .061 |
| 21 | 19 | .436 | 2.114 | .048 | .426 | 2.055 | .054 | .417 | 2 | .060 |
| 22 | 20 | .426 | 2.108 | .048 | .417 | 2.052 | .053 | .408 | 2 | .059 |
| 23 | 21 | .417 | 2.103 | .048 | .408 | 2.049 | .053 | .400 | 2 | .059 |
| 24 | 22 | .408 | 2.098 | .048 | .400 | 2.047 | .053 | .392 | 2 | .058 |
| 25 | 23 | .400 | 2.093 | .048 | .392 | 2.045 | .052 | .385 | 2 | .057 |
| 26 | 24 | .392 | 2.089 | .047 | .385 | 2.043 | .052 | .378 | 2 | .057 |
| 27 | 25 | .385 | 2.085 | .047 | .378 | 2.041 | .052 | .371 | 2 | .056 |
| 28 | 26 | .378 | 2.082 | .047 | .371 | 2.040 | .052 | .365 | 2 | .056 |
| 29 | 27 | .371 | 2.078 | .047 | .365 | 2.038 | .051 | .359 | 2 | .056 |
| 30 | 28 | .365 | 2.075 | .047 | .359 | 2.037 | .051 | .354 | 2 | .055 |
| 31 | 29 | .359 | 2.073 | .047 | .354 | 2.035 | .051 | .348 | 2 | .055 |
| 32 | 30 | .354 | 2.070 | .047 | .348 | 2.034 | .051 | .343 | 2 | .055 |
| 33 | 31 | .348 | 2.068 | .047 | .343 | 2.033 | .051 | .338 | 2 | .054 |
| 34 | 32 | .343 | 2.066 | .047 | .338 | 2.032 | .051 | .333 | 2 | .054 |
| 35 | 33 | .338 | 2.064 | .047 | .333 | 2.031 | .050 | .329 | 2 | .054 |
| 36 | 34 | .333 | 2.062 | .047 | .329 | 2.030 | .050 | .324 | 2 | .054 |
| 37 | 35 | .329 | 2.060 | .047 | .324 | 2.029 | .050 | .320 | 2 | .053 |
| 38 | 36 | .324 | 2.058 | .047 | .320 | 2.028 | .050 | .316 | 2 | .053 |
| 39 | 37 | .320 | 2.056 | .047 | .316 | 2.028 | .050 | .312 | 2 | .053 |
| 40 | 38 | .316 | 2.055 | .047 | .312 | 2.027 | .050 | .309 | 2 | .053 |
| 50 | 48 | .283 | 2.043 | .047 | .280 | 2.021 | .049 | .277 | 2 | .051 |
| 60 | 58 | .258 | 2.035 | .046 | .256 | 2.017 | .048 | .254 | 2 | .050 |
| 70 | 68 | .239 | 2.030 | .046 | .237 | 2.015 | .048 | .236 | 2 | .049 |
| 80 | 78 | .224 | 2.026 | .046 | .222 | 2.013 | .048 | .221 | 2 | .049 |
| 90 | 88 | .211 | 2.023 | .046 | .210 | 2.011 | .047 | .209 | 2 | .049 |
| 100 | 98 | .200 | 2.021 | .046 | .199 | 2.010 | .047 | .198 | 2 | .048 |
| 200 | 198 | .141 | 2.010 | .046 | .141 | 2.005 | .046 | .141 | 2 | .047 |
| 300 | 298 | .115 | 2.007 | .046 | .115 | 2.003 | .046 | .115 | 2 | .046 |