Pearson Correlation Coefficient

July 4, 2021
Posted by: admin
Categories: SPSS Analysis Help, Statistical Test

Why is Pearson’s correlation used?

The Pearson correlation coefficient (Pearson’s r) is typically used to describe the strength and direction of the linear relationship between two quantitative variables. Often, these two variables are designated X (predictor) and Y (outcome). Pearson’s r ranges from −1.00 to +1.00. The sign of r indicates the direction of the relationship between X and Y. A positive correlation indicates that as scores on X increase, scores on Y also tend to increase; a negative correlation indicates that as scores on X increase, scores on Y tend to decrease; and a correlation near 0 indicates that as scores on X increase, scores on Y neither increase nor decrease in a linear manner.

What does the Correlation coefficient tell you?

The absolute magnitude of Pearson’s r provides information about the strength of the linear association between scores on X and Y. For values of r close to 0, there is little or no linear association between X and Y, although a nonlinear relationship may still be present. When r = +1.00, there is a perfect positive linear association; when r = −1.00, there is a perfect negative linear association. Intermediate values of r correspond to intermediate strengths of the relationship. Figures 7.2 through 7.5 show examples of data for which the correlations are r = +.75, r = +.50, r = +.23, and r = .00.

Pearson’s r is a standardized or unit-free index of the strength of the linear relationship between two variables. No matter what units are used to express the scores on the X and Y variables, the possible values of Pearson’s r range from −1.00 (a perfect negative linear relationship) to +1.00 (a perfect positive linear relationship).

How do you explain correlation analysis?

Consider, for example, a correlation between height and weight. Height could be measured in inches, centimeters, or feet; weight could be measured in ounces, pounds, or kilograms. When we correlate scores on height and weight for a given sample of people, the correlation has the same value no matter which of these units is used to measure height and weight. This happens because the scores on X and Y are converted to z scores (i.e., they are converted to unit-free or standardized distances from their means) during the computation of Pearson’s r, and a change of units leaves the z scores unchanged.
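The z-score idea above can be illustrated with a short Python sketch. It assumes the common textbook form of the computation, in which r is the sum of the products of paired z scores divided by n − 1; the height and weight values are hypothetical and exist only for illustration.

```python
import math

def z_scores(values):
    """Convert raw scores to z scores using the sample standard deviation (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def pearson_r(x, y):
    """Pearson's r as the sum of paired z-score products divided by n - 1."""
    zx, zy = z_scores(x), z_scores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

# Hypothetical heights (inches) and weights (pounds) for five people.
height_in = [62, 65, 68, 70, 74]
weight_lb = [120, 140, 155, 165, 190]

# The same five people, re-expressed in centimeters and kilograms.
height_cm = [h * 2.54 for h in height_in]
weight_kg = [w * 0.45359237 for w in weight_lb]

r_imperial = pearson_r(height_in, weight_lb)
r_metric = pearson_r(height_cm, weight_kg)

# Both values should agree up to floating-point rounding.
print(round(r_imperial, 4), round(r_metric, 4))
```

Multiplying every score by a constant rescales the mean and the standard deviation by that same constant, so each z score, and therefore r, stays the same.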

What is the null hypothesis for Pearson correlation?

A sample correlation coefficient may be tested to determine whether it differs significantly from zero. The value r is computed from a sample and serves as an estimate of rho (ρ), the correlation coefficient in the population. The null hypothesis states that there is no linear relationship in the population, and the alternative hypothesis states that there is one. The null and alternative hypotheses are as follows:

H0: ρ = 0

Ha: ρ ≠ 0
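One common way to test these hypotheses, sketched below, converts r to a t statistic, t = r·√(n − 2) / √(1 − r²), with n − 2 degrees of freedom. The scores are hypothetical, and the critical value 2.447 is the two-tailed α = .05 entry for 6 degrees of freedom from a standard t table; statistical software such as SPSS reports an exact p value instead.

```python
import math

def pearson_r(x, y):
    """Pearson's r from sums of squares and cross products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), evaluated on n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Hypothetical paired scores for eight cases (illustration only).
x = [2, 4, 5, 7, 8, 10, 11, 13]
y = [1, 3, 6, 5, 9, 8, 12, 11]

n = len(x)
r = pearson_r(x, y)
t = t_statistic(r, n)

# Two-tailed critical t for alpha = .05 and df = n - 2 = 6, from a standard t table.
T_CRITICAL_DF6 = 2.447
reject_h0 = abs(t) > T_CRITICAL_DF6

print(f"r = {r:.3f}, t({n - 2}) = {t:.3f}, reject H0: {reject_h0}")
```

The critical value is hard-coded for this sample size only; a different n requires the table entry for the corresponding degrees of freedom.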

What are the assumptions for Pearson correlation?

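The assumptions for the Pearson correlation coefficient are as follows: level of measurement (both variables are measured on an interval or ratio scale), related pairs (each case contributes a score on both X and Y), absence of outliers, normality of variables, linearity, and homoscedasticity.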

Linear Relationship

When using the Pearson correlation coefficient, it is assumed that the cluster of points in a scatterplot of X and Y is best fit by a straight line rather than a curve.

Homoscedasticity

Another assumption of the correlation coefficient is that of homoscedasticity. This assumption is met if the points are scattered about equally far from the best-fit line along its entire length, that is, if the variability of Y is roughly the same at every value of X.
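As a rough illustration of this idea, the sketch below fits a least-squares line and compares the spread of the residuals in the lower and upper halves of the X values. This is an informal heuristic chosen for this example, not a formal test, and both data sets are artificial: one has constant scatter around the line, the other has scatter that grows with X.

```python
import math

def fit_line(x, y):
    """Least-squares slope and intercept for y predicted from x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def spread_ratio(x, y):
    """Ratio of residual spread (root mean square) in the upper vs. lower half of x."""
    slope, intercept = fit_line(x, y)
    pairs = sorted(zip(x, y))  # order the cases by x
    residuals = [b - (intercept + slope * a) for a, b in pairs]
    half = len(residuals) // 2

    def rms(vals):
        return math.sqrt(sum(v * v for v in vals) / len(vals))

    return rms(residuals[half:]) / rms(residuals[:half])

x = list(range(1, 11))

# Artificial data: the scatter around the line is the same size everywhere.
y_even = [2 * xi + (1 if xi % 2 == 0 else -1) * 1.0 for xi in x]

# Artificial data: the scatter grows with x (a "fan" shape in a scatterplot).
y_fan = [2 * xi + (1 if xi % 2 == 0 else -1) * 0.3 * xi for xi in x]

even_ratio = spread_ratio(x, y_even)
fan_ratio = spread_ratio(x, y_fan)

print(f"even scatter ratio: {even_ratio:.2f}, fanning scatter ratio: {fan_ratio:.2f}")
```

A ratio near 1 is consistent with homoscedasticity, while a ratio well above or below 1 suggests that the spread changes along the line. In practice, inspecting a scatterplot is the usual first check.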