What is Cohen's D in Python?

Effect size measures are among the most important tools in statistical analysis. They quantify the strength of a relationship between variables or the magnitude of the difference between groups. For example, if two groups in a class, A and B, have mean scores of 75 and 80, an effect size measure tells us how large that difference is relative to the spread of the scores.

Cohen’s d is an effect size measure in statistics that quantifies the difference between two group means in standard deviation units. Its values are not bounded: 0 indicates no effect, and larger absolute values indicate a bigger difference between the groups.

Cohen’s D is one such effect size measure. It compares the difference between two groups in a standardized manner, which helps put research findings in context, and it is applied in many fields such as clinical trials and educational evaluations. In this article, we will explore what Cohen’s D is and how we can implement it in Python using the NumPy (Numerical Python) library. Let’s get started!

What is Cohen’s d and How is it Interpreted

Cohen’s D is an effect size measure that is used to assess the degree of difference between two groups in terms of standard deviation units. It can be used to discuss the differences between two groups regardless of their scales. The formula for Cohen’s d is obtained by calculating the difference between the means of two groups and then dividing it by the pooled standard deviation.

d = (mean of group A (x̄₁) − mean of group B (x̄₂)) / pooled standard deviation (s_p)
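For reference, the pooled standard deviation s_p is commonly defined as shown below. This is the same expression that the implementation later in this article computes. The symbols n₁ and n₂ (group sizes) and s₁ and s₂ (sample standard deviations of each group) are notation introduced here for illustration.

```latex
d = \frac{\bar{x}_1 - \bar{x}_2}{s_p},
\qquad
s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}
```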

Cohen’s D is interpreted as follows:

If the value of Cohen’s D is around 0, it indicates no or minimal difference between the two groups.
If the value is positive, it indicates that the mean of the first group is higher than that of the second group, and vice versa if it’s negative.
The magnitude indicates the size of the difference: the larger the magnitude, the bigger the difference.

Let’s understand this with an example. If the mean of group A is 75 and the mean of group B is 80, and we assume a pooled standard deviation of 10, we can easily calculate Cohen’s D with the given formula. But why do we need it when we can already see that Group A has a lower mean than Group B? The answer is that this observation alone doesn’t tell you how large the difference is relative to the spread of the scores.

Hence, using the formula we see that in this case, Cohen’s D = -0.5.
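The arithmetic behind this value can be checked in a couple of lines. This is a minimal sketch using the summary statistics from the example; the variable names are chosen here for illustration.

```python
# Summary statistics from the worked example
mean_a = 75      # mean of group A
mean_b = 80      # mean of group B
pooled_sd = 10   # assumed pooled standard deviation

# Difference in means, expressed in standard deviation units
d = (mean_a - mean_b) / pooled_sd
print(d)  # -0.5
```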

This value of -0.5 indicates a moderate effect size: the mean of Group A lies half a standard deviation below the mean of Group B.
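Labels such as "moderate" usually follow Cohen's conventional benchmarks of roughly 0.2 (small), 0.5 (medium) and 0.8 (large). The sketch below turns those rules of thumb into a lookup. The function name `interpret_cohens_d` and the "negligible" label for values under 0.2 are assumptions made for this illustration, and the cut-offs are conventions rather than strict boundaries.

```python
def interpret_cohens_d(d):
    """Label an effect size using Cohen's conventional benchmarks.

    Hypothetical helper: the thresholds are rules of thumb, and the
    "negligible" label is an assumption made for this example.
    """
    magnitude = abs(d)  # the sign only indicates direction
    if magnitude < 0.2:
        return "negligible"
    if magnitude < 0.5:
        return "small"
    if magnitude < 0.8:
        return "medium"
    return "large"

print(interpret_cohens_d(-0.5))  # medium
```

What counts as a meaningful effect depends on the field, so these labels are best treated as a starting point.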

In simple terms, effect size helps us understand not just whether there’s a difference between groups or variables, but also how big that difference is.

Calculating Cohen’s d in Python

In this section, we will implement this measure in Python. We will initialize two arrays as inputs for two different groups and calculate their means and sample standard deviations (ddof=1) using the NumPy library. Then, after calculating the pooled standard deviation, we will calculate Cohen’s D. The means and standard deviations come from NumPy’s built-in functions, so only the pooling step and the final division are written by hand.

# Import NumPy for its built-in statistical functions
import numpy as np

def cohens_d(group1, group2):
    # Calculating means of the two groups
    mean1, mean2 = np.mean(group1), np.mean(group2)
    
    # Calculating pooled standard deviation
    std1, std2 = np.std(group1, ddof=1), np.std(group2, ddof=1)
    n1, n2 = len(group1), len(group2)
    pooled_std = np.sqrt(((n1 - 1) * std1 ** 2 + (n2 - 1) * std2 ** 2) / (n1 + n2 - 2))
    
    # Calculating Cohen's d
    d = (mean1 - mean2) / pooled_std
    
    return d

# Example data for two groups
group1 = np.array([5, 7, 9, 11, 13])
group2 = np.array([6, 8, 10, 12, 14])

# Calculating Cohen's d
effect_size = cohens_d(group1, group2)
print("Cohen's d:", effect_size)

The output would be:

Cohen's d: -0.31622776601683794
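Two properties mentioned earlier can be checked with the same data: swapping the groups flips the sign, and rescaling both groups by the same factor leaves the value unchanged. The sketch below redefines `cohens_d` in a compact form so that it runs on its own; the factor of 100 is an arbitrary choice for illustration.

```python
import numpy as np

def cohens_d(group1, group2):
    # Same calculation as above, using sample variances (ddof=1)
    n1, n2 = len(group1), len(group2)
    var1, var2 = np.var(group1, ddof=1), np.var(group2, ddof=1)
    pooled_std = np.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (np.mean(group1) - np.mean(group2)) / pooled_std

group1 = np.array([5, 7, 9, 11, 13])
group2 = np.array([6, 8, 10, 12, 14])

d = cohens_d(group1, group2)
d_swapped = cohens_d(group2, group1)                 # groups in reverse order
d_rescaled = cohens_d(group1 * 100, group2 * 100)    # same data on a larger scale

print("Sign flips when groups are swapped:", np.isclose(d, -d_swapped))
print("Unchanged after rescaling:", np.isclose(d, d_rescaled))
```

Because the difference in means and the pooled standard deviation scale together, the ratio stays the same, which is why Cohen’s D can compare groups regardless of their measurement scale.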


Summary

Cohen’s D is applied in various fields, including psychological studies, the social sciences, and business and organizational research. In this article, we covered what an effect size measure is and why Cohen’s D is one of the most widely used. Its value is unbounded: the sign shows which group has the higher mean, and the magnitude shows the degree of difference between the two groups. In Python, NumPy’s built-in functions make it easy to calculate Cohen’s D with a short user-defined function.