In this tutorial we will explore Bartlett’s test for equality of variances and its application in Python.

Introduction

A lot of statistical tests and procedures assume normality of the data and equal variances.

These conditions often determine whether a researcher can use a parametric or non-parametric test, how they formulate their hypotheses, and much more.

Bartlett’s test is a popular test in inferential statistics for checking equality of variances in data drawn from normal distributions.

If the data follows a non-normal distribution, consider using Levene’s test instead, since Bartlett’s test is very sensitive to deviations from normality.
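As a rough illustration of that advice, the sketch below runs both tests on skewed synthetic samples. The data, the seed, and the variable names are made up for this example and are not the tutorial’s dataset; `center="median"` is the default setting of `scipy.stats.levene` and is written out only for clarity.

```python
import numpy as np
from scipy.stats import bartlett, levene

# Synthetic, clearly non-normal (exponential) samples, assumed for illustration
rng = np.random.default_rng(42)
g1 = rng.exponential(scale=2.0, size=40)
g2 = rng.exponential(scale=2.0, size=40)
g3 = rng.exponential(scale=2.0, size=40)

# Bartlett's test assumes normality, so its p-value is less trustworthy here
b_stat, b_p = bartlett(g1, g2, g3)

# Levene's test with the median as center is more robust to non-normality
l_stat, l_p = levene(g1, g2, g3, center="median")

print(f"Bartlett: statistic={b_stat:.3f}, p-value={b_p:.3f}")
print(f"Levene:   statistic={l_stat:.3f}, p-value={l_p:.3f}")
```

Both functions take the samples as separate positional arguments and return the test statistic and the p-value.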

What is Bartlett’s test?

Bartlett’s test is used to test for equality of variances of a variable calculated for two or more groups (samples).

How do you interpret Bartlett’s test?

If the p-value of Bartlett’s test is less than the significance level (for example 0.05), we reject the null hypothesis of equal variances and conclude that at least two groups have different variances.


To continue following this tutorial we will need the following Python libraries: pandas and scipy.

If you don’t have them installed, please open “Command Prompt” (on Windows) and install them using the following commands:


pip install pandas
pip install scipy


Sample data

To follow the example and the Python implementation sections of this tutorial, you will need some data to work with.

Throughout the examples in this tutorial, the data from the .csv file available below is used.

The data contains 80 observations of reactions to new treatment across three groups: ‘control’, ‘treatment1’, and ‘treatment2’.

The before-treatment data follows a non-normal distribution.


Bartlett’s test explained

As mentioned earlier, the assumption of equality of variances is important in statistical analysis and often shapes the researcher’s procedures when working on measuring the outcomes of experiments and data analysis.


Bartlett’s test hypotheses

The null hypothesis of Bartlett’s test is that all groups have equal variances.

The alternative hypothesis of Bartlett’s test is that at least one pair of groups has unequal variances.

Given a variable \(Y\) of size \(N\), which is split into \(k\) groups:

$$H_0 : \sigma_1^2 = \sigma_2^2 = \ldots = \sigma_k^2$$

$$H_1 : \sigma_i^2 \neq \sigma_j^2 \text{ for at least one pair } (i, j)$$

where:

  • \(k\) : total number of groups (\(\geq\)2)
  • \(i\) : one of the \(k\) groups
  • \(j\) : one of the \(k\) groups
  • \((i, j)\) : a pair of groups from \(k\) groups
  • \(i \neq j\) : two groups are not the same group

Bartlett’s test statistic

Bartlett’s test statistic is given by:

$$T = \frac{(N-k)\ln s^2_p - \sum_{i=1}^k (n_i - 1) \ln s^2_i}{1 + \frac{1}{3(k-1)} \left(\sum_{i=1}^k \frac{1}{n_i-1} - \frac{1}{N-k}\right)}$$

where:

  • \(N\) : total number of observations
  • \(n_i\) : number of observations in \(i\)-th group
  • \(k\) : number of groups
  • \(s^2_i\) : variance of \(i\)-th group
  • \(s^2_p\) : pooled variance, \(s^2_p = \frac{1}{N-k} \sum_{i=1}^k (n_i - 1) s^2_i\)
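To make the quantities above concrete, here is a minimal sketch that computes the statistic by hand with NumPy and compares it with `scipy.stats.bartlett`. It uses the standard form of the statistic, in which \(\frac{1}{N-k}\) is subtracted from \(\sum_{i=1}^k \frac{1}{n_i-1}\) in the denominator. The helper name `bartlett_statistic` and the synthetic samples are assumptions made for illustration only.

```python
import numpy as np
from scipy.stats import bartlett


def bartlett_statistic(*samples):
    """Compute Bartlett's T by hand (illustrative helper, not part of scipy)."""
    k = len(samples)                                   # number of groups
    n = np.array([len(s) for s in samples], dtype=float)
    N = n.sum()                                        # total number of observations
    # Unbiased sample variance of each group (ddof=1)
    s2 = np.array([np.var(s, ddof=1) for s in samples])
    # Pooled variance: group variances weighted by their degrees of freedom
    s2_p = np.sum((n - 1) * s2) / (N - k)
    numerator = (N - k) * np.log(s2_p) - np.sum((n - 1) * np.log(s2))
    denominator = 1 + (np.sum(1 / (n - 1)) - 1 / (N - k)) / (3 * (k - 1))
    return numerator / denominator


# Synthetic samples of unequal size, assumed for illustration
rng = np.random.default_rng(0)
a = rng.normal(20, 4, size=20)
b = rng.normal(20, 4, size=25)
c = rng.normal(20, 6, size=30)

manual_stat = bartlett_statistic(a, b, c)
scipy_stat, _ = bartlett(a, b, c)
print(manual_stat, scipy_stat)
```

The two printed values should agree up to floating-point rounding.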

After Bartlett’s test statistic (\(T\)) is calculated, it should be compared against the upper critical value of the chi-square distribution with \(k-1\) degrees of freedom at significance level \(\alpha\):

$$\chi^2_{1-\alpha, k-1}$$

Therefore, we reject the null hypothesis of equal variances when:

$$T > \chi^2_{1-\alpha, k-1}$$
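The upper critical value can be looked up with `scipy.stats.chi2.ppf`, the inverse CDF of the chi-square distribution. The sketch below assumes a significance level of 0.05 and a few example values of \(k\), chosen only for illustration.

```python
from scipy.stats import chi2

alpha = 0.05  # significance level (assumed for illustration)

# Upper critical value for several numbers of groups k
critical_values = {}
for k in (2, 3, 4):
    # ppf(1 - alpha) returns the point with probability alpha to its right
    critical_values[k] = chi2.ppf(1 - alpha, df=k - 1)
    print(f"k={k}: reject H0 when T > {critical_values[k]:.3f}")
```

The critical value grows with the number of groups, so a larger statistic is needed to reject the null hypothesis when more groups are compared.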


Bartlett’s test example in Python

In order to see Bartlett’s test in practice and its application in Python, we will use the sample data file mentioned in one of the previous sections.

First, import the required dependencies:


import pandas as pd
from scipy.stats import bartlett

Then read the provided .csv file into a pandas DataFrame and print the first few rows:


df = pd.read_csv('data_bartletts_test.csv')

print(df.head())

And you should get:

     group  before_treatment  after_treatment
0  control              27.9             33.8
1  control              16.8              9.3
2  control              27.2             23.4
3  control              12.5             19.9
4  control              14.4             16.0

You can compute some group summary statistics to understand the data better:


# Mean and sample variance of each measurement, per group
df_agg = (
    df.groupby("group")
    .agg(
        avg_bef_tr=("before_treatment", "mean"),
        var_bef_tr=("before_treatment", "var"),
        avg_aft_tr=("after_treatment", "mean"),
        var_aft_tr=("after_treatment", "var"),
    )
    .reset_index()
)

print(df_agg)

And you should get:

        group  avg_bef_tr  var_bef_tr  avg_aft_tr  var_aft_tr
0     control      20.145   18.878436      19.825   28.825513
1  treatment1      19.210   17.007263      15.475    4.649342
2  treatment2      21.510   19.673579      20.315   15.141458

Here you can see that “var_bef_tr” (variance before treatment) is similar across the 3 groups: 18.88, 17.01, 19.67.

The differences are small enough that we would not expect them to be significant, but to check this statistically we will perform Bartlett’s test in Python.


We will need to create variables that store the observations for each group:


control_group = df[df['group']=='control']['before_treatment']

treatment1_group = df[df['group']=='treatment1']['before_treatment']

treatment2_group = df[df['group']=='treatment2']['before_treatment']
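With more groups, writing one filter per group gets repetitive. As an alternative sketch, the samples can be collected with `groupby` and unpacked into `bartlett`. The small `toy` DataFrame below is made up so the snippet runs on its own; it only mimics the column layout of the tutorial’s file.

```python
import pandas as pd
from scipy.stats import bartlett

# Tiny made-up DataFrame with the same column layout (illustration only)
toy = pd.DataFrame({
    "group": ["control"] * 4 + ["treatment1"] * 4 + ["treatment2"] * 4,
    "before_treatment": [27.9, 16.8, 27.2, 12.5,
                         19.1, 22.4, 15.0, 20.3,
                         21.7, 18.2, 25.6, 23.9],
})

# One array of observations per group, whatever the number of groups
samples = [g["before_treatment"].to_numpy() for _, g in toy.groupby("group")]
stat_grouped, p_grouped = bartlett(*samples)

# Same test with explicit per-group filtering, for comparison
stat_explicit, p_explicit = bartlett(
    toy.loc[toy["group"] == "control", "before_treatment"],
    toy.loc[toy["group"] == "treatment1", "before_treatment"],
    toy.loc[toy["group"] == "treatment2", "before_treatment"],
)

print(stat_grouped, stat_explicit)
```

Both calls pass the same three samples, so they should produce the same statistic and p-value.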

Finally, perform Bartlett’s test in Python:


stat, p_value = bartlett(control_group, treatment1_group, treatment2_group)

print(f"Bartlett's test statistic: {stat}")
print(f"P-value: {p_value}")

You should get:

Bartlett's test statistic: 0.10660625260772809
P-value: 0.9480925771212662

Since the p-value is greater than 0.05, we fail to reject the null hypothesis: there is no evidence that the variances differ across the 3 groups.
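The same decision can be reached through the critical value approach. The sketch below takes the statistic reported above, assumes a significance level of 0.05, and uses the chi-square distribution with \(k-1 = 2\) degrees of freedom both to recover the p-value and to obtain the critical value.

```python
from scipy.stats import chi2

stat = 0.10660625260772809  # Bartlett's test statistic reported above
k = 3                       # control, treatment1, treatment2
alpha = 0.05                # significance level (assumed)

# p-value: probability of a chi-square value at least as large as stat
p_from_chi2 = chi2.sf(stat, df=k - 1)

# Upper critical value and the resulting decision
critical_value = chi2.ppf(1 - alpha, df=k - 1)
reject = stat > critical_value

print(f"p-value: {p_from_chi2:.6f}")
print(f"critical value: {critical_value:.3f}")
print(f"reject H0: {reject}")
```

The statistic is far below the critical value, which agrees with the large p-value returned by `bartlett`.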

The official documentation for scipy.stats.bartlett describes the function’s parameters in more detail.


Conclusion

In this article we discussed how to perform Bartlett’s test for equality of variances in Python using the scipy library.

Feel free to leave comments below if you have any questions or have suggestions for some edits and check out more of my Statistics articles.