Basic: Plotting the distribution of missing values

UpSet plots are often used to show which variables are missing together.

Passing the callable pd.isna as the indicators argument of from_indicators() is an easy way to categorise each record by the variables that are missing in it.

import pandas as pd
from matplotlib import pyplot as plt

from upsetplot import from_indicators, plot

TITANIC_URL = (
    "https://raw.githubusercontent.com/datasciencedojo/datasets/master/titanic.csv"  # noqa
)
data = pd.read_csv(TITANIC_URL)

plot(from_indicators(indicators=pd.isna, data=data), show_counts=True)
plt.show()
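
To get a feel for the categories the plot is counting, the following is a rough pandas-only sketch of the same idea. It is not how upsetplot is implemented. The toy table and its values are hypothetical stand-ins for the Titanic data, chosen so the example runs without a download.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the Titanic table.
toy = pd.DataFrame(
    {
        "Age": [22.0, np.nan, 26.0, np.nan],
        "Cabin": [np.nan, "C85", np.nan, np.nan],
        "Fare": [7.25, 71.28, 7.92, 8.05],
    }
)

# One boolean column per variable: True where the value is missing.
missing = toy.isna()

# Count how many records share each combination of missing variables.
pattern_counts = missing.groupby(list(missing.columns)).size()
print(pattern_counts)
```

Each row of the printed result corresponds to one combination of missing variables, which is roughly what a bar in the UpSet plot summarises.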
