API Reference

Plotting

upsetplot.plot(data, fig=None, **kwargs)[source]

Make an UpSet plot of data on fig

Parameters:
data : pandas.Series or pandas.DataFrame

Values for each set to plot. The index should be a MultiIndex in which each level is binary and indicates membership in one set. If data is a DataFrame, sum_over must be a string or False.

fig : matplotlib.figure.Figure, optional

Defaults to a new figure.

kwargs

Other arguments for UpSet

Returns:
subplots : dict of matplotlib.axes.Axes

Keys are ‘matrix’, ‘intersections’, ‘totals’ and ‘shading’.

class upsetplot.UpSet(data, orientation='horizontal', sort_by='degree', sort_sets_by='cardinality', sum_over=None, facecolor='black', with_lines=True, element_size=32, intersection_plot_elements=6, totals_plot_elements=2, show_counts='')[source]

Manage the data and drawing for a basic UpSet plot

Primary public method is plot().

Parameters:
data : pandas.Series or pandas.DataFrame

Values for each set to plot. The index should be a MultiIndex in which each level is binary and indicates membership in one set. If data is a DataFrame, sum_over must be a string or False.

orientation : {‘horizontal’ (default), ‘vertical’}

If horizontal, intersections are listed from left to right.

sort_by : {‘cardinality’, ‘degree’}

If ‘cardinality’, set intersections are listed from largest to smallest value. If ‘degree’, they are listed in order of the number of sets intersected.

sort_sets_by : {‘cardinality’, None}

Whether to sort the overall sets by total cardinality, or leave them in the provided order.

sum_over : str, False or None (default)

Must be specified when data is a DataFrame. If False, the intersection plot will show the count of each subset. Otherwise, it shows the sum of the specified field.

facecolor : str

Color for bar charts and dots.

with_lines : bool

Whether to show lines joining dots in the matrix, to mark multiple sets being intersected.

element_size : float or None

Side length of each matrix element, in points. If None, the size is estimated to fit the figure.

intersection_plot_elements : int

The intersections plot should be large enough to fit this many matrix elements.

totals_plot_elements : int

The totals plot should be large enough to fit this many matrix elements.

show_counts : bool or str, default=False

Whether to label the intersection size bars with the cardinality of the intersection. When a string, this formats the number. For example, ‘%d’ is equivalent to True.
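
The sum_over, sort_by and show_counts options above can be pictured with plain Python. The following is a minimal sketch of the described semantics, not the library's implementation: the records are made up, the variable names are illustrative, and ascending order for ‘degree’ is an assumption, since the text does not state the direction.

```python
from collections import Counter, defaultdict

# Hypothetical records: (membership in (set1, set2, set3), value of some field).
records = [
    ((True, False, False), 2.0),
    ((True, False, False), 3.0),
    ((True, True, False), 10.0),
    ((True, True, True), 1.0),
    ((True, True, True), 1.5),
    ((True, True, True), 0.5),
]

# sum_over=False: each bar shows the number of records in the subset.
counts = Counter(membership for membership, _ in records)

# sum_over='<field>': each bar shows the sum of that field within the subset.
sums = defaultdict(float)
for membership, value in records:
    sums[membership] += value

# sort_by='cardinality': largest value first.
by_cardinality = sorted(counts, key=lambda m: counts[m], reverse=True)

# sort_by='degree': ordered by how many sets are intersected
# (ascending order is assumed here).
by_degree = sorted(counts, key=sum)

# show_counts='%d': each bar label is the count formatted as an integer.
labels = ['%d' % counts[m] for m in by_cardinality]
```

Note that the two orderings can disagree: the three-way intersection is the largest subset in this data, so it comes first by cardinality but last by degree.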

Methods

add_catplot(self, kind[, value, elements])    Add a seaborn catplot over subsets when plot() is called.
make_grid(self[, fig])                        Get a SubplotSpec for each Axes, accounting for label text width
plot(self[, fig])                             Draw all parts of the plot onto fig or a new figure
plot_intersections(self, ax)                  Plot bars indicating intersection size
plot_matrix(self, ax)                         Plot the matrix of intersection indicators onto ax
plot_totals(self, ax)                         Plot bars indicating total set size
plot_shading

add_catplot(self, kind, value=None, elements=3, **kw)[source]

Add a seaborn catplot over subsets when plot() is called.

Parameters:
kind : str

One of {“point”, “bar”, “strip”, “swarm”, “box”, “violin”, “boxen”}

value : str, optional

Column name for the value to plot (i.e. y if orientation=’horizontal’), required if data is a DataFrame.

elements : int, default=3

Size of the axes counted in number of matrix elements.

**kw : dict

Additional keywords to pass to seaborn.catplot().

Our implementation automatically determines ‘ax’, ‘data’, ‘x’, ‘y’ and ‘orient’, so these are prohibited keys in kw.

Returns:
None

make_grid(self, fig=None)[source]

Get a SubplotSpec for each Axes, accounting for label text width

plot(self, fig=None)[source]

Draw all parts of the plot onto fig or a new figure

Parameters:
fig : matplotlib.figure.Figure, optional

Defaults to a new figure.

Returns:
subplots : dict of matplotlib.axes.Axes

Keys are ‘matrix’, ‘intersections’, ‘totals’ and ‘shading’.

plot_intersections(self, ax)[source]

Plot bars indicating intersection size

plot_matrix(self, ax)[source]

Plot the matrix of intersection indicators onto ax

plot_totals(self, ax)[source]

Plot bars indicating total set size

Dataset loading and generation

upsetplot.from_memberships(memberships, data=None)[source]

Load data where each sample has a collection of set names

The output should be suitable for passing to UpSet or plot.

Parameters:
memberships : sequence of collections of strings

Each element corresponds to a data point, indicating the sets it is a member of. Each set is named by a string.

data : Series-like or DataFrame-like, optional

If given, the index of set memberships is attached to this data. It must have the same length as memberships. If not given, the series will contain the value 1.

Returns:
DataFrame or Series

data is returned with its index indicating set membership. It will be a Series if data is a Series or 1d numeric array. The index will have levels ordered by set names.
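
The conversion described above can be pictured without pandas. The following is a plain-Python sketch of the idea, not the library's implementation, and the variable names are illustrative: collect the set names, order them, and give each sample one boolean per set.

```python
# Each element lists the sets that one sample belongs to.
memberships = [
    ['set1', 'set3'],
    ['set2', 'set3'],
    ['set1'],
    [],
]

# Index levels are ordered by set name.
set_names = sorted(set().union(*memberships))

# One row of membership indicators per sample, in level order.
rows = [tuple(name in members for name in set_names)
        for members in memberships]

# When no data is given, every sample carries the value 1.
ones = [1] * len(memberships)
```

Each tuple in rows plays the role of one entry of the returned index, and ones plays the role of the default values.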

Examples

>>> from upsetplot import from_memberships
>>> from_memberships([
...     ['set1', 'set3'],
...     ['set2', 'set3'],
...     ['set1'],
...     []
... ])  # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE
set1   set2   set3
True   False  True     1
False  True   True     1
True   False  False    1
False  False  False    1
Name: ones, dtype: ...
>>> # now with data:
>>> import numpy as np
>>> from_memberships([
...     ['set1', 'set3'],
...     ['set2', 'set3'],
...     ['set1'],
...     []
... ], data=np.arange(12).reshape(4, 3))  # doctest: +NORMALIZE_WHITESPACE
                   0   1   2
set1  set2  set3
True  False True   0   1   2
False True  True   3   4   5
True  False False  6   7   8
False False False  9  10  11
upsetplot.generate_data(seed=0, n_samples=10000, n_sets=3, aggregated=False)[source]