API Reference
Plotting
upsetplot.plot(data, fig=None, **kwargs)
Make an UpSet plot of data on fig.
Parameters:
- data : pandas.Series or pandas.DataFrame
Values for each set to plot. Should have a multi-index where each level is binary, corresponding to set membership. If a DataFrame, sum_over must be a string or False.
- fig : matplotlib.figure.Figure, optional
Defaults to a new figure.
- kwargs
Other arguments for UpSet.
Returns:
- subplots : dict of matplotlib.axes.Axes
Keys are ‘matrix’, ‘intersections’, ‘totals’, ‘shading’
class upsetplot.UpSet(data, orientation='horizontal', sort_by='degree', sort_sets_by='cardinality', sum_over=None, facecolor='black', with_lines=True, element_size=32, intersection_plot_elements=6, totals_plot_elements=2, show_counts='')
Manage the data and drawing for a basic UpSet plot.
Primary public method is plot().
Parameters:
- data : pandas.Series or pandas.DataFrame
Values for each set to plot. Should have a multi-index where each level is binary, corresponding to set membership. If a DataFrame, sum_over must be a string or False.
- orientation : {‘horizontal’ (default), ‘vertical’}
If horizontal, intersections are listed from left to right.
- sort_by : {‘cardinality’, ‘degree’}
If ‘cardinality’, set intersections are listed from largest to smallest value. If ‘degree’, they are listed in order of the number of sets intersected.
- sort_sets_by : {‘cardinality’, None}
Whether to sort the overall sets by total cardinality, or leave them in the provided order.
- sum_over : str, False or None (default)
Must be specified when data is a DataFrame. If False, the intersection plot will show the count of each subset. Otherwise, it shows the sum of the specified field.
- facecolor : str
Color for bar charts and dots.
- with_lines : bool
Whether to show lines joining dots in the matrix, to mark multiple sets being intersected.
- element_size : float or None
Side length in points. If None, the size is estimated to fit the figure.
- intersection_plot_elements : int
The intersections plot should be large enough to fit this many matrix elements.
- totals_plot_elements : int
The totals plot should be large enough to fit this many matrix elements.
- show_counts : bool or str, default=False
Whether to label the intersection size bars with the cardinality of the intersection. When a string, this formats the number. For example, ‘%d’ is equivalent to True.
Methods
- add_catplot(self, kind[, value, elements]): Add a seaborn catplot over subsets when plot() is called.
- make_grid(self[, fig]): Get a SubplotSpec for each Axes, accounting for label text width.
- plot(self[, fig]): Draw all parts of the plot onto fig or a new figure.
- plot_intersections(self, ax): Plot bars indicating intersection size.
- plot_matrix(self, ax): Plot the matrix of intersection indicators onto ax.
- plot_totals(self, ax): Plot bars indicating total set size.
- plot_shading
add_catplot(self, kind, value=None, elements=3, **kw)
Add a seaborn catplot over subsets when plot() is called.
Parameters:
- kind : str
One of {“point”, “bar”, “strip”, “swarm”, “box”, “violin”, “boxen”}
- value : str, optional
Column name for the value to plot (i.e. y if orientation='horizontal'), required if data is a DataFrame.
- elements : int, default=3
Size of the axes counted in number of matrix elements.
- **kw : dict
Additional keywords to pass to seaborn.catplot(). Our implementation automatically determines ‘ax’, ‘data’, ‘x’, ‘y’ and ‘orient’, so these are prohibited keys in kw.
Returns:
- None
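Because add_catplot reserves the keys listed above, it can help to validate user-supplied keywords before forwarding them. The helper below is hypothetical (check_catplot_kwargs is not part of upsetplot) and only illustrates the documented restriction; how the library itself reacts to a prohibited key is not shown here.

```python
# Keys that the documentation says add_catplot determines automatically.
PROHIBITED_CATPLOT_KEYS = frozenset({"ax", "data", "x", "y", "orient"})


def check_catplot_kwargs(kw):
    """Hypothetical helper: reject keywords that add_catplot reserves."""
    clashes = sorted(PROHIBITED_CATPLOT_KEYS.intersection(kw))
    if clashes:
        raise ValueError("reserved catplot keys: " + ", ".join(clashes))
    return dict(kw)  # return a copy so the caller's dict is left untouched


safe_kw = check_catplot_kwargs({"color": "blue"})

try:
    check_catplot_kwargs({"x": "height", "color": "blue"})
except ValueError as exc:
    error_message = str(exc)
```

The validated dict could then be forwarded as upset.add_catplot(kind="strip", value="height", **safe_kw), where upset is an UpSet built from a DataFrame and "height" is an assumed column name.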
Dataset loading and generation
upsetplot.from_memberships(memberships, data=None)
Load data where each sample has a collection of set names.
The output should be suitable for passing to UpSet or plot.
Parameters:
- memberships : sequence of collections of strings
Each element corresponds to a data point, indicating the sets it is a member of. Each set is named by a string.
- data : Series-like or DataFrame-like, optional
If given, the index of set memberships is attached to this data. It must have the same length as memberships. If not given, the series will contain the value 1.
Returns:
- DataFrame or Series
data is returned with its index indicating set membership. It will be a Series if data is a Series or 1d numeric array. The index will have levels ordered by set names.
Examples
>>> from upsetplot import from_memberships
>>> from_memberships([
...     ['set1', 'set3'],
...     ['set2', 'set3'],
...     ['set1'],
...     []
... ])  # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE
set1   set2   set3
True   False  True     1
False  True   True     1
True   False  False    1
False  False  False    1
Name: ones, dtype: ...
>>> # now with data:
>>> import numpy as np
>>> from_memberships([
...     ['set1', 'set3'],
...     ['set2', 'set3'],
...     ['set1'],
...     []
... ], data=np.arange(12).reshape(4, 3))  # doctest: +NORMALIZE_WHITESPACE
                   0   1   2
set1  set2  set3
True  False True   0   1   2
False True  True   3   4   5
True  False False  6   7   8
False False False  9  10  11