Description
Problem
Currently, if one wants to draw multiple categories of bars side-by-side as in https://matplotlib.org/devdocs/gallery/lines_bars_and_markers/barchart.html, one has to calculate the bar positions manually. This is a nuisance and too low-level for any user-facing interface. In fact, when asked, I recommend using the pandas plotting functions for this if possible, which is really embarrassing.
There have been stalled attempts to do this (issue #10610, PR #11048). Additionally, this often gets related to a function for stacked bars (#14086).
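To make the complaint concrete, the manual position calculation looks roughly like the sketch below. The helper name `manual_bar_positions`, the `group_width=0.8` default, and the data values are illustrative assumptions, not taken from the gallery example.

```python
def manual_bar_positions(n_labels, n_groups, group_width=0.8):
    """Return the bar width and, per group, the x-positions of its bar centers.

    Labels sit at integer positions 0..n_labels-1. The bars belonging to one
    label share a total width of ``group_width`` centered on that position.
    """
    bar_width = group_width / n_groups
    positions = []
    for g in range(n_groups):
        # Offset of this group's bar center relative to the label position.
        offset = (g - (n_groups - 1) / 2) * bar_width
        positions.append([i + offset for i in range(n_labels)])
    return bar_width, positions


# Illustrative values only.
labels = ['G1', 'G2', 'G3']
tea_means = [20, 34, 30]
coffee_means = [25, 32, 34]

bar_width, (tea_x, coffee_x) = manual_bar_positions(len(labels), 2)

# With Matplotlib, every group then still needs its own call, e.g.
#   ax.bar(tea_x, tea_means, bar_width, label='Tea')
#   ax.bar(coffee_x, coffee_means, bar_width, label='Coffee')
#   ax.set_xticks(range(len(labels)), labels)
```

All of this bookkeeping is what a dedicated function would hide from the user.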
Proposed solution
I'd like to pick up this topic and come up with a reasonable API. The topic is complex, and one can easily get lost in the details. For the design procedure, I take a bottom-up approach, starting with a basic function grouped_bar() that exposes only the minimal functionality needed to get the plot done. I then intend to add further parameters one by one as they fit in. So don't be concerned that the first proposal here is quite basic.
Terminology
I'll use label for the x-values, i.e. 'G1' .. 'G5' in the above example, and group for the categories, i.e. 'Tea'/'Coffee'.
For now: Only vertical orientation
I'll limit the discussion here to 'vertical'. We can have a separate discussion on whether we want to add an orientation parameter or make a grouped_barh. Both are technically easy, and the choice is only an API design decision that's orthogonal to the rest of the API.
For now: Only grouped layout, no stacked layout
To keep things simple, I'll limit myself to grouped for now, because:
- Stacked is somewhat simpler than grouped, as you only need to pass the bottom values in multiple calls. For two groups, that's just the heights of the first; for more than two, it's a cumsum.
- We may build a separate stacked_bar() function if the first bullet point is considered too cumbersome.
- It's conceivable to unite both in one function, as proposed in "Feature: Plot multiple bars with one call" (#11048) and realized in DataFrame.plot.bar, but that needs careful additional consideration.
Either way, let's defer stacked bars to later.
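For reference, the "bottom values via cumsum" approach from the first bullet can be sketched as follows. The helper `stacked_bottoms` is hypothetical and the data is illustrative; the sketch assumes non-negative heights.

```python
def stacked_bottoms(heights):
    """Return the ``bottom`` values for each group of a stacked bar chart.

    ``heights`` is a list of equally long sequences, one per group. The
    bottom of group k is the element-wise sum of the heights of groups 0..k-1.
    """
    running = [0] * len(heights[0])
    bottoms = []
    for group in heights:
        bottoms.append(list(running))
        # Running (cumulative) sum over the groups, per label.
        running = [r + h for r, h in zip(running, group)]
    return bottoms


# Illustrative values only: three groups, three labels.
heights = [[20, 34, 30], [25, 32, 34], [5, 3, 4]]
bottoms = stacked_bottoms(heights)

# With Matplotlib, each group is then drawn with its own call, e.g.
#   for h, b in zip(heights, bottoms):
#       ax.bar(x, h, bottom=b)
```

For two groups this reduces to using the first group's heights as the second group's bottom, as described above.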
Minimal API
We want to be able to rewrite https://matplotlib.org/devdocs/gallery/lines_bars_and_markers/barchart.html as
grouped_bar(labels, [tea_means, coffee_means], group_labels=['Tea', 'Coffee'])
Thus, the minimal API is:
def grouped_bar(x, heights, *, group_labels=None):
    """
    Parameters
    ----------
    x : array-like of str
        The labels.
    heights : list of array-like
        An iterable of array-likes. The iteration runs over the groups.
        Each individual array-like holds the heights for all labels in that group.
    group_labels : array-like of str, optional
        The labels of the data groups.
    """
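One possible way such a function could be implemented on top of the existing bar() is sketched below. This is only an illustration under assumptions: the explicit `ax` argument and the `group_width` parameter are not part of the proposed signature, and the actual implementation may look quite different.

```python
def grouped_bar(ax, x, heights, *, group_labels=None, group_width=0.8):
    """Draw grouped vertical bars on ``ax`` (illustrative sketch only).

    ``ax`` and ``group_width`` are assumptions made so that this can be
    written as a standalone function; they are not part of the proposal.
    """
    n_groups = len(heights)
    if group_labels is None:
        group_labels = [None] * n_groups
    bar_width = group_width / n_groups
    label_pos = list(range(len(x)))

    containers = []
    for g, (hs, group_label) in enumerate(zip(heights, group_labels)):
        # Shift each group so that all bars of one label are centered on it.
        offset = (g - (n_groups - 1) / 2) * bar_width
        xs = [p + offset for p in label_pos]
        containers.append(ax.bar(xs, hs, width=bar_width, label=group_label))

    # Put one tick per label at the center of its bars.
    ax.set_xticks(label_pos)
    ax.set_xticklabels(x)
    if any(lbl is not None for lbl in group_labels):
        ax.legend()
    return containers


# Usage with Matplotlib (assuming it is installed):
#   import matplotlib.pyplot as plt
#   fig, ax = plt.subplots()
#   grouped_bar(ax, labels, [tea_means, coffee_means],
#               group_labels=['Tea', 'Coffee'])
#   plt.show()
```

Taking the Axes explicitly keeps the sketch independent of where the function would eventually live (Axes method, pyplot function, or both).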
I'll soon expand on the minimal API, answering a lot of questions from #11048 (comment). But before that, please speak up in case you have fundamental concerns with adding such functionality at all or with the bottom-up design approach. OTOH if you think this is worth pursuing, please give a 👍.