Description
Bug report
When the same data is passed to violinplot as a nested list or as a NumPy array, the results are not the same.
Code for reproduction
import matplotlib.pyplot as plt
import numpy as np
data = [ [1, 2, 3] , [4, 5, 6] ]
data_np = np.array(data)
f, (ax1, ax2) = plt.subplots(2)
ax1.violinplot(data)
ax1.set_title('list')
ax2.set_title('numpy array')
ax2.violinplot(data_np)  # same data, but as a NumPy array
plt.show()
Actual outcome
Expected outcome
Both plots should be the same, shouldn't they?
Actually, the docs do specify, in terms I did not understand at first, that it will
"Make a violin plot for each column of dataset or each vector in sequence dataset. "
It is clearer in the hist docs, where one can read at the end:
"Note that the ndarray form is transposed relative to the list form."
I think the difference between the two outcomes comes down to how one thinks about the data: as vectors in a matrix, or as lists in a list (which correspond to the columns of an array and the rows of an array, respectively).
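To make the difference concrete, here is a minimal sketch that counts the violins drawn for each input form. It assumes the behavior described in the quoted docs (one violin per column of an ndarray, one per inner sequence of a list) and relies on the `'bodies'` entry of the dictionary returned by `violinplot`. The `Agg` backend is only there so that the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

data = [[1, 2, 3], [4, 5, 6]]
data_np = np.array(data)  # shape (2, 3)

fig, ax = plt.subplots()

# violinplot returns a dict; its 'bodies' entry holds one artist per violin.
n_list = len(ax.violinplot(data)["bodies"])             # list of 2 sequences
n_array = len(ax.violinplot(data_np)["bodies"])         # array with 3 columns
n_transposed = len(ax.violinplot(data_np.T)["bodies"])  # array with 2 columns

print(n_list, n_array, n_transposed)
```

If the docs are read correctly, this should print `2 3 2`: the list gives one violin per inner list, the array gives one per column, and transposing the array (`data_np.T`) reproduces the list result.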
I discussed this with some colleagues who work with Matlab, and they do think of the vector as the base unit of a 2D matrix. For me, the base unit is a row of a 2D array.
When plotting [ [1,2,3], [4,5,6] ] with the plot function, we get 3 curves, so I think the matrix/vector way of thinking is predominant in matplotlib.
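The claim about plot can be checked with a short sketch like the one below. It assumes only that `Axes.plot` returns one `Line2D` per curve; the variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# A nested list is treated as a 2D array: each column becomes one curve.
lines = ax.plot([[1, 2, 3], [4, 5, 6]])

n_lines = len(lines)
# y-values of the first curve, converted to plain ints for readability
first_y = [int(v) for v in lines[0].get_ydata()]

print(n_lines, first_y)
```

Here the nested list yields three curves of two points each, and the first curve goes through the first column (1, then 4). So plot reads a nested list column-wise, unlike violinplot and hist.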
So should we change the behavior of violinplot and hist for lists so that they work the same as for arrays?
After writing this, I do believe that the best solution is to mention this specificity more clearly in the violinplot docs rather than change the code. I can do this, but I would like some feedback from different points of view first.
Best,
RP