Description
Issue with current documentation:
Please forgive this as-yet-unformed question, but I am struggling with the documentation for quantile. The documentation has this:
"""
Given a vector V of length n, the q-th quantile of V is the value q of the way from the minimum to the maximum in a sorted copy of V.
"""
However - it's not clear to me what this means.
Assume in what follows that V is sorted.
First, "minimum" and "maximum" would normally refer to min(V) and max(V), but I'm sure we actually mean the minimum and maximum index, so 0 and n - 1 in Python terms. So I think this is referring to the distance along the line starting at 0 and ending at n - 1, and therefore the "quantile positions" (I made up this term) corresponding to each value in V are:
qps = np.arange(n) / (n - 1)
Is that correct?
Is it correct to use the term "plotting positions" for my "quantile positions" qps?
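To make my reading concrete for the default method, here is a minimal sketch. It assumes NumPy 1.22 or later (for the method keyword); the sample V and the grid q are arbitrary choices of mine, not anything from the docstring. If my interpretation is right, method='linear' should behave like straight-line interpolation through the points (qps, V):

```python
import numpy as np

rng = np.random.default_rng(0)
V = np.sort(rng.normal(size=7))   # arbitrary sorted sample (my choice)
n = len(V)
qps = np.arange(n) / (n - 1)      # the "quantile positions" defined above

# At the positions themselves, the default method returns the samples.
assert np.allclose(np.quantile(V, qps), V)

# Between positions, it agrees with linear interpolation over (qps, V).
q = np.linspace(0, 1, 23)
linear_quantiles = np.quantile(V, q, method="linear")
interpolated = np.interp(q, qps, V)
assert np.allclose(linear_quantiles, interpolated)
```

This only checks the 'linear' method; it says nothing about whether the same positions apply to the other methods.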
Next we have this:
"""
The values and distances of the two nearest neighbors as well as the method parameter will determine the quantile if the normalized ranking does not match the location of q exactly.
"""
I was interested in the cases where the normalized ranking does match the location of q exactly, where I'm assuming the "location of q" means my qps above.
So I'm taking q for each element V[i] to be qps[i]. Is that right? If so, does np.quantile(V, qps[i], method=m) == V[i] hold for all valid values of m?
If so, it's easy to show that, for many of the methods, that isn't the case.
Just as an example:
import numpy as np

n = 4
V = np.arange(n)  # The values.
qps = np.arange(n) / (n - 1)  # Quantile positions
# linear is the default
assert np.allclose(  # Passes
    np.quantile(V, qps, method='linear'), V)
assert np.allclose(  # Fails
    np.quantile(V, qps, method='closest_observation'), V)
Have I interpreted these correctly? If so, or even if not, maybe the docstring needs a rewrite?
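In case it helps the discussion, here is a rough sketch that repeats the check across several methods. It assumes NumPy 1.22 or later. The (alpha, beta) pairs are my reading of the Notes section of the np.quantile docstring, so please treat them as an assumption; ALPHA_BETA and method_positions are names I made up for illustration. The idea is that each continuous method may have its own positions, (k - alpha) / (n + 1 - alpha - beta) for 1-based rank k, at which it returns V[k - 1] exactly:

```python
import numpy as np

n = 4
V = np.arange(n, dtype=float)       # sorted values
qps = np.arange(n) / (n - 1)        # positions k / (n - 1), k = 0..n-1

# (alpha, beta) per continuous method, as I read them from the docstring Notes.
ALPHA_BETA = {
    "interpolated_inverted_cdf": (0.0, 1.0),
    "hazen": (0.5, 0.5),
    "weibull": (0.0, 0.0),
    "linear": (1.0, 1.0),
    "median_unbiased": (1 / 3, 1 / 3),
    "normal_unbiased": (3 / 8, 3 / 8),
}


def method_positions(n, alpha, beta):
    """Hypothetical helper: q values at which a method should hit each sample."""
    k = np.arange(1, n + 1)         # 1-based ranks
    return (k - alpha) / (n + 1 - alpha - beta)


matches_at_qps = {}
matches_at_own_positions = {}
for method, (alpha, beta) in ALPHA_BETA.items():
    at_qps = np.quantile(V, qps, method=method)
    matches_at_qps[method] = bool(np.allclose(at_qps, V))
    own = method_positions(n, alpha, beta)
    at_own = np.quantile(V, own, method=method)
    matches_at_own_positions[method] = bool(np.allclose(at_own, V))

# The discontinuous method from the example above, for comparison.
matches_at_qps["closest_observation"] = bool(
    np.allclose(np.quantile(V, qps, method="closest_observation"), V)
)

for method, ok in matches_at_qps.items():
    print(f"{method:28s} matches V at qps: {ok}")
```

If this reading is right, qps coincides with the method-specific positions only for 'linear' (alpha = beta = 1), which would explain why the other methods do not return V at qps.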
Idea or request for content:
No response