Add an average callable parameter to sklearn.neighbors.KNeighborsRegressor #22626

Open
@raymondj-pace

Description


Describe the workflow you want to enable

Some kNN variants need to combine the neighbors' target values with something other than the arithmetic mean at the end of predict(). In some cases a geometric mean or an arithmetic-geometric mean is needed instead of the regular average.
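To make the difference concrete, here is a small sketch comparing the two averages on a made-up set of neighbor targets. The values are illustrative only, and the helpers come from Python's standard `statistics` module (3.8+), not from scikit-learn.

```python
from statistics import fmean, geometric_mean

# Made-up target values of three nearest neighbors (illustrative only).
neighbor_targets = [1.0, 4.0, 16.0]

arithmetic = fmean(neighbor_targets)          # (1 + 4 + 16) / 3
geometric = geometric_mean(neighbor_targets)  # (1 * 4 * 16) ** (1 / 3)

print(f"arithmetic mean: {arithmetic:.3f}")
print(f"geometric mean:  {geometric:.3f}")
```

The geometric mean is pulled far less by the largest target, which is the kind of behavior a custom averaging step would let users opt into.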

Describe your proposed solution

Add a parameter to the KNeighborsRegressor class that controls how the neighbors' target values are averaged at the end of the predict() method. Something like:

# Default behavior: the new parameter takes the string value 'mean', which keeps the current arithmetic mean.
# (manhattan_distance here is a user-defined metric callable, not shown.)

KNeighborsRegressor(n_neighbors=5, metric=manhattan_distance, weights='uniform', average='mean')

# Allow callables for the new average parameter...

def geometric_mean(l):
    # Multiply all values, then take the len(l)-th root.
    k = 1
    for i in l:
        k *= i

    return pow(k, 1 / len(l))

KNeighborsRegressor(n_neighbors=5, metric=manhattan_distance, weights='uniform', average=geometric_mean)
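
As a rough illustration of the "string or callable" semantics proposed here, the argument could be resolved to a function roughly as follows. This is only a sketch: `resolve_average` and `_NAMED_AVERAGES` are hypothetical names, not part of scikit-learn, and `'mean'` is the only string value the proposal mentions.

```python
from statistics import fmean

# Hypothetical registry of named averages; the proposal only mentions 'mean'.
_NAMED_AVERAGES = {"mean": fmean}


def resolve_average(average="mean"):
    """Return a callable that reduces a sequence of neighbor targets to one value."""
    if callable(average):
        return average
    try:
        return _NAMED_AVERAGES[average]
    except (KeyError, TypeError):
        raise ValueError(
            f"average must be 'mean' or a callable, got {average!r}"
        ) from None


default_avg = resolve_average()    # arithmetic mean
custom_avg = resolve_average(max)  # any callable is passed through unchanged
print(default_avg([2.0, 4.0]), custom_avg([2.0, 4.0]))
```

Keeping `'mean'` as the default would leave existing code unaffected, since nothing changes unless a caller passes `average` explicitly.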

# Another callable for the new average parameter...

import math
from statistics import mean

def arithmetic_geometric_mean(l):
    tolerance = 1e-10
    a0 = mean(l)
    g0 = geometric_mean(l)

    # Repeatedly average the arithmetic and geometric means until they agree.
    an, gn = (a0 + g0) / 2.0, math.sqrt(a0 * g0)
    while abs(an - gn) > tolerance:
        an, gn = (an + gn) / 2.0, math.sqrt(an * gn)

    return an

KNeighborsRegressor(n_neighbors=5, metric=manhattan_distance, weights='uniform', average=arithmetic_geometric_mean)
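
Until such a parameter exists, something close can be approximated with the existing `kneighbors()` method, which returns the indices of the nearest training samples. The sketch below assumes uniform weights and a 1-D target array; `predict_with_average` is a hypothetical helper, not a scikit-learn function, and the training data is made up.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor


def predict_with_average(model, y_train, X, average):
    """Apply `average` to the training targets of each query's nearest neighbors.

    Hypothetical helper: assumes uniform weights and 1-D targets.
    """
    y_train = np.asarray(y_train)
    # Shape (n_queries, n_neighbors): indices into the training set.
    neighbor_idx = model.kneighbors(X, return_distance=False)
    return np.array([average(y_train[row]) for row in neighbor_idx])


def geometric_mean(values):
    # Log-space form of the geometric mean; requires strictly positive values.
    return float(np.exp(np.mean(np.log(values))))


# Made-up training data (illustrative only).
X_train = np.array([[0.0], [1.0], [2.0], [10.0]])
y_train = np.array([1.0, 4.0, 16.0, 100.0])

model = KNeighborsRegressor(n_neighbors=3, metric="manhattan").fit(X_train, y_train)
pred = predict_with_average(model, y_train, np.array([[1.0]]), geometric_mean)
print(pred)
```

This loops over queries in Python, so it is slower than the built-in `predict()`, and it ignores the `weights` setting; a native `average` parameter would avoid both limitations.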

Describe alternatives you've considered, if relevant

Skip scikit-learn's KNeighborsRegressor and build my own regressor:

#
# Here avg_func is the callable that computes the average.
# my_mean and euclidean_distance are user-defined helpers (not shown).
#
def kNN_regressor(q, R, truth, k, avg_func=my_mean, distance_func=euclidean_distance, **kwargs):
    # Forward the optional Minkowski exponent to the distance function.
    extra = (kwargs['minkowski_param'],) if 'minkowski_param' in kwargs else ()

    # Seed the candidate set with the first k reference points.
    idx = list(range(k))
    dist = [distance_func(q, R[i], *extra) for i in range(k)]
    max_idx = dist.index(max(dist))

    # Replace the current farthest candidate whenever a closer point turns up.
    for i in range(k, len(R)):
        d = distance_func(q, R[i], *extra)
        if d < dist[max_idx]:
            dist[max_idx] = d
            idx[max_idx] = i
            max_idx = dist.index(max(dist))

    # Average the targets of the k nearest neighbors.
    return avg_func([truth[i] for i in idx])
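
For comparison, a more compact variant of the same idea can be written with the standard library's `heapq.nsmallest`. This is a sketch rather than the code above: `knn_regress` is a hypothetical name, the Minkowski option is omitted, and the data is made up.

```python
import heapq
import math
from statistics import fmean, geometric_mean


def euclidean_distance(a, b):
    # math.dist computes the Euclidean distance between two points (Python 3.8+).
    return math.dist(a, b)


def knn_regress(q, R, truth, k, avg_func=fmean, distance_func=euclidean_distance):
    """Average the targets of the k reference points closest to q."""
    nearest = heapq.nsmallest(
        k, range(len(R)), key=lambda i: distance_func(q, R[i])
    )
    return avg_func([truth[i] for i in nearest])


# Made-up reference points and targets (illustrative only).
R = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
truth = [1.0, 4.0, 16.0, 100.0]

mean_pred = knn_regress((0.0, 0.0), R, truth, k=3)
gmean_pred = knn_regress((0.0, 0.0), R, truth, k=3, avg_func=geometric_mean)
print(mean_pred, gmean_pred)
```

Swapping `avg_func` is the only change needed to move between averages, which is the flexibility the proposed `average` parameter would bring to KNeighborsRegressor.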

Additional context

No response
