Adding the median option to the KNN imputer. #27890


Open: wants to merge 7 commits into base: main


@blueprintparadise commented Dec 2, 2023

The current version of the KNN imputer supports only the mean, but a median variant seems to show good results, so I added it to the repo via a new "strategy" argument.

Reference Issues/PRs

What does this implement/fix? Explain your changes.

In the earlier version, the imputer could only use the mean of the neighbors' values. This PR adds a median option.
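To make the difference between the two strategies concrete, here is a minimal sketch of the aggregation step on its own, outside scikit-learn. The helper name `aggregate_donors` and the donor values are hypothetical and chosen only for illustration; the two NumPy calls mirror the ones in the diff posted later in this thread. Note that `np.ma.median` takes no `weights` argument, so in this sketch (as in that diff) the median branch ignores any neighbor weights.

```python
import numpy as np

# Hypothetical donor values: one row per sample being imputed,
# one column per nearest neighbor. Values are invented for illustration.
donors = np.ma.array(
    [[1.0, 2.0, 100.0],
     [4.0, 5.0, 6.0]],
    mask=[[False, False, False],
          [False, False, True]],  # third donor of the second row is missing
)


def aggregate_donors(donors, strategy="mean", weights=None):
    """Combine donor values row-wise (hypothetical helper, not scikit-learn API)."""
    if strategy == "mean":
        # Weighted (or plain) average over the unmasked donors in each row.
        return np.ma.average(donors, axis=1, weights=weights).data
    if strategy == "median":
        # np.ma.median has no weights parameter, so weights are ignored here.
        return np.ma.median(donors, axis=1).data
    raise ValueError(f"Unknown strategy: {strategy!r}")


mean_values = aggregate_donors(donors, strategy="mean")
median_values = aggregate_donors(donors, strategy="median")

print(mean_values)    # first row is pulled toward the outlier 100.0
print(median_values)  # first row stays at the middle donor value
```

In the first row the outlying donor moves the mean to roughly 34.3 while the median stays at 2.0; in the second row, where only two donors are available, both strategies give 4.5.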

Any other comments?

Apologies if I made any errors in the pull request; I am still learning.
Please let me know if there is anything I missed or that needs to be changed.


github-actions bot commented Dec 2, 2023

❌ Linting issues

This PR is introducing linting issues. Here's a summary of the issues. Note that you can avoid having linting issues by enabling pre-commit hooks. Instructions to enable them can be found here.

You can see the details of the linting issues under the lint job here


black

black detected issues. Please run black . locally and push the changes. Here you can see the detected issues. Note that running black might also fix some of the issues which might be detected by ruff. Note that the installed black version is black=23.3.0.


--- /home/runner/work/scikit-learn/scikit-learn/sklearn/impute/_knn.py	2023-12-04 12:41:42.565797 +0000
+++ /home/runner/work/scikit-learn/scikit-learn/sklearn/impute/_knn.py	2023-12-04 12:41:59.939548 +0000
@@ -208,11 +208,11 @@
 
         if self.strategy == "mean":
             imputed_values = np.ma.average(donors, axis=1, weights=weight_matrix).data
         elif self.strategy == "median":
             imputed_values = np.ma.median(donors, axis=1).data
-            
+
         return imputed_values
 
     @_fit_context(prefer_skip_nested_validation=True)
     def fit(self, X, y=None):
         """Fit the imputer on X.
would reformat /home/runner/work/scikit-learn/scikit-learn/sklearn/impute/_knn.py

Oh no! 💥 💔 💥
1 file would be reformatted, 907 files would be left unchanged.

ruff

ruff detected issues. Please run ruff --fix --show-source . locally, fix the remaining issues, and push the changes. Here you can see the detected issues. Note that the installed ruff version is ruff=0.1.6.


sklearn/impute/_knn.py:213:1: W293 [*] Blank line contains whitespace
    |
211 |         elif self.strategy == "median":
212 |             imputed_values = np.ma.median(donors, axis=1).data
213 |             
    | ^^^^^^^^^^^^ W293
214 |         return imputed_values
    |
    = help: Remove whitespace from blank line

Found 1 error.
[*] 1 fixable with the `--fix` option.

Generated for commit: 99eb9f5. Link to the linter CI: here

@glemaitre
Member

The CIs are failing. We also need unit tests to check the implementation works as expected.
Do you have any scientific literature supporting this strategy?
