Viewed 11k times
6

I have a pandas DataFrame like this:

df = pd.DataFrame({'A' : [5,6,3,4,4,5,6,7,12,13], 'B' : 
     [1,2,3,5,5,6,7,8,9,10,]})

df

    A   B
0   5   1
1   6   2
2   3   3
3   4   5
4   4   5
5   5   6
6   6   7  
7   7   8
8  12   9
9  13  10

And I have an array of indices:

array = np.array([0,1,2,4,7,8])

I can subset the DataFrame with the array of indices like this:

df.iloc[array]

This gives me a DataFrame containing the rows whose indices are in the array:

    A  B
0   5  1
1   6  2
2   3  3
4   4  5
7   7  8
8  12  9

Now I want all the rows whose indices are not in the array, that is, rows [3, 5, 6, 9]. I am trying something like this, but it gives me an error:

df.iloc[~loc]

2 Answers

7

You can use isin on the index and invert the resulting boolean array with ~:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A' : [5,6,3,4,4,5,6,7,12,13], 'B' : 
     [1,2,3,5,5,6,7,8,9,10,]})

print(df)
    A   B
0   5   1
1   6   2
2   3   3
3   4   5
4   4   5
5   5   6
6   6   7
7   7   8
8  12   9
9  13  10

array = np.array([0,1,2,4,7,8])
print(array)
[0 1 2 4 7 8]

print(df.index.isin(array))
[ True  True  True False  True False False  True  True False]

print(~df.index.isin(array))
[False False False  True False  True  True False False  True]

print(df[~df.index.isin(array)])
    A   B
3   4   5
5   5   6
6   6   7
9  13  10
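
Note that isin compares against index labels, while iloc in the question selects by position. The two agree here only because the DataFrame has the default 0..n-1 index. If the labels were something else, a purely positional complement could be sketched with a boolean mask as below. The letter index and the names keep and rest are made up for illustration, not part of the original question.

```python
import numpy as np
import pandas as pd

# Same data as the question, but with a hypothetical non-integer index
# so that labels and positions are clearly different things.
df = pd.DataFrame({'A': [5, 6, 3, 4, 4, 5, 6, 7, 12, 13],
                   'B': [1, 2, 3, 5, 5, 6, 7, 8, 9, 10]},
                  index=list('abcdefghij'))

array = np.array([0, 1, 2, 4, 7, 8])  # positions to exclude

# Start with "keep every row", then switch off the listed positions.
keep = np.ones(len(df), dtype=bool)
keep[array] = False

# iloc accepts a boolean array of the same length as the frame.
rest = df.iloc[keep]
print(rest)
```

With the default integer index from the question, this should select the same rows as the isin approach.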

2

Use sets:

In [7]: wanted = [0, 1, 2, 4, 7, 8]

In [8]: not_wanted = set(df.index) - set(wanted)

In [9]: not_wanted
Out[9]: {3, 5, 6, 9}

In [11]: not_wanted = list(not_wanted)

In [12]: not_wanted
Out[12]: [9, 3, 5, 6]

In [13]: df.iloc[not_wanted]
Out[13]: 
    A   B
9  13  10
3   4   5
5   5   6
6   6   7
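
As the output above shows, a set does not preserve the original row order. If the order matters, one option is NumPy's setdiff1d, which returns the sorted unique values of its first argument that are not in the second. This is a sketch of an alternative, not the approach used in the answer above, and it selects by label with loc on the assumption that the array holds index labels.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [5, 6, 3, 4, 4, 5, 6, 7, 12, 13],
                   'B': [1, 2, 3, 5, 5, 6, 7, 8, 9, 10]})

array = np.array([0, 1, 2, 4, 7, 8])

# Sorted labels that are in df.index but not in array.
not_wanted = np.setdiff1d(df.index, array)

# Select by label; works directly with a NumPy array, no list cast needed.
rest = df.loc[not_wanted]
print(rest)
```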

5 Comments

I have an index array, not a list. When I use your code, it gives me an error: TypeError: unhashable type: 'numpy.ndarray'
In the prompt [7], I set it up as a list, but you can also cast the array index as a list. Use list(array_index) for this purpose.
You can also convert array_index to a set in one line with set(list(array_index)). From there, a set difference gives you the indices that are not in array_index.
OK, it worked. But why does it not work if I have an array like this: (array([ 2, 3, 4], dtype=int64))?
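That is because the value is a tuple containing an array. Iterating over the tuple yields the whole ndarray as a single element, and an ndarray is not hashable. Use the array inside the tuple instead, for example set(array_index[0]).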
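
The situation in the last two comments can be reproduced as follows. This sketch assumes the tuple came from something like np.where, which returns a tuple of arrays; the comment does not say where it came from, and the values array here is invented purely for illustration.

```python
import numpy as np

values = np.array([0, 0, 7, 7, 7, 0])  # hypothetical data
array_index = np.where(values > 5)     # a 1-tuple holding array([2, 3, 4])

# Building a set from the tuple tries to hash the inner ndarray and fails.
try:
    set(array_index)
except TypeError as exc:
    print(exc)

# Taking the array out of the tuple first gives hashable integers.
wanted = set(array_index[0].tolist())
print(wanted)
```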
