Delete rows from pandas DataFrame with non-unique index

Question

I am looking for a way to delete rows in a pandas DataFrame when the index is not guaranteed to be unique.

Say I want to drop items 0 and 4 from my DataFrame df. This is the typical code you would use to do that:

df.drop(df.index[[0, 4]])

If every index label is unique, this works fine. However, if items 0, 1, and 2 all have the same index label, this code drops items 0, 1, 2, and 4, instead of just 0 and 4.

My DataFrame is set up this way for good reasons, so I don't want to restructure my data, which looks approximately like this:

        age
site             
mc03    0.39
mc03    0.348
mc03    0.348
mc03    0.42
mc04    0.78
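
To make the problem reproducible, here is a minimal sketch that rebuilds the sample above (the construction of df is an assumption; only the values come from the table) and shows what the label-based drop does to it. Because positions 0 and 4 resolve to the labels "mc03" and "mc04", and every row carries one of those two labels, nothing is left:

```python
import pandas as pd

# Assumed reconstruction of the sample data: four rows share the label "mc03".
df = pd.DataFrame(
    {"age": [0.39, 0.348, 0.348, 0.42, 0.78]},
    index=pd.Index(["mc03", "mc03", "mc03", "mc03", "mc04"], name="site"),
)

# Positions 0 and 4 are translated into index labels here.
labels = df.index[[0, 4]]

# drop() matches on labels, so every row labelled "mc03" or "mc04" is removed,
# not just the rows at positions 0 and 4.
dropped = df.drop(labels)

print(list(labels))
print(len(df), "rows before,", len(dropped), "rows after")
```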

I tried:

del df.iloc[0]

but this fails with:

AttributeError: __delitem__

Any other suggestions for how to accomplish this task?

Update:

I found two ways to do it, but neither is particularly elegant.

to_drop = [0, 4]
df = df.iloc[sorted(set(range(len(df))) - set(to_drop))]
# or:
df = df.iloc[[i for i in range(len(df)) if i not in to_drop]]

Maybe this is as good as it's going to get, though?
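
For anyone who wants to try these, here is a self-contained sketch of both variants on data shaped like my sample (the frame construction and the names keep, drop_set, result_a and result_b are just for illustration). The only change from the snippets above is that the second variant tests membership against a set, which should matter only when to_drop is long:

```python
import pandas as pd

# Assumed reconstruction of the sample data from the question.
df = pd.DataFrame(
    {"age": [0.39, 0.348, 0.348, 0.42, 0.78]},
    index=pd.Index(["mc03", "mc03", "mc03", "mc03", "mc04"], name="site"),
)
to_drop = [0, 4]

# Variant 1: set difference of all positions and the positions to drop.
keep = sorted(set(range(len(df))) - set(to_drop))
result_a = df.iloc[keep]

# Variant 2: filter positions, using a set for fast membership tests.
drop_set = set(to_drop)
result_b = df.iloc[[i for i in range(len(df)) if i not in drop_set]]

print(result_a)
```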

Answer 1

This is not very elegant either, but let me post it as an alternative:

df = df.reset_index().drop([0, 4]).set_index("site")

It temporarily moves the index into a regular column (leaving a default integer index), drops the rows by position, and then sets the original index back. The idea is from this answer.
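
Spelled out step by step, as a sketch on data rebuilt from the question's sample (the frame construction and the intermediate names flat and trimmed are assumptions for illustration):

```python
import pandas as pd

# Assumed reconstruction of the sample data from the question.
df = pd.DataFrame(
    {"age": [0.39, 0.348, 0.348, 0.42, 0.78]},
    index=pd.Index(["mc03", "mc03", "mc03", "mc03", "mc04"], name="site"),
)

# Step 1: move "site" into a column; the rows are now labelled 0..4.
flat = df.reset_index()

# Step 2: the labels are unique integers, so dropping 0 and 4 hits exactly
# the first and last rows.
trimmed = flat.drop([0, 4])

# Step 3: restore "site" as the (still non-unique) index.
result = trimmed.set_index("site")

print(result)
```

Note that this relies on the index having a name ("site" here). With an unnamed index, reset_index() calls the new column "index", so that is the name to pass to set_index().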

Answer 2

Alternative solution (using NumPy):

In [252]: mask = np.ones(len(df)).astype(bool)

In [253]: mask[[0,4]] = False

In [254]: mask
Out[254]: array([False,  True,  True,  True, False], dtype=bool)

In [255]: df[mask]
Out[255]:
        age
mc03  0.348
mc03  0.348
mc03  0.420
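
The same idea can be wrapped in a small helper. This is only a sketch: drop_by_position is a hypothetical name, not a pandas function, and the sample frame is rebuilt from the question's data.

```python
import numpy as np
import pandas as pd


def drop_by_position(frame, positions):
    """Return frame without the rows at the given integer positions."""
    # Start with "keep everything", then switch off the unwanted positions.
    mask = np.ones(len(frame), dtype=bool)
    mask[list(positions)] = False
    # Boolean indexing selects by position, so duplicate labels do not matter.
    return frame[mask]


# Assumed reconstruction of the sample data from the question.
df = pd.DataFrame(
    {"age": [0.39, 0.348, 0.348, 0.42, 0.78]},
    index=pd.Index(["mc03", "mc03", "mc03", "mc03", "mc04"], name="site"),
)

result = drop_by_position(df, [0, 4])
print(result)
```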

2 answers

solution1
3 ACCEPTED 2016-06-29 17:33:29

solution2
1 2016-06-29 22:42:20
