Access rightmost two columns of pandas dataframe, by number

I'm hoping to overwrite some values in a df when a condition is met. Using the df below, when col B matches one of the values in lst, I want to replace the values in cols C and D with X.

This can be achieved using the method below, but I'm hoping to use indexing to select the last two columns rather than hard-coded labels.

df = pd.DataFrame({   
    'A' : [1,1,1,1,1,1,1,1],             
    'B' : ['X','Foo','X','Cat','A','A','X','D'],                 
    'C' : [1,1,1,1,1,1,1,1],    
    'D' : [1,1,1,1,1,1,1,1],            
    })

lst = ['Foo','Cat']
df.loc[df.B.isin(lst), ['C','D']] = 'X'

Attempt:

df.loc[df.B.isin(lst), df.loc[:-2]] = 'X'

Intended:

   A    B  C  D
0  1    X  1  1
1  1  Foo  X  X
2  1    X  1  1
3  1  Cat  X  X
4  1    A  1  1
5  1    A  1  1
6  1    X  1  1
7  1    D  1  1

If I understand the question correctly, you are looking for iloc:

df.iloc[df.B.isin(lst).values, -2:] = 'X'

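If you would rather stay with label-based loc, pass the labels of the last two columns instead: df.loc[df.B.isin(lst), df.columns[-2:]] = 'X'. A bare -2: slice inside loc is read as a label slice rather than a positional one, so it is not a reliable substitute for iloc: as far as I know, recent pandas versions raise a TypeError for it when the column names are strings, and when the column names are integers it selects by label value instead of by position.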
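The two selection styles can be compared side by side. This is only a minimal sketch on the question's sample data; the names last_two, mask, by_position, by_label and int_cols are made up for illustration, and the cast to object is an assumption added so that writing strings into integer columns does not upset newer pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1] * 8,
    'B': ['X', 'Foo', 'X', 'Cat', 'A', 'A', 'X', 'D'],
    'C': [1] * 8,
    'D': [1] * 8,
})
lst = ['Foo', 'Cat']

# Labels of the last two columns, taken by position from the column index
last_two = df.columns[-2:]

# Assumption: cast the target columns to object before writing strings,
# because newer pandas versions complain about mixing 'X' into int columns
df = df.astype({col: object for col in last_two})

mask = df['B'].isin(lst)

# Positional: iloc needs a plain boolean array, hence to_numpy()
by_position = df.copy()
by_position.iloc[mask.to_numpy(), -2:] = 'X'

# Label-based: loc accepts the boolean Series plus the column labels
by_label = df.copy()
by_label.loc[mask, last_two] = 'X'

print(by_position.equals(by_label))

# With integer column labels, a -2: slice in loc is a label slice
# (labels from -2 upwards), while in iloc it is the last two positions
int_cols = pd.DataFrame([[1, 2, 3, 4]], columns=[10, 20, 30, 40])
label_slice = list(int_cols.loc[:, -2:].columns)
position_slice = list(int_cols.iloc[:, -2:].columns)
print(label_slice, position_slice)
```

Using df.columns[-2:] keeps the code label-based while still choosing the columns by position, which avoids the ambiguity shown in the last few lines.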

Here's my best attempt at a pandas-y approach, with the super-ugly and less performant apply method further down below:

import pandas as pd

df = pd.DataFrame({
    'A' : [1,1,1,1,1,1,1,1],
    'B' : ['X','Foo','X','Cat','A','A','X','D'],
    'C' : [1,1,1,1,1,1,1,1],
    'D' : [1,1,1,1,1,1,1,1],
})

# values to match in column "B" (from the question)
lst = ['Foo','Cat']

# get working index once,
# where column "B" in lst,
# store
ind = df["B"].isin(lst)

# get working slice with index
# (an explicit copy, so pandas does not
# emit a SettingWithCopyWarning below)
df_slice = df[ind].copy()

# set the "C" and "D" columns
df_slice["C"], df_slice["D"] = "X", "X"

# set the original df slice
# to our working slice,
# reusing the stored index
df[ind] = df_slice

print(df)

### PRINTS:

   A    B  C  D
0  1    X  1  1
1  1  Foo  X  X
2  1    X  1  1
3  1  Cat  X  X
4  1    A  1  1
5  1    A  1  1
6  1    X  1  1
7  1    D  1  1

And here's the row-by-row apply approach. It's not the prettiest solution, but it gets the job done. Note that it replaces every value after the first two columns with X, so it only fits when the target columns are the last ones in the row.

import pandas as pd

df = pd.DataFrame({   
    'A' : [1,1,1,1,1,1,1,1],             
    'B' : ['X','Foo','X','Cat','A','A','X','D'],                 
    'C' : [1,1,1,1,1,1,1,1],    
    'D' : [1,1,1,1,1,1,1,1],            
})


def apply_function(row):
    lst = ['Foo','Cat']
    return row if row["B"] not in lst else [
        # first two values of the row (columns A and B)
        *row.iloc[:2],
        # Xs for the rest of the row
        *["X" for _ in range(len(row) - 2)]
    ]

df = df.apply(apply_function, axis=1)
print(df)

### PRINTS:

   A    B  C  D
0  1    X  1  1
1  1  Foo  X  X
2  1    X  1  1
3  1  Cat  X  X
4  1    A  1  1
5  1    A  1  1
6  1    X  1  1
7  1    D  1  1
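As a rough cross-check, the row-wise idea can be compared with a single positional assignment. This is a sketch under assumptions, not a benchmark: replace_trailing is a hypothetical variant of the function above that always returns a Series, and the cast to object is added so that strings can be written into the integer columns.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1] * 8,
    'B': ['X', 'Foo', 'X', 'Cat', 'A', 'A', 'X', 'D'],
    'C': [1] * 8,
    'D': [1] * 8,
})
lst = ['Foo', 'Cat']


def replace_trailing(row):
    # Hypothetical helper: leave non-matching rows untouched
    if row['B'] not in lst:
        return row
    # Work on a copy and overwrite everything after the first two values
    out = row.copy()
    out.iloc[2:] = 'X'
    return out


# Row-by-row: the function is called once per row
row_wise = df.apply(replace_trailing, axis=1)

# Vectorized: one boolean mask and one positional assignment
vectorized = df.astype({col: object for col in df.columns[2:]})
vectorized.iloc[df['B'].isin(lst).to_numpy(), 2:] = 'X'

same = row_wise.values.tolist() == vectorized.values.tolist()
print(same)
```

Both variants should produce the same cell values; the vectorized form avoids calling a Python function for every row.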
