
Access rightmost two columns of pandas dataframe, by number

I'm hoping to overwrite some values in a df when a condition is met. Using the df below, when col B is equal to any of the values in lst, I want to replace the values in cols C and D with X.

This can be achieved using the method below, but I'm hoping to use indexing to select the last two columns rather than hard-coded labels.

import pandas as pd

df = pd.DataFrame({
    'A' : [1,1,1,1,1,1,1,1],
    'B' : ['X','Foo','X','Cat','A','A','X','D'],
    'C' : [1,1,1,1,1,1,1,1],
    'D' : [1,1,1,1,1,1,1,1],
    })

lst = ['Foo','Cat']
df.loc[df.B.isin(lst), ['C','D']] = 'X'

Attempt:

df.loc[df.B.isin(lst), df.loc[:-2]] = 'X'

Intended output:

   A    B  C  D
0  1    X  1  1
1  1  Foo  X  X
2  1    X  1  1
3  1  Cat  X  X
4  1    A  1  1
5  1    A  1  1
6  1    X  1  1
7  1    D  1  1

If I understood the question correctly, it looks like you are looking for iloc:

df.iloc[df.B.isin(lst).values, -2:] = 'X'

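In older pandas versions, df.loc[df.B.isin(lst), -2:] = 'X' would give the same result in most cases, but the meaning of the -2: slice changes if the column names are integers, because .loc treats it as labels rather than positions. As far as I know, recent pandas releases reject a positional slice like this in .loc with a TypeError when the column names are strings, so iloc is the safer choice.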
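To make the difference concrete, here is a minimal sketch that compares the purely positional form with a label-based form that looks the last two labels up via df.columns[-2:]. The names by_position and by_label are only for illustration, and the cast to object is my own addition (an assumption, not part of the original answer), since newer pandas versions complain when a string is written into an integer column.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 1, 1, 1, 1, 1, 1, 1],
    'B': ['X', 'Foo', 'X', 'Cat', 'A', 'A', 'X', 'D'],
    'C': [1, 1, 1, 1, 1, 1, 1, 1],
    'D': [1, 1, 1, 1, 1, 1, 1, 1],
})
lst = ['Foo', 'Cat']

# Assumption: cast the last two columns to object up front, so that
# writing the string 'X' into them does not clash with an integer dtype.
df = df.astype({col: object for col in df.columns[-2:]})

# Boolean row mask: True where column B is in lst.
mask = df['B'].isin(lst)

# Purely positional: boolean NumPy array for the rows, slice for the columns.
by_position = df.copy()
by_position.iloc[mask.to_numpy(), -2:] = 'X'

# Label-based: pick the last two labels by position, then hand them to .loc.
by_label = df.copy()
by_label.loc[mask, by_label.columns[-2:]] = 'X'

print(by_position.equals(by_label))
print(by_position)
```

The .loc variant stays label-based throughout, so it should behave the same whether the column names are strings or integers.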

Here's my best attempt at a pandas-y approach, with the super-ugly and less performant apply method further down:

import pandas as pd

df = pd.DataFrame({   
    'A' : [1,1,1,1,1,1,1,1],             
    'B' : ['X','Foo','X','Cat','A','A','X','D'],                 
    'C' : [1,1,1,1,1,1,1,1],    
    'D' : [1,1,1,1,1,1,1,1],            
})

lst = ['Foo','Cat']

# build the boolean row mask once:
# True where column "B" is in lst
ind = df["B"].isin(lst)

# take a copy of the matching rows, so the
# assignments below don't act on a view of df
df_slice = df[ind].copy()

# set the "C" and "D" columns
df_slice["C"], df_slice["D"] = "X", "X"

# write the modified rows back
# into the original df
df[ind] = df_slice

print(df)

### PRINTS:

   A    B  C  D
0  1    X  1  1
1  1  Foo  X  X
2  1    X  1  1
3  1  Cat  X  X
4  1    A  1  1
5  1    A  1  1
6  1    X  1  1
7  1    D  1  1

And here's the row-by-row apply approach. It's not the prettiest solution, but it gets the job done. Note that this keeps the first two values of each matching row and replaces everything after them with Xs.

import pandas as pd

df = pd.DataFrame({   
    'A' : [1,1,1,1,1,1,1,1],             
    'B' : ['X','Foo','X','Cat','A','A','X','D'],                 
    'C' : [1,1,1,1,1,1,1,1],    
    'D' : [1,1,1,1,1,1,1,1],            
})


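def apply_function(row):
    lst = ['Foo','Cat']
    if row["B"] not in lst:
        return row
    # keep the first two values of the row,
    # then Xs for the rest of the row
    values = list(row.iloc[:2]) + ["X"] * (len(row) - 2)
    # return a Series so every row has the same shape and labels
    return pd.Series(values, index=row.index)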

df = df.apply(apply_function, axis=1)
print(df)

### PRINTS:

   A    B  C  D
0  1    X  1  1
1  1  Foo  X  X
2  1    X  1  1
3  1  Cat  X  X
4  1    A  1  1
5  1    A  1  1
6  1    X  1  1
7  1    D  1  1
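One thing worth spelling out: "the rest of the row after the first two columns" and "the last two columns" only coincide because this df has exactly four columns. A small sketch with a hypothetical fifth column E (the frame wide and column E are made up for illustration, and the object cast is again my own assumption to avoid dtype complaints) shows where the two slices diverge:

```python
import pandas as pd

# Hypothetical wider frame: same idea as above, plus an extra column E.
wide = pd.DataFrame({
    'A': [1, 1, 1],
    'B': ['X', 'Foo', 'Cat'],
    'C': [1, 1, 1],
    'D': [1, 1, 1],
    'E': [1, 1, 1],
}).astype({'C': object, 'D': object, 'E': object})

# Boolean NumPy array marking the rows to overwrite.
mask = wide['B'].isin(['Foo', 'Cat']).to_numpy()

# Last two columns only: D and E change, C is left alone.
last_two = wide.copy()
last_two.iloc[mask, -2:] = 'X'

# Everything after the first two columns: C, D and E all change.
rest_of_row = wide.copy()
rest_of_row.iloc[mask, 2:] = 'X'

print(last_two)
print(rest_of_row)
```

So if the goal really is "the rightmost two columns", the -2: slice is the one to use; 2: matches what the apply function does.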
