Pandas: iterate over existing columns and create new columns based on a condition

Question

The best version of a question related to mine is here. However, I'm running into trouble at one point.

My dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({'KEY': ['100000003', '100000009', '100000009', '100000009'], 
              'RO_1': [1, 1, 4,1],
              'RO_2': [1, 0, 0,0],
              'RO_3': [1, 1, 1,1],
              'RO_4': [1, 4, 1,1]})

    KEY         RO_1  RO_2   RO_3 RO_4 
0   100000003   1      1     1    1   
1   100000009   1      0     1    4    
2   100000009   4      0     1    1    
3   100000009   1      0     1    1

I'd like to create 3 additional columns labeled "Month1", "Month2" and "Month4". Pretty simple stuff:

for i in range(3):
    df.loc[1,'Month'+str(i)] = 1 # '1' is just there as a place holder

However, I get a warning message when I run this code:

"A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead"

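I'd like to combine this with a condition to populate every cell of each new column, row by row.

The code below creates a single column and flags it if any of the columns containing RO_ meets the condition: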

namelist = df.columns.tolist()  # get_values() was removed in pandas 1.0
ROList = [s for s in namelist if "RO_" in s]
for col in ROList:
    for i in range(3):
        df['Month'] = np.where(np.logical_or(df[col]==4,df[col]==1), '1', '0') 
df

I'd love to combine these two pieces of code, but I lack the basic understanding of how to do it. Any help would be great.

Final expected result:

    KEY         RO_1  RO_2   RO_3 RO_4 Month1 Month2 Month3 Month4
0   100000003   1      1     1    1    1      1      1      1
1   100000009   1      0     1    4    1      0      1      1
2   100000009   4      0     1    1    1      0      1      1  
3   100000009   1      0     1    1    1      0      1      1

Answer 1

IIUC, use enumerate:

namelist = df.columns.tolist()  # get_values() was removed in pandas 1.0
ROList = [s for s in namelist if "RO_" in s]
for i, col in enumerate(ROList):
    # i starts at 0, so i+1 gives Month1..Month4
    df['Month'+str(i+1)] = np.where(np.logical_or(df[col]==4,df[col]==1), '1', '0')
df
Out[194]: 
         KEY  RO_1  RO_2  RO_3  RO_4 Month1 Month2 Month3 Month4
0  100000003     1     1     1     1      1      1      1      1
1  100000009     1     0     1     4      1      0      1      1
2  100000009     4     0     1     1      1      0      1      1
3  100000009     1     0     1     1      1      0      1      1
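Note that enumerate numbers the new columns by position in ROList, which matches the column suffixes here only because the columns run from RO_1 to RO_4 without gaps. A minimal sketch of a variant that takes the number from the column name instead; the small frame with a gap in the numbering (RO_1, RO_3) is made up for illustration and is not the asker's data:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a gap in the RO_ numbering (not the asker's data).
df = pd.DataFrame({'KEY': ['a', 'b'],
                   'RO_1': [1, 0],
                   'RO_3': [4, 2]})

# Collect the source columns before adding any new ones.
ro_cols = [c for c in df.columns if c.startswith('RO_')]
for col in ro_cols:
    suffix = col.split('_')[-1]          # 'RO_3' -> '3'
    # Same condition as above: flag cells equal to 1 or 4.
    df['Month' + suffix] = np.where(df[col].isin([1, 4]), '1', '0')

print(df.columns.tolist())
# ['KEY', 'RO_1', 'RO_3', 'Month1', 'Month3']
```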

Your logic seems to amount to changing 4 to 1:

df.assign(**df.loc[:, ROList]
            .mask(df.loc[:, ROList] == 4, 1)
            .rename(columns=dict(zip(ROList, range(1, len(ROList) + 1))))
            .add_prefix('Month'))
Out[15]: 
         KEY  RO_1  RO_2  RO_3  RO_4  Month1  Month2  Month3  Month4
0  100000003     1     1     1     1       1       1       1       1
1  100000009     1     0     1     4       1       0       1       1
2  100000009     4     0     1     1       1       0       1       1
3  100000009     1     0     1     1       1       0       1       1
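Unlike the np.where version, this one yields integers rather than the strings '1' and '0', and mask only replaces the cells where the condition holds, leaving every other value untouched. The two approaches agree on this data because the RO_ columns contain only 0, 1 and 4. A small sketch of the difference, using a made-up column that also contains a 2:

```python
import pandas as pd

# Hypothetical column containing a value (2) that is neither 0, 1 nor 4.
s = pd.DataFrame({'RO_1': [1, 4, 0, 2]})

masked = s.mask(s == 4, 1)            # only 4 is replaced; 2 passes through
flagged = s.isin([1, 4]).astype(int)  # strict 0/1 indicator

print(masked['RO_1'].tolist())   # [1, 1, 0, 2]
print(flagged['RO_1'].tolist())  # [1, 1, 0, 0]
```

If values other than 0, 1 and 4 can occur, the isin form is probably the safer choice for a flag column.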

Answer 2

Use filter + isin + rename for a single-pipeline transformation of your data.

v = (df.filter(regex='^RO_')    # select columns
      .isin([4, 1])             # check if the value is 4 or 1
      .astype(int)              # convert the `bool` result to `int`
      .rename(                  # rename columns
          columns=lambda x: x.replace('RO_', 'Month')
      ))

Or, for performance,

v = df.filter(regex='^RO_')\
          .isin([4, 1])\
          .astype(int) 
v.columns = v.columns.str.replace('RO_', 'Month')

Finally, concatenate the result with the original.

pd.concat([df, v], axis=1)

         KEY  RO_1  RO_2  RO_3  RO_4  Month1  Month2  Month3  Month4
0  100000003     1     1     1     1       1       1       1       1
1  100000009     1     0     1     4       1       0       1       1
2  100000009     4     0     1     1       1       0       1       1
3  100000009     1     0     1     1       1       0       1       1
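For reference, a self-contained sketch that runs the pipeline on the question's data and attaches the result with df.join, which aligns on the index just as pd.concat(..., axis=1) does here. Using join instead of concat is a substitution for illustration, not part of the answer above:

```python
import pandas as pd

df = pd.DataFrame({'KEY': ['100000003', '100000009', '100000009', '100000009'],
                   'RO_1': [1, 1, 4, 1],
                   'RO_2': [1, 0, 0, 0],
                   'RO_3': [1, 1, 1, 1],
                   'RO_4': [1, 4, 1, 1]})

v = (df.filter(regex='^RO_')      # select the RO_ columns
       .isin([4, 1])              # True where the value is 4 or 1
       .astype(int)               # bool -> 0/1
       .rename(columns=lambda x: x.replace('RO_', 'Month')))

out = df.join(v)   # v shares df's index and has no overlapping column names
print(out['Month2'].tolist())   # [1, 0, 0, 0]
```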

Answer 3

It looks like you're creating a new column for each existing column in the dataframe. You could do the following:

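# Only the RO_ columns; looping over all of df.columns would also add a MonthKEY column
original_cols = [c for c in df.columns if c.startswith("RO_")]
for c in original_cols:
    cname = "Month" + c.split("_")[-1]
    df[cname] = df[c].apply(lambda x: 1 if (x == 1) or (x == 4) else 0)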
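If this is needed in more than one place, the loop can be wrapped in a small helper. The sketch below is one way to do it; the name add_month_flags and its parameters are hypothetical, and it uses isin in place of the per-element lambda:

```python
import pandas as pd

def add_month_flags(df, prefix='RO_', values=(1, 4)):
    """Return a copy of df with one Month<n> column per <prefix><n> column."""
    out = df.copy()
    for c in df.columns:
        if not c.startswith(prefix):
            continue                      # skip KEY and other non-RO_ columns
        cname = 'Month' + c.split('_')[-1]
        out[cname] = df[c].isin(list(values)).astype(int)
    return out

# Reduced version of the question's data, for illustration.
df = pd.DataFrame({'KEY': ['100000003', '100000009'],
                   'RO_1': [1, 4],
                   'RO_2': [1, 0]})
result = add_month_flags(df)
print(result.columns.tolist())
# ['KEY', 'RO_1', 'RO_2', 'Month1', 'Month2']
```

Returning a copy leaves the caller's frame unchanged, which also sidesteps writing into a frame that might itself be a slice of another one.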


Answer 1
2 2018-02-08 21:19:28

Answer 2
2 accepted 2018-02-08 21:21:06

Answer 3
0 2018-02-08 21:20:20
