Python solution to list column values

Question

I have the following data:

import pandas as pd

df = pd.DataFrame({'column1': ['Y', 'Y', 'Y'],
                   'value_5': ['N', 'Y', 'Y'],
                   'value_6': ['N', 'Y', 'N'],
                   'value_10': ['Y', 'N', 'N'],
                   'value_20': ['N', 'N', 'Y']},
                  index=['key1','key2','key4'])
print(df)
     column1 value_5 value_6 value_10 value_20
key1       Y       N       N        Y        N
key2       Y       Y       Y        N        N
key4       Y       Y       N        N        Y

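I want to create one final column based on this data: for each row, a list of the numbers taken from the names of the value_ columns that contain 'Y'. However, the number of columns and their values may differ on each run.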

Answer 1

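First select the columns whose names contain the substring value_ with DataFrame.filter and extract the integer from each name. Then compare the values with 'Y' and, for each row, convert the matching column names to a list: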

f = lambda x: int(x.split('_')[-1])
df1 = df.filter(like='value_').rename(columns=f)

df['new'] = df1.eq('Y').agg(lambda x: x.index[x].tolist(), axis=1)
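To see what each step in the snippet above produces, the intermediate objects can be inspected on the question's sample data. This is a minimal sketch: the names `mask` and `new` are introduced here for illustration, and it uses `apply` where the answer uses `agg` (for a row-wise callable like this one the result is expected to be the same).

```python
import pandas as pd

df = pd.DataFrame({'column1': ['Y', 'Y', 'Y'],
                   'value_5': ['N', 'Y', 'Y'],
                   'value_6': ['N', 'Y', 'N'],
                   'value_10': ['Y', 'N', 'N'],
                   'value_20': ['N', 'N', 'Y']},
                  index=['key1', 'key2', 'key4'])

# Keep only the value_ columns and rename each to its trailing integer.
df1 = df.filter(like='value_').rename(columns=lambda c: int(c.split('_')[-1]))
print(df1.columns.tolist())   # [5, 6, 10, 20]

# Boolean frame: True where the cell equals 'Y'.
mask = df1.eq('Y')
print(mask)

# For each row, keep the integer column labels where the mask is True.
new = mask.apply(lambda row: row.index[row.to_numpy()].tolist(), axis=1)
print(new.tolist())           # [[10], [5, 6], [5, 20]]
```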

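Another idea is to use a regular expression to get the integers from the column names, and a list comprehension to build the lists: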

import re
f = lambda x: next(map(int,re.findall(r'\d+',x)))
df1 = df.filter(like='value_').rename(columns=f)

df['new'] = [df1.columns[x].tolist() for x in df1.eq('Y').to_numpy()]
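The renaming function in this variant can be checked in isolation. The sketch below applies the same regular expression to the sample column names; `extract_int` is a name chosen here for illustration. It assumes every name passed in contains at least one digit, since `next` raises `StopIteration` on an empty match list.

```python
import re

def extract_int(name):
    # re.findall returns every run of digits as a string; take the first one.
    return next(map(int, re.findall(r'\d+', name)))

names = ['value_5', 'value_6', 'value_10', 'value_20']
print([extract_int(n) for n in names])   # [5, 6, 10, 20]

# 'column1' also contains a digit, so the value_ columns should be
# selected first (as the answer does with filter) before renaming.
print(extract_int('column1'))            # 1
```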

Or use Series.str.extract on the column names:

df1 = df.filter(like='value_')
df1.columns = df1.columns.str.extract(r'(\d+)', expand=False).astype(int)

df['new'] = [df1.columns[x].tolist() for x in df1.eq('Y').to_numpy()]
print(df)
     column1 value_5 value_6 value_10 value_20      new
key1       Y       N       N        Y        N     [10]
key2       Y       Y       Y        N        N   [5, 6]
key4       Y       Y       N        N        Y  [5, 20]
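Because the question notes that the number of columns and their values can change between runs, it may be convenient to wrap one of the variants above in a function. The following is a minimal sketch, not part of the original answer: the function name `y_columns` and its `prefix` parameter are assumptions made for illustration.

```python
import pandas as pd

def y_columns(df, prefix='value_'):
    """Return, per row, the integers from the names of prefix columns holding 'Y'."""
    sub = df.filter(like=prefix)
    # Pull the digits out of each column name and convert them to int.
    labels = sub.columns.str.extract(r'(\d+)', expand=False).astype(int)
    # Index the labels with each row's boolean mask.
    return [labels[row].tolist() for row in sub.eq('Y').to_numpy()]

# A frame with a different set of value_ columns than the question's sample.
other = pd.DataFrame({'column1': ['Y', 'N'],
                      'value_1': ['Y', 'N'],
                      'value_30': ['Y', 'N']},
                     index=['a', 'b'])
other['new'] = y_columns(other)
print(other['new'].tolist())   # [[1, 30], []]
```

A row with no 'Y' in any value_ column simply gets an empty list.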

Answer 2

Assume df looks like this:

import pandas as pd

df = pd.DataFrame({'column1': ['Y', 'Y', 'Y'],
                   'value_5': ['N', 'Y', 'Y'],
                   'value_6': ['N', 'Y', 'N'],
                   'value_10': ['Y', 'N', 'N'],
                   'value_20': ['N', 'N', 'Y']  })
print(df)
  column1 value_5 value_6 value_10 value_20
0       Y       N       N        Y        N
1       Y       Y       Y        N        N
2       Y       Y       N        N        Y

I extract, for each row, the column labels where 'Y' is present:

expected = []
df1 = df[df.columns[['value' in c for c in df.columns]]]
for i in range(len(df1)):
    idx = df1.iloc[i, :][df1.iloc[i, :]=='Y'].index
    expected.append([int(e.split('_')[-1]) for e in idx])
df['Expected'] = expected
print(df)

  column1 value_5 value_6 value_10 value_20 Expected
0       Y       N       N        Y        N     [10]
1       Y       Y       Y        N        N   [5, 6]
2       Y       Y       N        N        Y  [5, 20]
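To make the loop body above concrete, here is what a single iteration computes for the second row (position 1) of the sample frame. This is only an illustrative sketch; the variable names `row` and `labels` are introduced here.

```python
import pandas as pd

df = pd.DataFrame({'column1': ['Y', 'Y', 'Y'],
                   'value_5': ['N', 'Y', 'Y'],
                   'value_6': ['N', 'Y', 'N'],
                   'value_10': ['Y', 'N', 'N'],
                   'value_20': ['N', 'N', 'Y']})

# Keep only the columns whose name contains 'value'.
df1 = df[[c for c in df.columns if 'value' in c]]

row = df1.iloc[1]                  # second row as a Series indexed by column name
labels = row[row == 'Y'].index     # column names where the cell is 'Y'
print(list(labels))                # ['value_5', 'value_6']

# Take the part after the last underscore and convert it to int.
numbers = [int(name.split('_')[-1]) for name in labels]
print(numbers)                     # [5, 6]
```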

Answer 3

Here is an approach that defines a function to check for the value 'Y':

def check_y(row):
    checklis = [k for k,v in zip(row.index, row.values) if v=='Y']
    return [int(k.split('_')[1]) for k in checklis]            

df1 = df.filter(like='value_')
df['Expected'] = df1.apply(check_y, axis=1)
print(df)

     column1 value_5 value_6 value_10 value_20 Expected
key1       Y       N       N        Y        N     [10]
key2       Y       Y       Y        N        N   [5, 6]
key4       Y       Y       N        N        Y  [5, 20]
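One detail worth noting: `check_y` takes element `[1]` of the split name, while the other answers take `[-1]`. For names of the form `value_<number>` the two agree; they differ only if a name contains more than one underscore. The column name `value_extra_30` below is hypothetical and used only to illustrate this.

```python
name = 'value_20'
print(name.split('_')[1], name.split('_')[-1])   # 20 20

# Hypothetical name with two underscores: the two indices pick different parts.
odd = 'value_extra_30'
print(odd.split('_')[1])    # extra  (int() would raise ValueError on this)
print(odd.split('_')[-1])   # 30
```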

3 answers

Answer 1
1 2022-09-14 06:28:59

Answer 2
0 2022-09-14 06:52:05

Answer 3
0 2022-09-15 00:30:48
