Python DF: How to select values from multiple columns in a row based on a condition?

Question

I have a dataset with several codes for each record. I need to pick out the codes that start with '6' and put them in a new column for each record.

The DataFrame looks like this:

ID   Code1   Code2   Code3   Code4   Code5   Code6
1    64774    NaN     NaN     NaN     NaN     NaN
2    60240   95868    NaN     NaN     NaN     NaN
3    36500   60500   95867    NaN     NaN     NaN
4    19125   19301   36500    NaN     NaN     NaN
5    36500   60500   60520    95868   95869   NaN
6    31528   31622   36500    43235   60500   60520

import numpy as np
import pandas as pd

# Create the dataframe
d = {'ID': ['1', '2', '3', '4', '5', '6'],
     'Code1': ['64774','60240','36500','19125','36500','31528'],
     'Code2': [np.nan,'95868','60500','19301','60500','31622'],
     'Code3': [np.nan,np.nan,'95867','36500','60520','36500'],
     'Code4': [np.nan,np.nan,np.nan,np.nan,'95868','43235'],
     'Code5': [np.nan,np.nan,np.nan,np.nan,'95869','60500'],
     'Code6': [np.nan,np.nan,np.nan,np.nan,np.nan,'60520'],
     }
df = pd.DataFrame(data=d)

I thought about using a loop or a function like:

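def myfunc(row):
    # row['Code1'] is a plain value here, so call startswith directly (no .str);
    # the isinstance check skips NaN, which is a float
    if isinstance(row['Code1'], str) and row['Code1'].startswith('6'):
        return row['Code1']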

but I'm not quite sure how to apply the check to all 6 columns (Code1 - Code6) in one function and combine all the selected codes into a single value.

What I'm looking for is:

ID   Code1   Code2   Code3   Code4   Code5   Code6      New_Col
1    64774    NaN     NaN     NaN     NaN     NaN        64774
2    60240   95868    NaN     NaN     NaN     NaN        60240
3    36500   60500   95867    NaN     NaN     NaN        60500
4    19125   19301   36500    NaN     NaN     NaN         NaN
5    36500   60500   60520    95868   95869   NaN      60500, 60520
6    31528   31622   36500    43235   60500   60520    60500, 60520

Thanks in advance!

Answer 1

You can try this:

import numpy as np
import pandas as pd

d = {'ID': ['1', '2', '3', '4', '5', '6'],
     'Code1': ['64774','60240','36500','19125','36500','31528'],
     'Code2': [np.nan,'95868','60500','19301','60500','31622'],
     'Code3': [np.nan,np.nan,'95867','36500','60520','36500'],
     'Code4': [np.nan,np.nan,np.nan,np.nan,'95868','43235'],
     'Code5': [np.nan,np.nan,np.nan,np.nan,'95869','60500'],
     'Code6': [np.nan,np.nan,np.nan,np.nan,np.nan,'60520'],
     }

df = pd.DataFrame(data=d)

# Start with an empty list in every row to collect the matching codes
df['Code7'] = [[] for _ in range(len(df))]

# Only scan the code columns, not ID or the new list column
code_cols = df.columns.drop(['ID', 'Code7'])

for i in df.index:
    for val in df.loc[i, code_cols]:
        # NaN is a float, so the isinstance check skips missing values
        if isinstance(val, str) and val.startswith('6'):
            df.at[i, 'Code7'].append(val)

print(df)

I hope it helps.
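If you would rather get the comma-separated string shown in the question's expected output (with NaN where nothing matches) than a list, something along these lines might work. It is only a sketch: the helper name `codes_starting_with_6` and the `code_cols` list are made up for illustration, and it assumes the codes are stored as strings, as in the sample data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ID': ['1', '2', '3', '4', '5', '6'],
    'Code1': ['64774', '60240', '36500', '19125', '36500', '31528'],
    'Code2': [np.nan, '95868', '60500', '19301', '60500', '31622'],
    'Code3': [np.nan, np.nan, '95867', '36500', '60520', '36500'],
    'Code4': [np.nan, np.nan, np.nan, np.nan, '95868', '43235'],
    'Code5': [np.nan, np.nan, np.nan, np.nan, '95869', '60500'],
    'Code6': [np.nan, np.nan, np.nan, np.nan, np.nan, '60520'],
})

# Hypothetical helper names; only the Code1..Code6 columns are scanned.
code_cols = [f'Code{n}' for n in range(1, 7)]

def codes_starting_with_6(row):
    # Keep string values that start with '6'; NaN is a float, so it is skipped.
    matches = [v for v in row if isinstance(v, str) and v.startswith('6')]
    # Join the matches into one value, or return NaN when nothing matched.
    return ', '.join(matches) if matches else np.nan

# axis=1 passes each row (restricted to the code columns) to the helper.
df['New_Col'] = df[code_cols].apply(codes_starting_with_6, axis=1)
print(df[['ID', 'New_Col']])
```

Row-wise `apply` is still a Python-level loop under the hood, so it is not faster than the explicit loop above, but it keeps the selection logic in one function, which is close to what the question was attempting with `myfunc`.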

Solution 1
0 Accepted 2019-04-25 17:13:27
