How to create a new pandas column by scanning across multiple columns using for-loops?
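I have 25 variables, DXCODE1 through DXCODE25, and I want to scan them to see whether their values in each row match icd_list. For example, in each row I want to scan DXCODE1 through DXCODE25 and check whether any of them contains one of these three values: 'F32', 'F33', 'F34'. If so, I want to return 1. I tried the following: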
def scan_icd(row):
    icd_list = ['F32', 'F33', 'F34']
    for i in range(1, 26):
        dx_code_loc = 'DXCODE' + str(i)
        for j in range(0, len(icd_list)):
            if icd_list[j] in row[dx_code_loc]:
                return 1

df['ICD_DX'] = df.apply(scan_icd, axis=1)
But I got this error:
TypeError: ("argument of type 'float' is not iterable", 'occurred at index 1')
I would also like to make it flexible, so that the ICD codes can be passed in as a list argument. But I don't know the syntax for applying it:
def scan_icd(row, icd_list):
    icd_list = icd_list
    for i in range(1, 26):
        dx_code_loc = 'DXCODE' + str(i)
        for j in range(0, len(icd_list)):
            if icd_list[j] in row[dx_code_loc]:
                return 1

df['ICD_DX'] = df.apply(scan_icd (['F32', 'F33', 'F34']), axis=1)
TypeError: apply() got multiple values for argument 'axis'
===================
Edit:
The columns are labeled DXCODE1, DXCODE2, ... DXCODE25.
I think this apply call does what you want:
icd_list = ['F32', 'F33', 'F34']
df['ICD_DX'] = df.apply(lambda row: 1 if row.isin(icd_list).any() else 0, axis=1)
This checks whether any value in your row is equal to one of the elements of icd_list.
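As a rough illustration of the same idea without a row-wise apply, isin can also be called on the whole DataFrame at once. The small DataFrame below (demo, with three DXCODE columns) is made-up sample data, not taken from the question. Note that isin compares whole values, so a code such as 'F32.1' does not count as a match for 'F32'.

```python
import numpy as np
import pandas as pd

# Made-up sample data: three DXCODE columns instead of 25
demo = pd.DataFrame({
    'DXCODE1': ['F32', 'A10', 'B20'],
    'DXCODE2': [np.nan, 'F34', 'F32.1'],
    'DXCODE3': ['C30', np.nan, np.nan],
})
icd_list = ['F32', 'F33', 'F34']

# isin() returns a boolean frame; any(axis=1) collapses each row to one flag
demo['ICD_DX'] = demo.isin(icd_list).any(axis=1).astype(int)
print(demo['ICD_DX'].tolist())
```

Missing values (NaN) simply compare as not-in-the-list here, so no special handling is needed for them.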
Edit: if you want to keep the for loops (sorry, I didn't notice that requirement at first), I would do it like this:
def scan_icd(row, icd_list):
    for i in range(1, 26):
        dx_code_loc = 'DXCODE' + str(i)
        value = row[dx_code_loc]
        # skip missing values: NaN is a float, so `in` would raise TypeError
        if not isinstance(value, str):
            continue
        for j in range(0, len(icd_list)):
            if icd_list[j] in value:
                return 1
    return 0  # return 0 if none match

icd_list = ['F32', 'F33', 'F34']
df['ICD_DX'] = df.apply(scan_icd, args=([icd_list]), axis=1)
# note that args holds icd_list wrapped in another list
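For what it's worth, apply forwards extra positional arguments through args and passes extra keyword arguments straight to the function, so the code list can also be given by name. A minimal sketch, where has_code, its columns parameter and the sample DataFrame are hypothetical names introduced only for illustration:

```python
import numpy as np
import pandas as pd

def has_code(row, icd_list, columns):
    """Return 1 if any listed column holds a string containing one of the codes."""
    for col in columns:
        value = row[col]
        if not isinstance(value, str):  # NaN is a float, so skip it
            continue
        if any(code in value for code in icd_list):
            return 1
    return 0

# Made-up sample data with two DXCODE columns
demo = pd.DataFrame({
    'DXCODE1': ['F321', 'A10'],
    'DXCODE2': [np.nan, 'B20'],
})
cols = ['DXCODE1', 'DXCODE2']

# Extra positional arguments go in a tuple via args=...
by_args = demo.apply(has_code, axis=1, args=(['F32', 'F33', 'F34'], cols))
# ...while extra keyword arguments are forwarded to the function unchanged
by_kwargs = demo.apply(has_code, axis=1, icd_list=['F32', 'F33', 'F34'], columns=cols)
```

Both calls should give the same result; the keyword form tends to be easier to read when the function takes more than one extra parameter.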
Edit 2: to restrict the check to specific columns, you can do the following:
list_col = ['DXCODE' + str(i) for i in range(1,26)]
df['ICD_DX'] = df.apply(lambda row: 1 if row[list_col].isin(icd_list).any() else 0, axis=1)
# the only difference is row[list_col]
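If substring matching is what is needed (as the `in` test in the question's loop does) rather than the exact equality that isin tests, one possible vectorized sketch uses str.contains on each selected column. The sample frame, the OTHER column and the pattern variable are assumptions made for illustration.

```python
import re

import numpy as np
import pandas as pd

# Made-up sample data; OTHER is not a DXCODE column and must be ignored
demo = pd.DataFrame({
    'DXCODE1': ['F32.1', 'A10', np.nan],
    'DXCODE2': [np.nan, 'B20', 'xF34'],
    'OTHER': ['F33', 'F33', 'F33'],
})
icd_list = ['F32', 'F33', 'F34']
list_col = ['DXCODE1', 'DXCODE2']

# Build one regex alternation; re.escape guards against special characters
pattern = '|'.join(re.escape(code) for code in icd_list)

# astype(str) turns NaN into the text 'nan', which matches none of these codes
hits = demo[list_col].apply(lambda col: col.astype(str).str.contains(pattern))
demo['ICD_DX'] = hits.any(axis=1).astype(int)
print(demo['ICD_DX'].tolist())
```

Casting to str first also avoids the .str accessor failing on a column that happens to contain only missing values.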