How to remove strings from a column that match strings in another column of a dataframe?
I have two dataframes. The first one, df1:
df1 = pd.DataFrame({
    'Sample': ['Sam1', 'Sam2', 'Sam3'],
    'Value': ['ak,b,c,k', 'd,k,e,b,f,a', 'am,x,y,z,a']
})
df1
It looks like:
Sample Value
0 Sam1 ak,b,c,k
1 Sam2 d,k,e,b,f,a
2 Sam3 am,x,y,z,a
The second one, df2:
df2 = pd.DataFrame({
    'Remove': ['ak', 'b', 'k', 'a', 'am']
})
df2
It looks like:
Remove
0 ak
1 b
2 k
3 a
4 am
I want to remove from df1['Value'] every string that matches a string in df2['Remove'].
The expected output is:
Sample Value
Sam1 c
Sam2 d,e,f
Sam3 x,y,z
The code I tried did not work for me.
Any help is appreciated, thanks.
You can use apply() to drop each item in the Value column of df1 if it appears in the Remove column of df2.
import pandas as pd
df1 = pd.DataFrame({
    'Sample': ['Sam1', 'Sam2', 'Sam3'],
    'Value': ['ak,b,c,k', 'd,k,e,b,f,a', 'am,x,y,z,a']
})
df2 = pd.DataFrame({'Remove': ['ak', 'b', 'k', 'a', 'am']})
# values to drop, as a plain Python list
remove_list = df2['Remove'].values.tolist()
def remove_value(row, remove_list):
    # keep only the comma-separated items that are not in remove_list
    keep_list = [val for val in row['Value'].split(',') if val not in remove_list]
    return ','.join(keep_list)
# axis=1 passes each row to remove_value
df1['Value'] = df1.apply(remove_value, axis=1, args=(remove_list,))
print(df1)
Sample Value
0 Sam1 c
1 Sam2 d,e,f
2 Sam3 x,y,z
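The question's original data contains a stray space ('am, x,y,z,a'), which an exact match on split(',') would carry through to the result. As a minimal sketch, assuming surrounding whitespace should be ignored, each token can be stripped before the lookup. The helper name remove_tokens and the variable remove_set are illustrative and not part of the original answer; a set is used here only to make the membership test cheaper.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Sample': ['Sam1', 'Sam2', 'Sam3'],
    'Value': ['ak,b,c,k', 'd,k,e,b,f,a', 'am, x,y,z,a'],  # note the space before x
})
df2 = pd.DataFrame({'Remove': ['ak', 'b', 'k', 'a', 'am']})

# Hypothetical name: a set of the values to drop
remove_set = set(df2['Remove'])

def remove_tokens(value, remove_set):
    """Split on commas, strip whitespace, drop tokens found in remove_set."""
    tokens = [tok.strip() for tok in value.split(',')]
    return ','.join(tok for tok in tokens if tok not in remove_set)

# Apply to the column directly, so no axis=1 row-wise apply is needed
df1['Value'] = df1['Value'].apply(remove_tokens, args=(remove_set,))
print(df1)
```

The order of the remaining items is preserved, since the list comprehension walks the tokens in their original order.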
Using apply as a one-liner:
df1['Value'] = df1['Value'].str.split(',').apply(lambda x:','.join([i for i in x if i not in df2['Remove'].values]))
Output:
>>> df1
Sample Value
0 Sam1 c
1 Sam2 d,e,f
2 Sam3 x,y,z
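For comparison, the same result can probably be reached without a Python-level lambda by exploding the split lists into one item per row, filtering with isin, and joining the items back per original row. This is only a sketch of an alternative, not the answerer's method; the names items and kept are illustrative, and the reindex step is an assumption about how rows that lose every item should look (an empty string).

```python
import pandas as pd

df1 = pd.DataFrame({
    'Sample': ['Sam1', 'Sam2', 'Sam3'],
    'Value': ['ak,b,c,k', 'd,k,e,b,f,a', 'am,x,y,z,a'],
})
df2 = pd.DataFrame({'Remove': ['ak', 'b', 'k', 'a', 'am']})

# One item per row; the original row label is repeated in the index
items = df1['Value'].str.split(',').explode().str.strip()

# Keep only items that do not appear in df2['Remove']
kept = items[~items.isin(df2['Remove'])]

# Join the surviving items back together per original row;
# rows with nothing left are filled with an empty string (assumption)
df1['Value'] = (
    kept.groupby(level=0).agg(','.join).reindex(df1.index, fill_value='')
)
print(df1)
```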
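This script will help you:
remove_set = set(df2['Remove'])
new_values = []
for elements in df1['Value']:
    # set difference: items of this row that are not in Remove
    new_values.append(list(set(elements.split(',')) - remove_set))
df1['Value'] = new_values  # assign once instead of chained indexing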
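Simply iterate over the dataframe and take the set difference between each row's items and the Remove values. Each row ends up as a list rather than a comma-separated string, and a set difference does not preserve the original order of the items.
The complete code looks like this: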
import pandas as pd
df1 = pd.DataFrame({
    'Sample': ['Sam1', 'Sam2', 'Sam3'],
    'Value': ['ak,b,c,k', 'd,k,e,b,f,a', 'am,x,y,z,a']
})
df2 = pd.DataFrame({
    'Remove': ['ak', 'b', 'k', 'a', 'am']
})
remove_set = set(df2['Remove'])
new_values = []
for elements in df1['Value']:
    # set difference: items of this row that are not in Remove
    new_values.append(list(set(elements.split(',')) - remove_set))
# assign the whole column once instead of chained df1['Value'][index] = ...
df1['Value'] = new_values
print(df1)
Output:
Sample Value
0 Sam1 [c]
1 Sam2 [e, d, f]
2 Sam3 [y, x, z]
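The set-based output above differs from the expected output in the question in two ways: each cell is a list, and the item order is arbitrary. As a rough sketch, assuming the set semantics (duplicates collapsed) are still wanted but the original order and a comma-separated string are needed, dict.fromkeys can stand in for set because it deduplicates while keeping first-seen order. The variable remove_set is an illustrative name.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Sample': ['Sam1', 'Sam2', 'Sam3'],
    'Value': ['ak,b,c,k', 'd,k,e,b,f,a', 'am,x,y,z,a'],
})
df2 = pd.DataFrame({'Remove': ['ak', 'b', 'k', 'a', 'am']})

remove_set = set(df2['Remove'])

# dict.fromkeys removes duplicates but keeps first-seen order,
# so the surviving items stay in the order they were written
df1['Value'] = [
    ','.join(tok for tok in dict.fromkeys(value.split(',')) if tok not in remove_set)
    for value in df1['Value']
]
print(df1)
```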