How to find text matches in a pandas DataFrame column and count them
I have a pandas DataFrame like this:
Text                 like=>apple  not=>here
i like apple                   0          0
i do not like pears            0          0
one two three                  0          0
something here                 0          0
something not here             0          0
vla bla bla                    0          0
I need to fill the columns so that it looks like this:
Text                 like=>apple  not=>here
i like apple                   1          0
i do not like pears            0          0
one two three                  0          0
something here                 0          0
something not here             0          1
vla bla bla                    0          0
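I don't know the column names in advance, apart from the "Text" column. I need to read the column names and count how often the text each one describes occurs in the "Text" column.
My only idea is to take the list of all columns except "Text" and iterate row by row and column by column, filling in the values, but I suspect there is a better way.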
You don't need iterrows; you can do this one column at a time with a vectorized string operation.
In [41]: df.columns.to_series().drop('Text').values
Out[41]: array(['like=>apple', 'not=>here'], dtype=object)

In [42]: for ele in df.columns.to_series().drop('Text'):
    ...:     # 'like=>apple' becomes the phrase 'like apple'
    ...:     column_name = ele.replace('=>', ' ')
    ...:     # count occurrences of the phrase in each row of Text
    ...:     df[ele] = df.Text.str.count(column_name)
    ...:
In [43]: df
Out[43]:
                  Text  like=>apple  not=>here
0         i like apple            1          0
1  i do not like pears            0          0
2        one two three            0          0
3       something here            0          0
4   something not here            0          1
5          vla bla bla            0          0
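
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. The helper name `fill_match_counts` and its parameters are made up for illustration and are not from the answer above. One deliberate difference: `Series.str.count` interprets its argument as a regular expression, so the sketch passes the phrase through `re.escape` on the assumption that column names should be matched literally. If your column names are meant to be patterns, drop that call.

```python
import re

import pandas as pd

# Sample data from the question; the two count columns start at 0.
df = pd.DataFrame({
    "Text": [
        "i like apple",
        "i do not like pears",
        "one two three",
        "something here",
        "something not here",
        "vla bla bla",
    ],
    "like=>apple": 0,
    "not=>here": 0,
})


def fill_match_counts(frame, text_col="Text", sep="=>"):
    """Hypothetical helper: fill every non-text column with phrase counts.

    Each column name such as 'like=>apple' is turned into the phrase
    'like apple', and that phrase is counted in every row of text_col.
    """
    result = frame.copy()  # leave the caller's DataFrame untouched
    for col in result.columns.drop(text_col):
        phrase = col.replace(sep, " ")
        # str.count takes a regex, so escape the phrase to match it literally
        result[col] = result[text_col].str.count(re.escape(phrase))
    return result


filled = fill_match_counts(df)
print(filled)
```

The loop runs once per column rather than once per row, so the per-row work stays inside pandas' string methods.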