Count occurrences of strings in a row in pandas
I'm trying to count the number of instances of a certain string in each row of a pandas DataFrame.
In the example here I used a lambda function and pandas .str.count() to try to count the number of times 'True' appears in each row.
However, instead of a count of 'True', it just returns a boolean indicating whether 'True' exists in the row.
import pandas as pd
# create dataframe
d = {'Period': [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
'Result': ['True','None','False','True','False','True','False','True','True','False','False','True','False','True','False','False'],
'Result1': ['True','None','False','True','False','True','False','True','True','False','False','True','False','True','False','False'],
'Result2': ['True','None','False','True','False','True','False','True','True','False','False','True','False','True','False','False']}
df = pd.DataFrame(data=d)
# count instances of 'True' in each row
df['Count'] = df.apply(lambda row: row.astype(str).str.count('True').any(), axis=1)
print(df)
The desired outcome is:
Period Result Result1 Result2 Count
1 True True True 3
2 None None None 0
3 False False False 0
4 True True True 3
1 False False False 0
2 True True True 3
3 False False False 0
... ... ... ... ......
You can use np.where:
import numpy as np
df['count'] = np.where(df == 'True', 1, 0).sum(axis=1)
Regarding why your apply returns a boolean: both any and all return booleans, not numbers.
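To make the difference concrete, here is a minimal sketch on a single row (the `row` Series below is made up for illustration and is not part of the question's data). It applies the same `.astype(str).str.count('True')` step as the question, then compares `.any()` with `.sum()`:

```python
import pandas as pd

# Hypothetical single row, shaped like one row of the question's DataFrame
row = pd.Series({"Period": 1, "Result": "True", "Result1": "True", "Result2": "False"})

# Number of times 'True' appears in each cell: 0, 1, 1, 0
per_cell = row.astype(str).str.count("True")

exists = per_cell.any()  # collapses to a boolean: is any count non-zero?
total = per_cell.sum()   # adds the per-cell counts together

print(exists, total)
```

Swapping `.any()` for `.sum()` in the original lambda should therefore give a count rather than a flag.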
Edit: You can include df.isin for multiple conditions:
df['count'] = np.where(df.isin(['True', 'False']), 1, 0).sum(axis=1)
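As a small check of the `isin` variant, the sketch below uses a three-row frame invented for illustration. It also shows that summing the boolean mask directly appears to give the same counts, since `True` is treated as 1 when summed; that shortcut is my own addition rather than part of the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, smaller than the question's frame
df = pd.DataFrame({
    "Period": [1, 2, 3],
    "Result": ["True", "None", "False"],
    "Result1": ["True", "None", "True"],
    "Result2": ["False", "None", "False"],
})

mask = df.isin(["True", "False"])  # True where a cell is either string

via_where = np.where(mask, 1, 0).sum(axis=1)  # NumPy array of row counts
via_mask = mask.sum(axis=1)                   # pandas Series of row counts

print(via_where.tolist(), via_mask.tolist())
```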
Use eq with sum:
df.eq("True").sum(axis=1)
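A minimal sketch of how this might be used end to end, assuming you only want to look at the result columns and store the count in a new column. The sample frame and the `result_cols` name are illustrative, not taken from the question:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    "Period": [1, 2, 3],
    "Result": ["True", "None", "False"],
    "Result1": ["True", "None", "True"],
    "Result2": ["True", "None", "False"],
})

# Restrict the comparison to the columns that hold the strings
result_cols = ["Result", "Result1", "Result2"]
df["Count"] = df[result_cols].eq("True").sum(axis=1)

print(df)
```

Selecting the columns explicitly keeps unrelated columns such as `Period` out of the comparison.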
Use apply with a lambda function:
df.apply(lambda x: x.eq("True").sum(), axis=1)
To match more than one string, try:
df.iloc[:,1:].apply(lambda x: x.str.contains("True|False")).sum(axis=1)
Avoid using apply, as it can be slow:
df[["Result", "Result1", "Result2"]].sum(axis=1).str.count("True")
This also works when your strings look like:
"this sentence contains True"
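One caveat worth checking with substring counting: if the cells of a row are joined into one string first, a match can form across a cell boundary. The sketch below uses invented data to compare joining two columns with `+` against counting in each column separately and then summing. The column-wise `apply` runs once per column rather than once per row, so it should avoid most of the per-row overhead.

```python
import pandas as pd

# Hypothetical data; the second row is contrived to straddle a boundary
df = pd.DataFrame({
    "Result": ["this sentence contains True", "Tr"],
    "Result1": ["True and True", "ue"],
})

# Join the cells, then count: "Tr" + "ue" becomes "True"
joined = (df["Result"] + df["Result1"]).str.count("True")

# Count inside each column, then add the counts across columns
per_column = df.apply(lambda col: col.str.count("True")).sum(axis=1)

print(joined.tolist(), per_column.tolist())
```

If your data cannot contain such fragments, the two approaches should agree.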
Your lambda is not working correctly; try this:
import pandas as pd
#create dataframe
d = {'Period': [1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4],
'Result': ['True','None','False','True','False','True','False','True','True','False','False','True','False','True','False','False'],
'Result1': ['True','None','False','True','False','True','False','True','True','False','False','True','False','True','False','False'],
'Result2': ['True','None','False','True','False','True','False','True','True','False','False','True','False','True','False','False']}
df = pd.DataFrame(data=d)
# count instances of 'True' in the three result columns of each row
df['Count'] = df.apply(lambda row: sum(row.iloc[1:4] == 'True'), axis=1)
print(df)
# Output:
# >> Period Result Result1 Result2 Count
# >> 0 1 True True True 3
# >> 1 2 None None None 0
# >> 2 3 False False False 0
# >> 3 4 True True True 3
# >> 4 1 False False False 0
# >> 5 2 True True True 3
# >> 6 3 False False False 0
# >> 7 4 True True True 3
# >> 8 1 True True True 3
# >> 9 2 False False False 0
# >> 10 3 False False False 0
# >> 11 4 True True True 3
# >> 12 1 False False False 0
# >> 13 2 True True True 3
# >> 14 3 False False False 0
# >> 15 4 False False False 0