How to sum up all values in a row where the column contains a specific string?
So I have a dataframe of categories of venues in each neighbourhood. It looks like:
The values in each row represent the number of venues of each category in that neighbourhood.
I want to find out the total number of restaurants in each neighbourhood. To do so, I know I have to sum up the values in a row where the column name contains the string "Restaurant".
I've tried using the str.contains function, but that sums up the True cases, i.e. how many times a column containing the string restaurant has a value >0 in that row. What I'd like instead is the total number of restaurants in the neighbourhood.
Here's a way to do that:
import pandas as pd
df = pd.DataFrame({"restaurant_a": [1,2,3], "shop": [2,3,4], "restaurant_b": [4,5,6]})
# Sum, row by row, every column whose name contains "restaurant"
df["sum_rest"] = df[[x for x in df.columns if "restaurant" in x]].sum(axis="columns")
df
The result is:
   restaurant_a  shop  restaurant_b  sum_rest
0             1     2             4         5
1             2     3             5         7
2             3     4             6         9
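A possible shorter variant of the same idea, shown only as a sketch: `DataFrame.filter(like=...)` keeps the columns whose label contains a substring, so the list comprehension is not needed. It reuses the toy frame from this answer; the match is case-sensitive, so the substring has to be spelled the way it appears in the real column names.

```python
import pandas as pd

# Same toy frame as in the answer above.
df = pd.DataFrame({"restaurant_a": [1, 2, 3], "shop": [2, 3, 4], "restaurant_b": [4, 5, 6]})

# filter(like=...) keeps the columns whose label contains the substring.
df["sum_rest"] = df.filter(like="restaurant").sum(axis=1)
print(df)
```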
You can use pd.Index.str.contains with df.loc here.
df['sum_rest'] = df.loc[:,df.columns.str.contains('Restaurant')].sum(axis=1)
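To make the difference between counting matches and summing values concrete, here is a small sketch. The data and column names are made up for illustration, and `case=False` is an optional extra that makes the match ignore capitalisation; it is not part of the line above.

```python
import pandas as pd

# Hypothetical venue counts; the column names are made up for illustration.
df = pd.DataFrame(
    {
        "Thai Restaurant": [2, 0, 1],
        "Park": [1, 1, 0],
        "fast food restaurant": [3, 1, 0],
    }
)

# Boolean array over the column labels; case=False also matches "restaurant".
mask = df.columns.str.contains("restaurant", case=False)

# Summing a comparison counts the non-zero restaurant columns per row...
nonzero_count = (df.loc[:, mask] > 0).sum(axis=1)

# ...whereas summing the selected columns themselves gives the total.
df["sum_rest"] = df.loc[:, mask].sum(axis=1)
print(df)
```

The first sum is probably what the attempt in the question computed; the second one is the total number of restaurants per neighbourhood.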
Define a list of columns containing "Restaurant":
lr = ["Afgan Restaurant", "American Restaurant", "Argentinian Restaurant"]
Then sum those columns row by row and put the result in a new column:
import numpy as np
df["sum_restaurant"] = df.loc[:, lr].apply(lambda row: np.sum(row.to_numpy()), axis=1)
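The list-based approach can probably be written without apply as well. The sketch below assumes a small hypothetical frame; only the three restaurant column names are taken from this answer, while the "Bakery" column and all the counts are invented for the example.

```python
import pandas as pd

# Hypothetical counts; only the three restaurant column names come from the answer.
df = pd.DataFrame(
    {
        "Afgan Restaurant": [0, 1],
        "American Restaurant": [2, 3],
        "Argentinian Restaurant": [1, 0],
        "Bakery": [4, 5],
    }
)

lr = ["Afgan Restaurant", "American Restaurant", "Argentinian Restaurant"]

# Selecting with a list of labels and summing across columns avoids apply().
df["sum_restaurant"] = df[lr].sum(axis=1)
print(df)
```

Summing with `sum(axis=1)` is vectorised, so on a frame with many neighbourhoods it should be faster than a row-wise lambda.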