Is there a Python function for counting the number of strings in a cell and reporting these in a new dataframe?
Let's say I have a grocery list with one column titled "Groceries". Each row contains a comma-separated list of strings, for example:
Groceries |
---|
apples, bananas, oranges |
apples, bananas, bananas, pears |
oranges, pears, bananas |
Is there a way to count each string and add a "tally" in a new dataframe (or something similar), with one appropriately labeled column per item? The dataframe would then look like:
apples | oranges | bananas | pears |
---|---|---|---|
1 | 1 | 1 | 0 |
1 | 0 | 2 | 1 |
0 | 1 | 1 | 1 |
I can't find a function that will recognize the strings and count them in the appropriate row and column, labeled with the string name. I am also pretty new to Python and am not sure what would go into writing a function that does this.
You can `split` the string on commas, `explode` to multiple rows, use `get_dummies` to transform to 0/1, and `groupby.sum` to aggregate:
out = (pd
   .get_dummies(df['Groceries'].str.split(r',\s*').explode())
.groupby(level=0).sum()
)
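The chain above can be easier to follow when each step is spelled out. The following is a minimal sketch, assuming the column holds plain comma-separated strings; the sample `df` is rebuilt from the question, and the intermediate names (`lists`, `items`, `dummies`) are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to match the real frame).
df = pd.DataFrame({
    "Groceries": [
        "apples, bananas, oranges",
        "apples, bananas, bananas, pears",
        "oranges, pears, bananas",
    ]
})

# 1. Split each cell on a comma followed by optional whitespace,
#    giving one Python list per row.
lists = df["Groceries"].str.split(r",\s*")

# 2. Put every list element on its own row; the original row label
#    is repeated in the index, once per item.
items = lists.explode()

# 3. Build one indicator column per distinct item.
dummies = pd.get_dummies(items)

# 4. Add the indicators up per original row label to get the counts.
out = dummies.groupby(level=0).sum()
print(out)
```

Because `explode` keeps the original row label for every item, `groupby(level=0)` is what folds the items back into one row per grocery list.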
Or, similarly, with `crosstab`:
s = df['Groceries'].str.split(r',\s*').explode()
out = pd.crosstab(s.index, s)
Output:
apples bananas oranges pears
0 1 1 1 0
1 1 2 0 1
2 0 1 1 1
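For comparison, the same tally can be built outside pandas with `collections.Counter` from the standard library. This is a different technique from the two shown above, not part of the original answer, and is only a sketch under the same assumption that every cell is a comma-separated string; the sample `df` is again rebuilt from the question.

```python
from collections import Counter

import pandas as pd

# Sample data rebuilt from the question (assumed to match the real frame).
df = pd.DataFrame({
    "Groceries": [
        "apples, bananas, oranges",
        "apples, bananas, bananas, pears",
        "oranges, pears, bananas",
    ]
})

# Count the items in each cell; strip() drops the space after each comma.
row_counts = [
    Counter(item.strip() for item in cell.split(","))
    for cell in df["Groceries"]
]

# Each Counter becomes one row. Items missing from a row show up as NaN,
# so fill those with 0 and convert back to integers.
tally = pd.DataFrame(row_counts, index=df.index).fillna(0).astype(int)
print(tally)
```

This keeps the counting logic in plain Python, which may be easier to adapt if the cells need extra cleaning, at the cost of being slower than the vectorized versions on large frames.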