Count character occurrences for each Pandas DataFrame record

Question

I have a data frame with rows that look like the following:

Section Title                          ...
==========================================
4.1.1   4.1.1 Requirements allocation. ...
4.1.2   4.1.2 Safety.                  ...
4.1.3   4.1.3 Warnings.                ...

I am trying to count the number of periods (.) in the Section column, so I wrote this line:

df['Subsections'] = df.Section.str.count(".")

However, the Subsections column returns 5 rather than the 2 I would expect for the first record, since it contains two periods (.). Is there some nuance I am missing here?

Answer 1

By design, Series.str.count(pat, flags=0) interprets the pat parameter as a regular expression pattern (see the source code). In a regular expression, an unescaped . matches any character, so "." counts every character in the string, which is why "4.1.1" gives 5. You need to escape the . character with \ so that it matches a literal period:

>>> df.Section.str.count(r"\.")
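To make the difference concrete, here is a minimal sketch. The three-row frame is an assumption that mirrors only the Section values shown in the question, and the variable names are illustrative. It compares the unescaped pattern with a few ways of matching a literal period:

```python
import re

import pandas as pd

# Assumed sample data: only the Section values shown in the question.
df = pd.DataFrame({"Section": ["4.1.1", "4.1.2", "4.1.3"]})

# Unescaped "." is a regex wildcard, so this counts every character.
any_char = df["Section"].str.count(".")

# Escaped in a raw string: counts literal periods only.
escaped = df["Section"].str.count(r"\.")

# re.escape builds the escaped pattern from the literal text.
via_re_escape = df["Section"].str.count(re.escape("."))

# A character class is another way to match a literal period.
char_class = df["Section"].str.count("[.]")

print(any_char.tolist())  # [5, 5, 5]
print(escaped.tolist())   # [2, 2, 2]
```

The raw string (r"\.") avoids the invalid escape sequence warning that newer Python versions raise for "\." in a regular string literal. re.escape is probably the safer choice when the character to count comes from a variable rather than being hard-coded.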

Solution 1
3 · Accepted · 2022-09-26 16:03:38
