Counting Character Occurrences for Each Pandas Dataframe Record
I have a data frame with rows that look like the following:
Section  Title                           ...
============================================
4.1.1    4.1.1 Requirements allocation.  ...
4.1.2    4.1.2 Safety.                   ...
4.1.3    4.1.3 Warnings.                 ...
I am trying to count the number of periods (.) in the Section column, so I wrote this line:
df['Subsections'] = df.Section.str.count(".")
However, the Subsections column returns 5 rather than the 2 I would expect for the first record, since it contains two periods (.). Is there some little nuance I am missing here?
By design, Series.str.count(pat, flags=0) interprets the pat parameter as a regular expression pattern (see the source code). In a regular expression, an unescaped . matches any character, which is why each five-character section number returns 5. So you need to explicitly escape the . character with \ to match a literal period:
>>> df.Section.str.count(r"\.")
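For comparison, here is a minimal, self-contained sketch of the difference. The data frame is rebuilt by hand from the three rows shown in the question, and the variable names (unescaped, escaped, via_re_escape) are illustrative only. Using re.escape is an alternative not mentioned above; it can be handy when the character to count comes from a variable.

```python
import re

import pandas as pd

# Hypothetical reconstruction of the rows shown in the question.
df = pd.DataFrame({
    "Section": ["4.1.1", "4.1.2", "4.1.3"],
    "Title": [
        "4.1.1 Requirements allocation.",
        "4.1.2 Safety.",
        "4.1.3 Warnings.",
    ],
})

# Unescaped "." is a regex wildcard, so it matches every character.
unescaped = df["Section"].str.count(".")

# Escaping the period (raw string avoids Python's own escape handling)
# makes the pattern match literal periods only.
escaped = df["Section"].str.count(r"\.")

# re.escape builds the escaped pattern for us.
via_re_escape = df["Section"].str.count(re.escape("."))

df["Subsections"] = escaped
print(df[["Section", "Subsections"]])
```

With these rows, the unescaped pattern counts 5 for every record, while both escaped variants count 2.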