Series regex extract produces a DataFrame

Question

I am working through a regex task on Dataquest. The following code snippet runs correctly inside the Dataquest IDE:

titles = hn["title"]
pattern = r'\[(\w+)\]'
tag_matches = titles.str.extract(pattern)
tag_freq = tag_matches.value_counts()
print(tag_freq, '\n')

However, on my PC running pandas 0.25.3, this exact same code block raises an error:

Traceback (most recent call last):
  File "C:/Users/Mark/PycharmProjects/main/main.py", line 63, in <module>
    tag_freq = tag_matches.value_counts()
  File "C:\Users\Mark\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pandas\core\generic.py", line 5179, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'value_counts'

Why is tag_matches coming back as a DataFrame? I am running extract against the Series titles.

Answer 1

From the docs for pandas.Series.str.extract:

A pattern with one group will return a Series if expand=False.

>>> s = pd.Series(['a1', 'b2', 'c3'])
>>> s.str.extract(r'[ab](\d)', expand=False)
0      1
1      2
2    NaN
dtype: object

So perhaps you need to be explicit and set expand=False to get a Series object?
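As far as I know, expand has defaulted to True since pandas 0.23, which would explain why the same code returns a Series in an older environment and a DataFrame on 0.25.3. Below is a minimal sketch of the two ways around it. The sample titles are made up for illustration and stand in for hn["title"], and the names as_frame and tag_freq_alt are not from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for hn["title"] from the question.
titles = pd.Series([
    "Show HN [video] a demo",
    "A paper on regex [pdf]",
    "Another talk [video]",
    "No tag here",
])
pattern = r'\[(\w+)\]'

# With the default (expand=True) the result has one column per capture
# group, so even a single group gives a DataFrame.
as_frame = titles.str.extract(pattern)

# Option 1: expand=False with a single capture group returns a Series,
# so value_counts() is available directly.
tag_matches = titles.str.extract(pattern, expand=False)
tag_freq = tag_matches.value_counts()

# Option 2: keep the DataFrame and select its only column (labelled 0).
tag_freq_alt = as_frame[0].value_counts()

print(type(as_frame).__name__, type(tag_matches).__name__)
print(tag_freq, '\n')
```

Titles without a bracketed tag come back as NaN, and value_counts() leaves those out by default, so both options count only the titles that matched.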

Solution 1
0 2019-12-16 15:01:45
