Extract dictionary value from column in data frame with Vaex
I applied the following command to my dataframe:
df['date_article'] = df.pagePath.str.extract_regex(pattern='(?P<digit>/\d{4}/\d{2}/\d{2}/)')
This created the column 'date_article':
pagePath | date_article
---|---
'/empresas/2021/10/22/tiendas-no-participan-buen' | {'digit': '/2021/10/22/'}
'/finanzas-personales/2021/10/22/pueden-cobrar-c | {'digit': '/2021/10/22/'}
Now I want to keep only the date string in 'date_article', without the surrounding dictionary.
Expected output:
pagePath | date_article
---|---
'/empresas/2021/10/22/tiendas-no-participan-buen' | '/2021/10/22/'
'/finanzas-personales/2021/10/22/pueden-cobrar-c | '/2021/10/22/'
I tried many things, but nothing seems to work.
Thank you in advance for your help.
How about the following:
df['date_article'] = df.apply(lambda x: x['digit'], axis=1)
It appears that extract_regex returns a struct series:
Extract substrings defined by a regular expression using Apache Arrow (Google RE2 library).
Parameters
pattern (str) – A regular expression which needs to contain named capture groups, e.g. 'letter' and 'digit' for the regular expression '(?P<letter>[ab])(?P<digit>\d)'.
Returns
an expression containing a struct with field names corresponding to capture group identifiers.
So you will need to extract the field you want out of the struct. I'm not a Vaex expert, but maybe something like:
struct_series = df.pagePath.str.extract_regex(pattern='(?P<digit>/\d{4}/\d{2}/\d{2}/)')
df['date_article'] = struct_series.struct.get('digit')
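To make the "struct per row" idea concrete without depending on Vaex, here is a minimal sketch using only Python's standard `re` module. It is an analogy, not Vaex's implementation: each named capture group becomes a key in a per-row dictionary, and picking out the `digit` key gives the plain string. The helper names `extract_struct` and `extract_field` are hypothetical and exist only for this illustration.

```python
import re

# Same pattern as in the question; the named group "digit" becomes the field name.
PATTERN = re.compile(r"(?P<digit>/\d{4}/\d{2}/\d{2}/)")


def extract_struct(path):
    """Return a dict keyed by capture-group name, or None if nothing matches."""
    match = PATTERN.search(path)
    return match.groupdict() if match else None


def extract_field(path, field="digit"):
    """Return only the named field as a plain string, or None if nothing matches."""
    struct = extract_struct(path)
    return struct[field] if struct is not None else None


# Sample paths copied from the question (the second one is truncated there too).
paths = [
    "/empresas/2021/10/22/tiendas-no-participan-buen",
    "/finanzas-personales/2021/10/22/pueden-cobrar-c",
]

structs = [extract_struct(p) for p in paths]  # what the column looks like now
dates = [extract_field(p) for p in paths]     # what the question asks for

print(structs)
print(dates)
```

The first list mirrors the current 'date_article' column (one dictionary per row) and the second mirrors the expected output. In Vaex, the struct field lookup plays the role that `extract_field` plays here.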
Use:
import pandas as pd
df = pd.DataFrame({'date_article':[{'digit': '/2021/10/22/'}]})
# Replace each dictionary with the value stored under its 'digit' key
df['date_article'] = df['date_article'].apply(lambda x: x['digit'])