Separate a string with regex and pandas

Question

I have the content below and I need to extract the third part of each string with pandas in Python, as shown:

My string:

FA0003 -BL- FA0005-BL
FA0004-BL-FA0008-BL

Expected output:

FA0005
FA0008

Imagine I have a string like this in a column named A. The regex below retrieves FA0003 from it, but I don't know how to retrieve FA0005.

FA0003 -BL- FA0005-BL
df['A'].str.extract(r'(\w+\s*)', expand=False)
FA0003

Answer 1

You can use

^(?:[^-]*-){2}\s*([^-]+)

See the regex demo.

In Pandas, use it with your current code:

df['A'].str.extract(r'^(?:[^-]*-){2}\s*([^-]+)', expand=False)

Details

^ - start of string
(?:[^-]*-){2} - two occurrences of zero or more chars other than - followed by a -
\s* - zero or more whitespace characters (this trims leading whitespace from the output)
([^-]+) - Capturing group 1 (the return value): one or more chars other than -.
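
To check the pattern end to end, here is a minimal sketch that rebuilds the two sample strings from the question in a DataFrame. The column name "A" follows the question's description, and the variable names (df, pattern, third_part) are only illustrative.

```python
import pandas as pd

# Sample strings copied from the question; "A" is the column name it describes.
df = pd.DataFrame({"A": ["FA0003 -BL- FA0005-BL", "FA0004-BL-FA0008-BL"]})

# Skip two "-"-terminated fields, skip any whitespace, then capture up to the next "-".
pattern = r"^(?:[^-]*-){2}\s*([^-]+)"

# expand=False returns a Series because the pattern has a single capturing group.
third_part = df["A"].str.extract(pattern, expand=False)
print(third_part.tolist())  # ['FA0005', 'FA0008']
```

Note that \s* only trims whitespace before the captured value. If the real data can have a space between the third part and the following -, chaining .str.strip() onto the result should take care of it.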

Answer 1: 3, accepted, 2020-11-23 13:16:01