I want to extract certain parts of a string from a CSV file

Question

I have a vast number of columns containing this kind of data:

DE-JP-202/2066/A2@qwier.cu/68
NL-LK-02206/2136/A1@ozmmfts.de/731
OM-PH-31303222/3671/Z1@jtqy.ml/524

I would like to extract the string between '@' and '.' and the string between '.' and '/' into two separate columns.

Like:

txt 1      txt 2
qwier       cu
ozmmfts     de
jtqy        ml

Tried:

x = dane.str.extract(r'@(?P<txt1>\d)\.(?P<txt2>[ab\d])/')

But it doesn't work.

Answer 1

If you want to get 2 capturing groups, you could use 2 negated character classes.

In the first group, match 1+ times any char except a dot: [^.]+

In the second group, match 1+ times any char except a forward slash: [^/]+

@(?P<txt1>[^.]+)\.(?P<txt2>[^/]+)/
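To see the pattern at work, here is a minimal sketch using only Python's standard re module on the three sample values from the question. The names rows, pattern and extracted are chosen for illustration. For context, the original attempt fails because \d matches exactly one digit and [ab\d] matches exactly one character, so neither can match a run of letters.

```python
import re

# Sample values copied from the question.
rows = [
    "DE-JP-202/2066/A2@qwier.cu/68",
    "NL-LK-02206/2136/A1@ozmmfts.de/731",
    "OM-PH-31303222/3671/Z1@jtqy.ml/524",
]

# txt1: everything after '@' up to the first dot.
# txt2: everything after that dot up to the next forward slash.
pattern = re.compile(r"@(?P<txt1>[^.]+)\.(?P<txt2>[^/]+)/")

# One dict per row, keyed by the named groups (assumes every row matches).
extracted = [pattern.search(row).groupdict() for row in rows]

for item in extracted:
    print(item["txt1"], item["txt2"])
```

The same pattern string should drop into the call from the question, dane.str.extract(...), where the named groups become the column names of the result.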


Answer 2

If all of your strings contain only one @ and only one ., you can do the following:

s = 'DE-JP-202/2066/A2@qwier.cu/68'

column1 = s.split('@')[1].split('.')[0]

column2 = s.split('@')[1].split('.')[1].split('/')[0]
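The same splitting steps can be wrapped in a small helper and applied to every value. This is only a sketch under the assumption stated above (one '@' and one '.' per string); split_domain, rows and pairs are hypothetical names introduced here.

```python
def split_domain(s):
    """Return the two parts between '@', '.' and '/' as a tuple.

    Assumes the string contains exactly one '@' and one '.'.
    """
    after_at = s.split('@')[1]           # e.g. 'qwier.cu/68'
    name, rest = after_at.split('.', 1)  # 'qwier', 'cu/68'
    suffix = rest.split('/')[0]          # 'cu'
    return name, suffix


# Sample values copied from the question.
rows = [
    "DE-JP-202/2066/A2@qwier.cu/68",
    "NL-LK-02206/2136/A1@ozmmfts.de/731",
    "OM-PH-31303222/3671/Z1@jtqy.ml/524",
]

pairs = [split_domain(row) for row in rows]
for column1, column2 in pairs:
    print(column1, column2)
```

A string without an '@' or a '.' raises an error here (IndexError or ValueError), which may be preferable to silently producing wrong columns.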
