Splitting an object dtype column in pandas

My DataFrame has a column whose values contain multiple delimiters ("," and "=") and a mix of integers and strings.

The column's dtype is object (it has not been converted to string).

Each cell in the column contains text like this:

Network=115,MEID=115,Function=115,Area=1806

I want to split it on the "=" delimiter to get the Area value. Is there a way to do this?

So that Area=xxxx can be found anywhere in the cell, we can use str.extract() with a regular expression (regex), as follows:

df['Area'] = df['Col1'].str.extract(r'Area=(?P<Area>[^,=]*)')

Test Run

Test data construction:

import pandas as pd

data = {'Col1': ['Network=115,MEID=115,Function=115,Area=1806', 'Network=120,MEID=116,Area=1820,Function=116']}
df = pd.DataFrame(data)

print(df)

                                          Col1
0  Network=115,MEID=115,Function=115,Area=1806
1  Network=120,MEID=116,Area=1820,Function=116

Run new code

df['Area'] = df['Col1'].str.extract(r'Area=(?P<Area>[^,=]*)')

print(df)


                                          Col1  Area
0  Network=115,MEID=115,Function=115,Area=1806  1806
1  Network=120,MEID=116,Area=1820,Function=116  1820
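Note that str.extract() returns text, so the new Area column holds strings such as '1806' rather than numbers. If numeric values are needed, one possible follow-up is pd.to_numeric(). The sketch below is only an illustration: the third row (with no Area= entry) is a hypothetical addition to the test data, included to show that unmatched rows come out as NaN.

```python
import pandas as pd

data = {'Col1': ['Network=115,MEID=115,Function=115,Area=1806',
                 'Network=120,MEID=116,Area=1820,Function=116',
                 'Network=130,MEID=117,Function=117']}  # hypothetical row without Area=
df = pd.DataFrame(data)

# expand=False returns a Series (not a one-column DataFrame) for a single group
area_text = df['Col1'].str.extract(r'Area=([^,=]*)', expand=False)

# Convert the extracted text to numbers; rows with no match stay NaN
df['Area'] = pd.to_numeric(area_text, errors='coerce')

print(df['Area'])
```

Because of the NaN in the unmatched row, the resulting column is floating point rather than integer.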

Regex Explanation:

Area= matches the literal text Area=

(?P<Area> opens a capturing group named Area

[^,=]* matches zero or more characters from the class [^,=], that is, any character other than , or =

) closes the named capturing group
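The pieces of the pattern described above can also be tried outside pandas with Python's built-in re module. This is only a sketch for experimenting with the regex. The SubArea key and the stricter pattern are hypothetical, not part of the original answer; they illustrate that a bare Area= also matches inside a longer key name, which can be avoided by requiring the start of the string or a comma before it.

```python
import re

# Same pattern as in the answer
pattern = re.compile(r'Area=(?P<Area>[^,=]*)')

cells = ['Network=115,MEID=115,Function=115,Area=1806',
         'Network=120,MEID=116,Area=1820,Function=116']

# Read the captured value by its group name
areas = [pattern.search(cell).group('Area') for cell in cells]
print(areas)

# Hypothetical cell with a key that merely ends in "Area"
tricky = 'SubArea=9,Area=1806'
loose = pattern.search(tricky).group('Area')

# Stricter variant (an assumption): Area= must start the string or follow a comma
strict_pattern = re.compile(r'(?:^|,)Area=(?P<Area>[^,=]*)')
strict = strict_pattern.search(tricky).group('Area')
print(loose, strict)
```

The stricter pattern can be passed to str.extract() in the same way as the original one if such key names can occur in the data.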
