Splitting an object dtype column in pandas

My DataFrame has a column whose values contain multiple delimiters (, and =) and a mix of int and str.

The column dtype is object (it does not convert to string).

Each cell of the column contains text like this:

Network=115,MEID=115,Function=115,Area=1806

I want to split it on the delimiter "=" to get the Area value. Is there any way of doing this?

To handle the general case where Area=xxxx can appear anywhere in the cell, we can use str.extract() with a regular expression (regex), as follows:

df['Area'] = df['Col1'].str.extract(r'Area=(?P<Area>[^,=]*)')

Test Run

Test data construction:

import pandas as pd
data = {'Col1': ['Network=115,MEID=115,Function=115,Area=1806', 'Network=120,MEID=116,Area=1820,Function=116']}
df = pd.DataFrame(data)

print(df)

                                          Col1
0  Network=115,MEID=115,Function=115,Area=1806
1  Network=120,MEID=116,Area=1820,Function=116

Run the new code:

df['Area'] = df['Col1'].str.extract(r'Area=(?P<Area>[^,=]*)')

print(df)


                                          Col1  Area
0  Network=115,MEID=115,Function=115,Area=1806  1806
1  Network=120,MEID=116,Area=1820,Function=116  1820
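Note that str.extract() returns the captured text as strings, so the new Area column is not numeric yet. If numbers are needed, a minimal sketch with pd.to_numeric() could look like the following. It assumes the same Col1 test data as above; the third row without an Area and the column name Area_num are added only for illustration.

```python
import pandas as pd

data = {'Col1': ['Network=115,MEID=115,Function=115,Area=1806',
                 'Network=120,MEID=116,Area=1820,Function=116',
                 'Network=130,MEID=117,Function=117']}  # no Area here (illustrative row)
df = pd.DataFrame(data)

# expand=False returns a Series instead of a one-column DataFrame
area = df['Col1'].str.extract(r'Area=(?P<Area>[^,=]*)', expand=False)

# errors='coerce' turns missing or non-numeric values into NaN
df['Area_num'] = pd.to_numeric(area, errors='coerce')

print(df['Area_num'])
```

Rows where the pattern does not match end up as NaN, so the resulting column is a float column rather than an integer one.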

Regex Explanation:

Area= matches the parameter name Area= literally

(?P<Area> opens a capturing group named Area

[^,=]* matches 0 or more characters from the class [^,=], i.e. any character other than , or =

) closes the named capturing group
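To see the pieces of the pattern at work outside pandas, here is a small sketch using Python's standard re module with the same regex. The sample strings are the test data from above plus one illustrative string without an Area; the variable names are arbitrary.

```python
import re

# Same pattern as in the answer, compiled with the standard re module
pattern = re.compile(r'Area=(?P<Area>[^,=]*)')

samples = [
    'Network=115,MEID=115,Function=115,Area=1806',  # Area at the end
    'Network=120,MEID=116,Area=1820,Function=116',  # Area in the middle
    'Network=130,MEID=117,Function=117',            # no Area (illustrative)
]

areas = []
for s in samples:
    m = pattern.search(s)
    # group('Area') returns the text captured by the named group;
    # the capture stops at the next , or = because of [^,=]*
    areas.append(m.group('Area') if m else None)

print(areas)  # ['1806', '1820', None]
```

Because the pattern is not anchored, it would also match inside a longer key that happens to end in Area (a hypothetical SubArea=, for instance). If such keys can occur, prefixing the pattern with (?:^|,) is one way to require that Area starts a field.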
