
Extract part of a column using regex or split in Python

Hello, I have a DataFrame such as:

COL1   COL2
G1     QANH010008.1:18255-18820(-):Hab_ob
G1     QANH010002:7-10(-):Hab_ob

and I would like to create two new columns, COL3 and COL4, holding the number before the first - and the number after the first -, respectively.

The output should be:

COL1   COL2                                COL3   COL4
G1     QANH010008.1:18255-18820(-):Hab_ob  18255  18820
G1     QANH010002:7-10(-):Hab_ob           7      10

You can use named capturing groups for this, then join the result to the original DataFrame. This answer incorporates a couple of suggestions from @MarkWang.

df.join(df['COL2'].str.extract(r'(?P<COL3>\d+)\-(?P<COL4>\d+)')) 

Output:

Out[206]: 
  COL1                                COL2   COL3   COL4
0   G1  QANH010008.1:18255-18820(-):Hab_ob  18255  18820
1   G1           QANH010002:7-10(-):Hab_ob      7     10
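
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the question, and the names `coords` and `result` are only illustrative. Note that `str.extract` returns strings, so the sketch also converts the two new columns to integers, which assumes every row in COL2 matches the pattern.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "COL1": ["G1", "G1"],
    "COL2": [
        "QANH010008.1:18255-18820(-):Hab_ob",
        "QANH010002:7-10(-):Hab_ob",
    ],
})

# The named groups become the column names of the extracted frame.
# Only the first "digits-digits" match in each string is captured.
coords = df["COL2"].str.extract(r"(?P<COL3>\d+)-(?P<COL4>\d+)")

# str.extract yields strings; convert to integers.
# Assumption: every row matches, otherwise NaN values would make astype(int) fail.
coords = coords.astype(int)

# Join on the index to attach the new columns to the original frame.
result = df.join(coords)
print(result)
```

If some rows might not match, leaving the columns as strings or using `pd.to_numeric` with `errors="coerce"` may be a safer choice than `astype(int)`.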
