Create a new column with values from another column in pandas
Hi, I have a dataframe such as:
Groups Names COLs COLe
G1 ABC_DEF.1:2-300():Canis_lupus 2 300
G1 SDDD1 NA NA
G1 SKUD.2. NA NA
G1 SEQUENCE3 NA NA
G1 ABC_DEF.1:400-600():Canis_lupus 400 600
G1 IJK_LMN.1:20-200():Bos_taurus 20 200
G2 OP_D:500-1000():Felis_catus 500 1000
G2 JDJDJ99 NA NA
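I would like to add a new column, Names2, so that within each group every Names value that contains () is paired with each Names value in that group that does not contain ().
The output would be: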
Groups Names Names2 COLs COLe
G1 ABC_DEF.1:2-300():Canis_lupus SDDD1 2 300
G1 ABC_DEF.1:2-300():Canis_lupus SKUD.2. 2 300
G1 ABC_DEF.1:2-300():Canis_lupus SEQUENCE3 2 300
G1 ABC_DEF.1:400-600():Canis_lupus SDDD1 400 600
G1 ABC_DEF.1:400-600():Canis_lupus SKUD.2. 400 600
G1 ABC_DEF.1:400-600():Canis_lupus SEQUENCE3 400 600
G1 IJK_LMN.1:20-200():Bos_taurus SDDD1 20 200
G1 IJK_LMN.1:20-200():Bos_taurus SKUD.2. 20 200
G1 IJK_LMN.1:20-200():Bos_taurus SEQUENCE3 20 200
G2 OP_D:500-1000():Felis_catus JDJDJ99 500 1000
Does anyone have an idea of how to do this with pandas?
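import pandas as pd
# Rows whose Names contain the literal "()" keep their COLs/COLe coordinates.
df1 = df[df.Names.str.contains('()', regex=False)]
# Rows without "()" supply the Names2 values; only Groups and Names are needed.
df2 = df[~df.Names.str.contains('()', regex=False)][['Groups', 'Names']]
print(pd.merge(left=df1, right=df2, on='Groups').rename(columns={"Names_x": "Names", "Names_y": "Names2"}))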
Prints:
Groups Names COLs COLe Names2
0 G1 ABC_DEF.1:2-300():Canis_lupus 2.0 300.0 SDDD1
1 G1 ABC_DEF.1:2-300():Canis_lupus 2.0 300.0 SKUD.2.
2 G1 ABC_DEF.1:2-300():Canis_lupus 2.0 300.0 SEQUENCE3
3 G1 ABC_DEF.1:400-600():Canis_lupus 400.0 600.0 SDDD1
4 G1 ABC_DEF.1:400-600():Canis_lupus 400.0 600.0 SKUD.2.
5 G1 ABC_DEF.1:400-600():Canis_lupus 400.0 600.0 SEQUENCE3
6 G1 IJK_LMN.1:20-200():Bos_taurus 20.0 200.0 SDDD1
7 G1 IJK_LMN.1:20-200():Bos_taurus 20.0 200.0 SKUD.2.
8 G1 IJK_LMN.1:20-200():Bos_taurus 20.0 200.0 SEQUENCE3
9 G2 OP_D:500-1000():Felis_catus 500.0 1000.0 JDJDJ99
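The printed result differs from the requested output in two small ways: Names2 comes last, and COLs/COLe are shown as floats because the NA rows made those columns float before filtering. The following is a minimal, self-contained sketch of the same merge-based approach that rebuilds the sample data from the question, reorders the columns, and casts the coordinates back to integers. The variable names (has_parens, with_parens, without_parens, result) are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question; NA coordinates become missing values.
df = pd.DataFrame(
    {
        "Groups": ["G1", "G1", "G1", "G1", "G1", "G1", "G2", "G2"],
        "Names": [
            "ABC_DEF.1:2-300():Canis_lupus",
            "SDDD1",
            "SKUD.2.",
            "SEQUENCE3",
            "ABC_DEF.1:400-600():Canis_lupus",
            "IJK_LMN.1:20-200():Bos_taurus",
            "OP_D:500-1000():Felis_catus",
            "JDJDJ99",
        ],
        "COLs": [2, None, None, None, 400, 20, 500, None],
        "COLe": [300, None, None, None, 600, 200, 1000, None],
    }
)

# True for rows whose Names contain the literal substring "()".
has_parens = df["Names"].str.contains("()", regex=False)

with_parens = df[has_parens]
# Rename before merging so no _x/_y suffixes are generated.
without_parens = df.loc[~has_parens, ["Groups", "Names"]].rename(
    columns={"Names": "Names2"}
)

# Inner merge on Groups pairs every "()" row with every plain row of its group.
result = with_parens.merge(without_parens, on="Groups")

# Match the requested column order and restore integer coordinates
# (safe here because the remaining rows have no missing COLs/COLe).
result = result[["Groups", "Names", "Names2", "COLs", "COLe"]].astype(
    {"COLs": "int64", "COLe": "int64"}
)

print(result)
```

Note that an inner merge drops any group that has no plain (non-"()") name. If such groups should be kept with a missing Names2, passing how="left" to merge is one option.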