python replace string in a specific dataframe column
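I would like to replace every string in a dataframe column that starts with the string "chaud" with the string "Chaudière". I want to strip the first and last name that follow each "Chaudière" so that NameDevice is anonymised.
My dataframe is called df1 and the column is named NameDevice.
I have tried this: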
df1.loc[df1['NameDevice'].str.startswith('chaud'), 'NameDevice'] = df1['NameDevice'].str.replace("chaud", "Chaudière")
When I check with df1.head(), it returns:
   IdDevice  IdDeviceType  SerialDevice                       NameDevice  IdLocation  UuidAttributeDevice  IdBox  IsUpdateDevice
0       119            48         00001           Chaudière Maud Ferrand           4                  NaN      4               0
1       120            48         00002          Chaudière Yvan Martinod           6                  NaN      6               0
2       121            48         00006  Chaudière Anne-Sophie Premereur           7                  NaN      7               0
3       122            48         00005           Chaudière Denis Fauser           8                  NaN      8               0
4       123            48         00004        Chaudière Elariak Djilali           3                  NaN      3               0
You can make the match case-insensitive by calling str.lower first and then str.startswith, and then simply split on whitespace and take the first entry to anonymise the data:
In [14]:
df.loc[df['NameDevice'].str.lower().str.startswith('chaud'), 'NameDevice'] = df['NameDevice'].str.split().str[0]
df
Out[14]:
   IdDevice  IdDeviceType  SerialDevice  NameDevice  IdLocation  \
0       119            48             1   Chaudière           4
1       120            48             2   Chaudière           6
2       121            48             6   Chaudière           7
3       122            48             5   Chaudière           8
4       123            48             4   Chaudière           3

   UuidAttributeDevice  IdBox  IsUpdateDevice
0                  NaN      4               0
1                  NaN      6               0
2                  NaN      7               0
3                  NaN      8               0
4                  NaN      3               0
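For reference, here is a minimal, self-contained sketch of the lower/startswith/split approach. It assumes a small hand-built frame: the first two rows come from the question, while the "Thermostat Salon" row is a hypothetical addition to show that non-matching rows are left alone. It also shows why the original attempt changed nothing: str.startswith('chaud') is case-sensitive, so "Chaudière ..." never matches.

```python
import pandas as pd

# Sample data: two rows from the question plus one invented, non-matching row.
df1 = pd.DataFrame({
    "IdDevice": [119, 120, 999],
    "NameDevice": [
        "Chaudière Maud Ferrand",
        "Chaudière Yvan Martinod",
        "Thermostat Salon",  # hypothetical row, not in the original data
    ],
})

# Case-sensitive test: capital "C" does not match 'chaud', so no row is selected.
case_sensitive = df1["NameDevice"].str.startswith("chaud")

# Lowercasing first makes the test case-insensitive.
mask = df1["NameDevice"].str.lower().str.startswith("chaud")

# For the matching rows only, keep the first whitespace-separated token.
df1.loc[mask, "NameDevice"] = df1["NameDevice"].str.split().str[0]

print(df1["NameDevice"].tolist())
```

With this sample data the column ends up as ['Chaudière', 'Chaudière', 'Thermostat Salon'].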
Another way is to use str.extract so that only the Chaud... part is kept:
In [27]:
df.loc[df['NameDevice'].str.lower().str.startswith('chaud'), 'NameDevice'] = df['NameDevice'].str.extract(r'(Chaud\w+)', expand=False)
df
Out[27]:
   IdDevice  IdDeviceType  SerialDevice  NameDevice  IdLocation  \
0       119            48             1   Chaudière           4
1       120            48             2   Chaudière           6
2       121            48             6   Chaudière           7
3       122            48             5   Chaudière           8
4       123            48             4   Chaudière           3

   UuidAttributeDevice  IdBox  IsUpdateDevice
0                  NaN      4               0
1                  NaN      6               0
2                  NaN      7               0
3                  NaN      8               0
4                  NaN      3               0
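Both approaches above keep whatever spelling the first word already had. If the column might hold variants such as "chaudiere" or "CHAUD", one possible alternative (not part of the answer above, only a sketch) is a single case-insensitive str.replace that overwrites the whole value with the fixed label "Chaudière". The sample rows below are hypothetical.

```python
import pandas as pd

# Hypothetical data with mixed spellings; only the first value appears in the question.
df1 = pd.DataFrame({
    "NameDevice": [
        "Chaudière Maud Ferrand",
        "chaudiere Yvan Martinod",
        "CHAUD Denis Fauser",
        "Thermostat Salon",
    ],
})

# ^chaud.*$ matches the entire value when it starts with "chaud" (any case);
# values that do not match are returned unchanged by str.replace.
df1["NameDevice"] = df1["NameDevice"].str.replace(
    r"^chaud.*$", "Chaudière", case=False, regex=True
)

print(df1["NameDevice"].tolist())
```

This avoids the boolean mask entirely, because the anchored pattern only touches values that begin with "chaud".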