Python: replace a string in a specific DataFrame column
I would like to replace any string in a dataframe column with the string 'Chaudière' whenever it starts with the string "chaud". I would also like the first and last name after each "Chaudière" to disappear, to anonymize NameDevice.
My data frame is called df1 and the column name is NameDevice.
I have tried this:
df1.loc[df['NameDevice'].str.startswith('chaud'), 'NameDevice'] = df1['NameDevice'].str.replace("chaud","Chaudière")
I check with df1.head(), and it returns:
   IdDevice  IdDeviceType SerialDevice                      NameDevice  IdLocation  UuidAttributeDevice  IdBox  IsUpdateDevice
0       119            48        00001          Chaudière Maud Ferrand           4                  NaN      4               0
1       120            48        00002         Chaudière Yvan Martinod           6                  NaN      6               0
2       121            48        00006  Chaudière Anne-Sophie Premereur           7                  NaN      7               0
3       122            48        00005          Chaudière Denis Fauser           8                  NaN      8               0
4       123            48        00004       Chaudière Elariak Djilali           3                  NaN      3               0
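A likely reason the attempt above leaves the column unchanged is that both str.startswith and str.replace are case-sensitive, while the values shown begin with a capital "C". The sketch below illustrates this on two hypothetical names modelled on the output above (the Series `names` is an assumption, not the asker's data):

```python
import pandas as pd

# Hypothetical values modelled on the NameDevice column shown above.
names = pd.Series(["Chaudière Maud Ferrand", "Chaudière Yvan Martinod"])

# Case-sensitive test: "Chaudière" does not start with lowercase "chaud".
case_sensitive = names.str.startswith("chaud")

# Lower-casing first makes the test case-insensitive.
case_insensitive = names.str.lower().str.startswith("chaud")

print(case_sensitive.tolist())
print(case_insensitive.tolist())
```

With an all-False mask, the .loc assignment selects no rows, so nothing is written back.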
You can do the matching by calling str.lower first and then str.startswith, and then just split on whitespace and take the first entry to anonymise the data:
In [14]:
df.loc[df['NameDevice'].str.lower().str.startswith('chaud'), 'NameDevice'] = df['NameDevice'].str.split().str[0]
df
Out[14]:
   IdDevice  IdDeviceType  SerialDevice NameDevice  IdLocation  \
0       119            48             1  Chaudière           4
1       120            48             2  Chaudière           6
2       121            48             6  Chaudière           7
3       122            48             5  Chaudière           8
4       123            48             4  Chaudière           3

   UuidAttributeDevice  IdBox  IsUpdateDevice
0                  NaN      4               0
1                  NaN      6               0
2                  NaN      7               0
3                  NaN      8               0
4                  NaN      3               0
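Note that splitting keeps whatever the first word happens to be, including its original spelling and case. If every matching device should end up with exactly the same label, assigning the constant directly may be simpler. A minimal sketch, assuming a small hypothetical frame (the sample rows, including the lowercase and non-matching ones, are invented for illustration):

```python
import pandas as pd

# Hypothetical sample rows; only NameDevice matters for this sketch.
df = pd.DataFrame({
    "IdDevice": [119, 120, 121],
    "NameDevice": [
        "Chaudière Maud Ferrand",
        "chaudiere Yvan Martinod",
        "Thermostat Salon",
    ],
})

# Case-insensitive "starts with chaud"; na=False treats missing names as non-matches.
mask = df["NameDevice"].str.lower().str.startswith("chaud", na=False)

# Overwrite the matching rows with one fixed label, dropping the personal names.
df.loc[mask, "NameDevice"] = "Chaudière"

print(df["NameDevice"].tolist())
```

Rows that do not start with "chaud" are left untouched because the mask is False for them.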
Another method is to use str.extract so that it only keeps the word starting with Chaud...:
In [27]:
df.loc[df['NameDevice'].str.lower().str.startswith('chaud'), 'NameDevice'] = df['NameDevice'].str.extract(r'(Chaud\w+)', expand=False)
df
Out[27]:
   IdDevice  IdDeviceType  SerialDevice NameDevice  IdLocation  \
0       119            48             1  Chaudière           4
1       120            48             2  Chaudière           6
2       121            48             6  Chaudière           7
3       122            48             5  Chaudière           8
4       123            48             4  Chaudière           3

   UuidAttributeDevice  IdBox  IsUpdateDevice
0                  NaN      4               0
1                  NaN      6               0
2                  NaN      7               0
3                  NaN      8               0
4                  NaN      3               0
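One caveat with the extract approach: the mask is case-insensitive, but the pattern only matches a capital "Chaud", so a row starting with lowercase "chaud" would be set to NaN. A possible alternative is a single case-insensitive str.replace that rewrites the whole value. This is only a sketch; the Series `names` and its values are hypothetical:

```python
import pandas as pd

# Hypothetical values; the last one should be left alone.
names = pd.Series([
    "Chaudière Maud Ferrand",
    "chaudiere Yvan Martinod",
    "Thermostat Salon",
])

# ^chaud anchors the match at the start of the value; .*$ consumes the rest,
# so the whole string (names included) is replaced by the fixed label.
anonymised = names.str.replace(r"^chaud.*$", "Chaudière", case=False, regex=True)

print(anonymised.tolist())
```

Because the pattern is anchored with ^, values that merely contain "chaud" somewhere in the middle are not affected.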