
Python: replace strings in a specific DataFrame column

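I would like to replace any string in a dataframe column with the string "Chaudière" whenever it starts with the string "chaud". I also want to drop the first and last name that follow each "Chaudiere", so that NameDevice is anonymized.

My dataframe is called df1 and the column name is NameDevice.

I have tried this: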

   df1.loc[df1['NameDevice'].str.startswith('chaud'), 'NameDevice'] = df1['NameDevice'].str.replace("chaud", "Chaudière")

When I check with df1.head(), it returns:

   IdDevice  IdDeviceType  SerialDevice                       NameDevice  IdLocation  UuidAttributeDevice  IdBox  IsUpdateDevice
0       119            48         00001           Chaudière Maud Ferrand           4                  NaN      4               0
1       120            48         00002          Chaudière Yvan Martinod           6                  NaN      6               0
2       121            48         00006  Chaudière Anne-Sophie Premereur           7                  NaN      7               0
3       122            48         00005           Chaudière Denis Fauser           8                  NaN      8               0
4       123            48         00004        Chaudière Elariak Djilali           3                  NaN      3               0

You can make the match case-insensitive by calling str.lower first and then str.startswith. After that, split on whitespace and take the first entry to anonymize the data:

In [14]:
df.loc[df['NameDevice'].str.lower().str.startswith('chaud'), 'NameDevice'] = df['NameDevice'].str.split().str[0]
df

Out[14]:
   IdDevice  IdDeviceType  SerialDevice NameDevice  IdLocation  \
0       119            48             1  Chaudière           4   
1       120            48             2  Chaudière           6   
2       121            48             6  Chaudière           7   
3       122            48             5  Chaudière           8   
4       123            48             4  Chaudière           3   

   UuidAttributeDevice  IdBox  IsUpdateDevice  
0                  NaN      4               0  
1                  NaN      6               0  
2                  NaN      7               0  
3                  NaN      8               0  
4                  NaN      3               0  
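For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same mask-and-split idea. The three sample rows are made up for illustration (only the first mirrors the question), and it assumes NameDevice contains no missing values. It also shows why the attempt in the question changes nothing: str.startswith is case-sensitive, so 'chaud' does not match 'Chaudière ...'.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({
    "IdDevice": [119, 120, 121],
    "NameDevice": [
        "Chaudière Maud Ferrand",
        "chaudiere Yvan Martinod",
        "Thermostat Salon",
    ],
})

# Case-sensitive check: misses the capitalised 'Chaudière ...' row.
case_sensitive_mask = df["NameDevice"].str.startswith("chaud")

# Lower-casing first makes the prefix test case-insensitive.
mask = df["NameDevice"].str.lower().str.startswith("chaud")

# On matching rows, keep only the first whitespace-separated token.
# The right-hand Series is aligned on the index, so other rows are untouched.
df.loc[mask, "NameDevice"] = df["NameDevice"].str.split().str[0]

print(df)
```

Note that this keeps whatever first word each row already had, so a row spelled "chaudiere ..." ends up as "chaudiere" rather than "Chaudière".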

Another way is to use str.extract so that only the Chaud... word is kept:

In [27]:
df.loc[df['NameDevice'].str.lower().str.startswith('chaud'), 'NameDevice'] = df['NameDevice'].str.extract(r'(Chaud\w+ )', expand=False)
df

Out[27]:
   IdDevice  IdDeviceType  SerialDevice  NameDevice  IdLocation  \
0       119            48             1  Chaudière            4   
1       120            48             2  Chaudière            6   
2       121            48             6  Chaudière            7   
3       122            48             5  Chaudière            8   
4       123            48             4  Chaudière            3   

   UuidAttributeDevice  IdBox  IsUpdateDevice  
0                  NaN      4               0  
1                  NaN      6               0  
2                  NaN      7               0  
3                  NaN      8               0  
4                  NaN      3               0  
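Two details of the str.extract version are worth checking on your own data: the pattern `(Chaud\w+ )` captures the trailing space, and it is case-sensitive, so a lower-case "chaudiere ..." row would receive NaN. If the goal is simply to put the same label on every matching row, assigning the constant directly may be simpler. The sketch below is one possible variant, not the only way to do it; the sample rows are made up, and `na=False` is used so that missing names would count as non-matches.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({
    "NameDevice": [
        "Chaudière Maud Ferrand",
        "chaudiere Yvan Martinod",
        "Thermostat Salon",
    ],
})

# The capture group includes the space, so the extracted value ends with one.
# Rows that do not match the case-sensitive pattern come back as NaN.
with_space = df["NameDevice"].str.extract(r"(Chaud\w+ )", expand=False)

# str.match anchors at the start of the string; case=False ignores case.
mask = df["NameDevice"].str.match("chaud", case=False, na=False)

# Overwrite every matching row with one fixed, anonymised label.
df.loc[mask, "NameDevice"] = "Chaudière"

print(df)
```

With this variant every matching row ends up with exactly the same value, regardless of how the original prefix was spelled or capitalised.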

