
Python replace string with empty if length not equal to x

I have the following DataFrame:

import pandas as pd
df=pd.DataFrame({'ssn':[12345,54321,111,47895,222311],'Name':['john','mike','adam','doug','liz']})

The DataFrame contains an 'ssn' column that is supposed to hold only 5-digit values. I want to replace every value that has fewer or more than 5 digits with a blank.

The desired output is as below:

   Name   ssn
0  john   12345
1  mike   54321
2  adam   
3  doug   47895
4  liz    

I referred to the following SO post: "replace string if length is less than x". However, using the same solution with the following commands gives me an error:

mask = df['ssn'].str.len() == 5
df['ssn'] = df['ssn'].mask(mask, df['ssn'].str.replace(df['ssn'], ''))
Traceback (most recent call last): 
TypeError: 'Series' objects are mutable, thus they cannot be hashed

I would appreciate any suggestions.

You can also do this with apply:

df['ssn'] = df['ssn'].apply(lambda a: a if len(str(a))==5 else '')
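As a self-contained sketch of the apply approach on the sample data from the question (the helper name `keep_if_five` is only illustrative and does not come from the answer itself):

```python
import pandas as pd

df = pd.DataFrame({'ssn': [12345, 54321, 111, 47895, 222311],
                   'Name': ['john', 'mike', 'adam', 'doug', 'liz']})

# Hypothetical helper: keep the value if its text form has exactly
# 5 characters, otherwise return an empty string.
def keep_if_five(value):
    return value if len(str(value)) == 5 else ''

df['ssn'] = df['ssn'].apply(keep_if_five)
print(df['ssn'].tolist())
```

Note that the values that pass the check stay integers while the rejected ones become empty strings, so the column ends up with object dtype.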

Your ssn column contains numbers, not strings, which is why it is not working. Try the following:

mask = df['ssn'].astype(str).str.len() != 5
df.loc[mask, 'ssn'] = ''

In [1]: print(df)
   Name    ssn
0  john  12345
1  mike  54321
2  adam       
3  doug  47895
4   liz      
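
If the required length should be a parameter (the x in the title), the same idea can be sketched with `Series.where`, converting the column to strings first so that the kept values and the blanks share one type. The function name `blank_unless_length` and its parameters are assumptions made for illustration, not part of the answer above.

```python
import pandas as pd

def blank_unless_length(series, length, blank=''):
    """Return the series as strings, blanking entries whose length differs."""
    as_text = series.astype(str)
    # where() keeps entries for which the condition is True
    # and substitutes `blank` everywhere else.
    return as_text.where(as_text.str.len() == length, blank)

df = pd.DataFrame({'ssn': [12345, 54321, 111, 47895, 222311],
                   'Name': ['john', 'mike', 'adam', 'doug', 'liz']})
df['ssn'] = blank_unless_length(df['ssn'], 5)
print(df)
```

Unlike assigning '' into an integer column, this variant leaves every entry as a string, which may or may not be what you want downstream.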
