python - Replace first five characters in a column with asterisks
I have a column called SSN in a CSV file with values like this:
289-31-9165
I need to loop through the values in this column and replace the first five digits so each value looks like this:
***-**-9165
Here's the code I have so far:
emp_file = "Resources/employee_data1.csv"
emp_pd = pd.read_csv(emp_file)
new_ssn = emp_pd["SSN"].str.replace([:5], "*")
emp_pd["SSN"] = new_ssn
How do I loop through the values and replace only the first five digits with asterisks, keeping the hyphens as they are?
Similar to Mr. Me's answer: this drops the first six characters of each value and puts the masked prefix in their place.
emp_pd["SSN"] = emp_pd["SSN"].apply(lambda x: "***-**" + x[6:])
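The same slicing idea can also be written without apply, using the .str accessor. This is only a minimal sketch: it assumes every value follows the fixed NNN-NN-NNNN layout, and the small frame below is made-up sample data standing in for the CSV.

```python
import pandas as pd

# Hypothetical sample data standing in for the SSN column of the CSV
emp_pd = pd.DataFrame({"SSN": ["289-31-9165", "111-22-3333"]})

# .str[6:] slices each string (not the rows), keeping "-9165" and so on
emp_pd["SSN"] = "***-**" + emp_pd["SSN"].str[6:]
print(emp_pd)
```

Because the slice is positional, a value with a different layout would be masked incorrectly; the regex-based answers are safer if the format can vary.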
You can achieve this simply with the replace() method (borrowing from @AkshayNevrekar):
>>> df
ssn
0 111-22-3333
1 121-22-1123
2 345-87-3425
>>> df.replace(r'^\d{3}-\d{2}', "***-**", regex=True)
ssn
0 ***-**-3333
1 ***-**-1123
2 ***-**-3425
OR
>>> df.ssn.replace(r'^\d{3}-\d{2}', "***-**", regex=True)
0 ***-**-3333
1 ***-**-1123
2 ***-**-3425
Name: ssn, dtype: object
OR:
df['ssn'] = df['ssn'].str.replace(r'^\d{3}-\d{2}', "***-**", regex=True)
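One difference between these variants is scope: DataFrame.replace with regex=True is applied to every column, while the .str.replace form only touches the column you name. The sketch below illustrates this with a hypothetical second column (other_id) that is not part of the original question.

```python
import pandas as pd

# Hypothetical frame: "other_id" is invented here to show the scope difference
df = pd.DataFrame({
    "ssn": ["111-22-3333", "121-22-1123"],
    "other_id": ["999-88-7777", "555-44-6666"],
})

pattern = r"^\d{3}-\d{2}"

# DataFrame.replace with regex=True rewrites every matching string cell
masked_all = df.replace(pattern, "***-**", regex=True)

# Restricting the replacement to one column leaves the others untouched
masked_one = df.copy()
masked_one["ssn"] = masked_one["ssn"].str.replace(pattern, "***-**", regex=True)

print(masked_all)
print(masked_one)
```

If the CSV has other columns that could match the pattern, the single-column form is probably the safer choice.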
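Put your asterisks in front, then grab the last 4 digits. Note the .str accessor: without it, [-4:] slices the last four rows of the column rather than the last four characters of each value.
new_ssn = '***-**-' + emp_pd["SSN"].str[-4:]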
You can use regex:
import re
import pandas as pd

df = pd.DataFrame({'ssn':['111-22-3333','121-22-1123','345-87-3425']})
def func(x):
    # Mask the leading "NNN-NN" part of a single SSN string
    return re.sub(r'\d{3}-\d{2}','***-**', x)
df['ssn'] = df['ssn'].apply(func)
print(df)
Output:
ssn
0 ***-**-3333
1 ***-**-1123
2 ***-**-3425
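If the goal is literally to mask digits and leave the hyphens alone, a lookahead can do that without hard-coding the ***-** prefix. This is a sketch, not a definitive solution: it assumes the final group is the only group of digits with no hyphen after it, and it reuses the sample values from above.

```python
import pandas as pd

df = pd.DataFrame({'ssn': ['111-22-3333', '121-22-1123', '345-87-3425']})

# Replace each digit that still has a hyphen somewhere after it;
# digits in the final group have no hyphen after them, so they are kept.
df['ssn'] = df['ssn'].str.replace(r'\d(?=.*-)', '*', regex=True)
print(df)
```

The hyphens are never matched by \d, so they pass through unchanged.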