How to identify and remove all the special characters from the data frame

col1
Ntwk Lane 0 cannot on high operational\n
TX_PWR ALARM. TX_PWR also fluctuates over time (found Tx power dropped to -2dBm also raises TX_PWR_LO_ALRM
module report ASIC_PLL_REF_CLK_FREQ_ERR(20008=0x800000) and HOST_REF_PLL_2(20014=0x2)

I want to remove all the special characters from the column. How can I do that? I need only the alphabetic characters; everything else should be removed.

You can use a regular expression:

import re
# Drop every character that is not an ASCII letter (digits, spaces and punctuation included).
df['col2'] = df['col1'].apply(lambda x: re.sub('[^a-zA-Z]', '', x))

As suggested by @9769953,

df['col2'] = df['col1'].str.replace('[^a-zA-Z]', '', regex=True)

is also a much cleaner approach. Same performance, but cleaner.
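To see what the pattern actually does, here is a minimal sketch on a few shortened rows taken from the question. The column names col3 and has_special are hypothetical additions for illustration, and the whitespace-preserving variant is an assumption about what might be wanted, not something the question asked for. Note that [^a-zA-Z] also strips spaces, digits and underscores, so words run together.

```python
import pandas as pd

# Shortened sample rows based on the question's col1.
df = pd.DataFrame({
    "col1": [
        "Ntwk Lane 0 cannot on high operational\n",
        "TX_PWR ALARM.",
        "HOST_REF_PLL_2(20014=0x2)",
    ]
})

# Identify: flag rows that contain at least one non-letter character.
df["has_special"] = df["col1"].str.contains(r"[^a-zA-Z]", regex=True)

# Remove: keep ASCII letters only (spaces, digits, punctuation all go).
df["col2"] = df["col1"].str.replace(r"[^a-zA-Z]", "", regex=True)

# Assumed variant: keep letters and whitespace, then collapse
# runs of whitespace into single spaces and trim the ends.
df["col3"] = (
    df["col1"]
    .str.replace(r"[^a-zA-Z\s]", "", regex=True)
    .str.replace(r"\s+", " ", regex=True)
    .str.strip()
)

print(df[["col2", "col3"]])
```

With the letters-only pattern the first row becomes NtwkLanecannotonhighoperational, while the whitespace-preserving variant gives Ntwk Lane cannot on high operational. Which one is appropriate depends on whether word boundaries matter downstream.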
