How should I remove special characters (except spaces) from a DataFrame?

Question

I am reading one specific sheet of an Excel file; it looks very much like this. I would like to remove all the numbers, underscores, and hyphens in the 'Org' column. The output under 'Org' should be ddc systems and so on.

  Name      Org
0   abc   14_ddc_-_systems
1   sdc   14_ddc_-_systems
2   csc   14_ddd_-_systems
3   rdc   23_kbf_org
4   rfc   23_kbf_org

I tried the code below to remove the numbers, but it's not working:

s = sheet1['Org'].head()
s = s.replace('\d+\s', '')

Any help will be appreciated!

Answer 1

You can use str.replace with a regex.

Ex:

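import pandas as pd

df = pd.DataFrame({"Org": ["14_ddc_-_systems", "14_ddc_-_systems", "23_kbf_org"]})
# Replace each run of characters that are neither letters nor spaces with one space,
# then trim the leading/trailing space. regex=True is needed on pandas 2.0+,
# where str.replace treats the pattern as a literal string by default.
df["New"] = df["Org"].str.replace(r"[^a-zA-Z ]+", " ", regex=True).str.strip()
print(df)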

Output:

                Org          New
0  14_ddc_-_systems  ddc systems
1  14_ddc_-_systems  ddc systems
2        23_kbf_org      kbf org
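
As for why the attempt in the question has no effect: Series.replace (without .str) compares whole cell values unless regex=True is passed, and the pattern '\d+\s' also requires whitespace after the digits, which never occurs in this data. The following is a minimal sketch of the same cleanup written with Series.replace, assuming the sheet has already been loaded into a DataFrame; the name sheet1 and the sample rows are taken from the question, and the frame is built by hand here rather than read from Excel.

```python
import pandas as pd

# Stand-in for the sheet read from Excel (sample rows copied from the question).
sheet1 = pd.DataFrame({
    "Name": ["abc", "sdc", "csc", "rdc", "rfc"],
    "Org": ["14_ddc_-_systems", "14_ddc_-_systems", "14_ddd_-_systems",
            "23_kbf_org", "23_kbf_org"],
})

# With regex=True, Series.replace substitutes inside each string instead of
# matching whole cell values. Each run of non-letters becomes one space,
# and str.strip() removes the space left at the start.
cleaned = sheet1["Org"].replace(r"[^a-zA-Z]+", " ", regex=True).str.strip()
print(cleaned.tolist())
```

Assigning the result back with sheet1["Org"] = cleaned would overwrite the column in place rather than adding a new one.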

Answer 1: 2 votes, accepted, 2018-08-14 04:57:58
