How do I remove special characters from just one column of a data frame?

Question

I am trying to clean my data frame, but I want to remove special characters from just one column (see the table below).

df1

| A       | B  | C  |
|---------|----|----|
| Ags(1)  | 5  | 4  |
| Cdmx(2) | 6  | 6  |
| Leon(4) | 90 | 45 |

What I want to remove is just the numbers and special characters in column A.

This is what I tried:

df = re.sub('[^A-Za-z0-9]+', '', df1["A"])
>> expected string or bytes-like object

Answer 1

I would try using a lambda with the apply function on the column you want to clean.

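import re

# Apply re.sub to each value in column A; [^A-Za-z]+ matches runs of
# non-letters, so both the digits and the parentheses are removed.
df1["A"] = df1["A"].apply(lambda x: re.sub('[^A-Za-z]+', '', x))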
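
The error in the question arises because `re.sub` expects a single string, while `df1["A"]` is a whole Series; `apply` works around this by calling `re.sub` once per value. pandas also offers `Series.str.replace` with `regex=True`, which does the same substitution without an explicit lambda. The following is a minimal sketch, assuming the sample data from the question (rebuilt here by hand) and assuming that only letters should remain in column A:

```python
import pandas as pd

# Sample frame rebuilt by hand from the question (an assumption for illustration).
df1 = pd.DataFrame({
    "A": ["Ags(1)", "Cdmx(2)", "Leon(4)"],
    "B": [5, 6, 90],
    "C": [4, 6, 45],
})

# Drop every run of characters that is not an ASCII letter, in column A only.
# Columns B and C are left untouched.
df1["A"] = df1["A"].str.replace(r"[^A-Za-z]+", "", regex=True)

print(df1)
```

Note that a character class that also lists `0-9` would keep the digits, so `Ags(1)` would become `Ags1` rather than `Ags`.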

Answer 2

You can also use .str.extract() to keep the part you want (as opposed to replace, which eliminates the part you don't want):

from io import StringIO
import pandas as pd

# Columns are separated by two or more spaces
data = '''A         B    C
Ags(1)    5     4
Cdmx(2)   6     6
Leon(4)   90    45
'''
# Raw string avoids an invalid-escape warning for \s
df = pd.read_csv(StringIO(data), sep=r'\s\s+', engine='python')

df['A'] = df['A'].str.extract(r'(\w+)', expand=False)
print(df)

      A   B   C
0   Ags   5   4
1  Cdmx   6   6
2  Leon  90  45
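
One detail worth knowing about the pattern: `\w` matches letters, digits, and the underscore, so `(\w+)` stops at the first parenthesis but would keep any digit attached directly to the name. A minimal sketch, assuming hypothetical values such as `Leon4(4)` and `(7)` that are not in the original data, compares it with a letters-only capture group:

```python
import pandas as pd

# Hypothetical values chosen to show the difference between the two patterns.
s = pd.Series(["Ags(1)", "Cdmx(2)", "Leon4(4)", "(7)"])

# First run of word characters: letters, digits, or underscore.
word_chars = s.str.extract(r"(\w+)", expand=False)

# First run of ASCII letters only; rows with no letters become NaN.
letters_only = s.str.extract(r"([A-Za-z]+)", expand=False)

print(pd.DataFrame({"raw": s, "word_chars": word_chars, "letters_only": letters_only}))
```

Because `.str.extract()` returns NaN when nothing matches, rows without any letters are easy to spot afterwards with `isna()`.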

Answer 1
1 2020-08-26 14:27:52

Answer 2
1 2020-08-26 14:41:00
