Replace all special characters in pandas DataFrame column names at once

Question

I have multiple columns whose names follow different formats. For example:

df.columns = ['name_column 1 (type1)', 'name-column_2-(type1)', ...]

I need to replace every character that is not a letter, digit, or underscore with an underscore. But where special characters are adjacent, as in '-(', I need just one underscore '_' for the whole run, not one per special character.

Desired output:

df.columns = ['name_column_1_type1', 'name_column_2_type1', ...]

I have tried:

for element in df.columns:
    re.sub('[^A-Za-z0-9]+', '_', element)
    print element

But nothing happens, just as with a few other attempts.

Thanks in advance.

Answer 1

Use str.replace + str.strip:

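# regex=True is required on pandas 2.0+, where str.replace treats the pattern literally by default
df.columns = df.columns.str.replace('[^A-Za-z0-9]+', '_', regex=True).str.strip('_')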

Sample:

df = pd.DataFrame(columns=["'name_column 1 (type1)", 'name-column_2-((type1)'])
print (df.columns.tolist())
["'name_column 1 (type1)", 'name-column_2-((type1)']

df.columns = df.columns.str.replace('[^A-Za-z0-9]+', '_', regex=True).str.strip('_')
print (df)
Empty DataFrame
Columns: [name_column_1_type1, name_column_2_type1]
Index: []

print (df.columns.tolist())
['name_column_1_type1', 'name_column_2_type1']
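To see why the `+` and the final strip both matter, here is a minimal sketch on a plain string using only the standard `re` module. The variable names are just for illustration, and the same pattern is assumed as above.

```python
import re

name = 'name-column_2-(type1)'

# Without '+', each special character gets its own underscore, so '-(' becomes '__'.
per_char = re.sub('[^A-Za-z0-9]', '_', name)

# With '+', a whole run of special characters collapses into one underscore.
per_run = re.sub('[^A-Za-z0-9]+', '_', name)

# The closing ')' still leaves a trailing underscore, which strip('_') removes.
cleaned = per_run.strip('_')

print(per_char)  # name_column_2__type1_
print(per_run)   # name_column_2_type1_
print(cleaned)   # name_column_2_type1
```

Note that the underscore itself is also outside `A-Za-z0-9`, so existing underscores are matched and re-emitted as part of a run. That is what keeps `_-` or `_ ` from producing doubles.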

Answer 2

Try:

import re
df.columns = [re.sub('_+', '_', re.sub('[^A-Za-z0-9]', '_', i)).strip('_') for i in df.columns]
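A caution on character ranges: `A-z` is not shorthand for `A-Za-z`. In ASCII, a handful of punctuation characters sit between `Z` and `a`, so a class written as `[^A-z0-9]` leaves them untouched. A small sketch to check this (the sample string is made up for illustration):

```python
import re

# ASCII characters that sit between 'Z' and 'a', and so fall inside the range A-z.
between = ''.join(chr(code) for code in range(ord('Z') + 1, ord('a')))
print(between)

sample = 'a[b]c^d'
loose = re.sub('[^A-z0-9]', '_', sample)      # brackets and caret survive
strict = re.sub('[^A-Za-z0-9]', '_', sample)  # every special character is replaced
print(loose)
print(strict)
```

Spelling out `A-Za-z` avoids the surprise if column names can contain brackets, backslashes, carets, or backticks.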

Answer 3

Nothing happens because the result of re.sub is not assigned to anything and is therefore lost. You could use a list comprehension and assign the result back to df.columns:

import re
df.columns = [re.sub('[^A-Za-z0-9]+', '_', element) for element in df.columns]
print(df.columns)

The pattern is still not quite right: a name that ends in a special character, such as a closing parenthesis, keeps a trailing underscore. But this should get you started.
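Here is a rough sketch of the difference on a plain list, using the column names from the question. Python strings are immutable, so `re.sub` returns a new string rather than changing `element`. The variable names below are only illustrative.

```python
import re

columns = ['name_column 1 (type1)', 'name-column_2-(type1)']

# The loop from the question: each result is computed and then thrown away.
for element in columns:
    re.sub('[^A-Za-z0-9]+', '_', element)
print(columns)  # unchanged

# Collecting the results keeps them; note the trailing underscores.
renamed = [re.sub('[^A-Za-z0-9]+', '_', element) for element in columns]
print(renamed)

# One possible finishing step: strip leading and trailing underscores.
stripped = [name.strip('_') for name in renamed]
print(stripped)
```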

3 answers

Answer 1: score 3, accepted, 2018-02-01 09:51:58
Answer 2: score 1, 2018-02-01 09:48:54
Answer 3: score 1, 2018-02-01 09:52:34
