Renaming column names in Pandas

Question

I want to change the column labels of a Pandas DataFrame from

['$a', '$b', '$c', '$d', '$e']

to

['a', 'b', 'c', 'd', 'e']

Answer 1

Rename specific columns

Use the df.rename() function and refer to the columns to be renamed. Not all the columns have to be renamed:

df = df.rename(columns={'oldName1': 'newName1', 'oldName2': 'newName2'})
# Or rename the existing DataFrame (rather than creating a copy) 
df.rename(columns={'oldName1': 'newName1', 'oldName2': 'newName2'}, inplace=True)

Minimal code example

df = pd.DataFrame('x', index=range(3), columns=list('abcde'))
df

   a  b  c  d  e
0  x  x  x  x  x
1  x  x  x  x  x
2  x  x  x  x  x

The following methods all work and produce the same output:

df2 = df.rename({'a': 'X', 'b': 'Y'}, axis=1)  # new method
df2 = df.rename({'a': 'X', 'b': 'Y'}, axis='columns')
df2 = df.rename(columns={'a': 'X', 'b': 'Y'})  # old method  

df2

   X  Y  c  d  e
0  x  x  x  x  x
1  x  x  x  x  x
2  x  x  x  x  x

Remember to assign the result back, as the modification is not in place. Alternatively, specify inplace=True:

df.rename({'a': 'X', 'b': 'Y'}, axis=1, inplace=True)
df

   X  Y  c  d  e
0  x  x  x  x  x
1  x  x  x  x  x
2  x  x  x  x  x

From v0.25, you can also specify errors='raise' to raise an error if an invalid column to rename is specified. See the v0.25 rename() docs.
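To make the difference concrete, here is a minimal sketch of how rename() treats a label that does not exist, with and without errors='raise'. The label 'not_a_column' and the variable names are made up for illustration, and the example assumes pandas 0.25 or later.

```python
import pandas as pd

df = pd.DataFrame('x', index=range(3), columns=list('abcde'))

# Default behaviour: a label that is not in the frame is silently ignored.
unchanged = df.rename(columns={'not_a_column': 'X'})

# With errors='raise', the same mapping raises a KeyError instead.
try:
    df.rename(columns={'not_a_column': 'X'}, errors='raise')
    raised = False
except KeyError:
    raised = True
```

Passing errors='raise' is a cheap way to catch typos in the old column names.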

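Reassign column headers

Use df.set_axis() with axis=1. On current pandas it returns a copy; older releases (before 2.0) also accepted inplace=False to request a copy explicitly, but the inplace argument was removed in pandas 2.0.

df2 = df.set_axis(['V', 'W', 'X', 'Y', 'Z'], axis=1)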
df2

   V  W  X  Y  Z
0  x  x  x  x  x
1  x  x  x  x  x
2  x  x  x  x  x

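This returns a copy. On pandas versions before 2.0 you could instead modify the DataFrame in place by setting inplace=True (this was the default behaviour for versions <=0.24; later releases changed the default and 2.0 removed the argument).

You can also assign headers directly: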

df.columns = ['V', 'W', 'X', 'Y', 'Z']
df

   V  W  X  Y  Z
0  x  x  x  x  x
1  x  x  x  x  x
2  x  x  x  x  x

Answer 2

Just assign it to the .columns attribute:

>>> df = pd.DataFrame({'$a':[1,2], '$b': [10,20]})
>>> df
   $a  $b
0   1  10
1   2  20

>>> df.columns = ['a', 'b']
>>> df
   a   b
0  1  10
1  2  20

Answer 3

The rename method can take a function, for example:

In [11]: df.columns
Out[11]: Index([u'$a', u'$b', u'$c', u'$d', u'$e'], dtype=object)

In [12]: df.rename(columns=lambda x: x[1:], inplace=True)

In [13]: df.columns
Out[13]: Index([u'a', u'b', u'c', u'd', u'e'], dtype=object)

Answer 4

As documented in Working with text data:

df.columns = df.columns.str.replace('$', '', regex=False)
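Because '$' is also a regular-expression anchor, it helps to be explicit about whether the pattern is literal. The sketch below shows two spellings that should give the same result; the variable names are illustrative, and it assumes a pandas version whose str.replace accepts the regex keyword (0.23 or later).

```python
import pandas as pd

df = pd.DataFrame({'$a': [1, 2], '$b': [10, 20]})

# Treat '$' as a plain character.
literal = df.columns.str.replace('$', '', regex=False)

# Or use a regular expression with the '$' escaped.
escaped = df.columns.str.replace(r'\$', '', regex=True)

df.columns = literal
```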

Answer 5

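Pandas 0.21+ answer

There have been some significant updates to column renaming in version 0.21.

The rename method has added the axis parameter, which may be set to columns or 1. This update makes this method match the rest of the pandas API. It still has the index and columns parameters, but you are no longer forced to use them.
The set_axis method with inplace set to False enables you to rename all the index or column labels with a list.

Examples for Pandas 0.21+

Construct a sample DataFrame: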

df = pd.DataFrame({'$a':[1,2], '$b': [3,4], 
                   '$c':[5,6], '$d':[7,8], 
                   '$e':[9,10]})

   $a  $b  $c  $d  $e
0   1   3   5   7   9
1   2   4   6   8  10

Use `rename` with `axis='columns'` or `axis=1`

df.rename({'$a':'a', '$b':'b', '$c':'c', '$d':'d', '$e':'e'}, axis='columns')

or

df.rename({'$a':'a', '$b':'b', '$c':'c', '$d':'d', '$e':'e'}, axis=1)

Both result in the following:

   a  b  c  d   e
0  1  3  5  7   9
1  2  4  6  8  10

It is still possible to use the old method signature:

df.rename(columns={'$a':'a', '$b':'b', '$c':'c', '$d':'d', '$e':'e'})

The rename function also accepts a function that will be applied to each column name.

df.rename(lambda x: x[1:], axis='columns')

or

df.rename(lambda x: x[1:], axis=1)

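Use `set_axis` with a list and `inplace=False`

You can supply a list to the set_axis method that is equal in length to the number of columns (or index). When this answer was written, inplace defaulted to True, with a switch to False announced for future versions. The argument was removed in pandas 2.0, where set_axis always returns a new object, so drop inplace=False from the calls below on 2.0 and later.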

df.set_axis(['a', 'b', 'c', 'd', 'e'], axis='columns', inplace=False)

or

df.set_axis(['a', 'b', 'c', 'd', 'e'], axis=1, inplace=False)

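Why not use `df.columns = ['a', 'b', 'c', 'd', 'e']`?

There is nothing wrong with assigning columns directly like this. It is a perfectly good solution.

The advantage of using set_axis is that it can be used as part of a method chain and that it returns a new copy of the DataFrame. Without it, you would have to store the intermediate steps of the chain in another variable before reassigning the columns.

# new for pandas 0.21+ (some_method1..3 are placeholders)
(df.some_method1()
   .some_method2()
   .set_axis(columns, axis=1)
   .some_method3())

# old way
df1 = (df.some_method1()
         .some_method2())
df1.columns = columns
df1.some_method3()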
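For a chain that actually runs, here is a small sketch with real methods in place of the placeholders. The choice of sort_values, reset_index and assign is arbitrary, and the example assumes a pandas release in which set_axis returns a new DataFrame by default (this includes 2.x).

```python
import pandas as pd

df = pd.DataFrame({'$a': [3, 1, 2], '$b': [30, 10, 20]})

# Rename in the middle of a chain without an intermediate variable.
result = (
    df.sort_values('$a')
      .reset_index(drop=True)
      .set_axis(['a', 'b'], axis=1)
      .assign(total=lambda d: d['a'] + d['b'])
)
```

The original df keeps its '$' labels, since every step in the chain returns a new object.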

Answer 6

Since you only want to remove the $ sign in all column names, you could just do:

df = df.rename(columns=lambda x: x.replace('$', ''))

or

df.rename(columns=lambda x: x.replace('$', ''), inplace=True)

Answer 7

Renaming columns in Pandas is an easy task.

df.rename(columns={'$a': 'a', '$b': 'b', '$c': 'c', '$d': 'd', '$e': 'e'}, inplace=True)

Answer 8

df.columns = ['a', 'b', 'c', 'd', 'e']

It will replace the existing names with the names you provide, in the order you provide them.

Answer 9

Use:

old_names = ['$a', '$b', '$c', '$d', '$e'] 
new_names = ['a', 'b', 'c', 'd', 'e']
df.rename(columns=dict(zip(old_names, new_names)), inplace=True)

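This way you can manually edit new_names as you wish. It works great when you need to rename only a few columns to correct misspellings, accents, remove special characters, etc.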

Answer 10

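Column names vs names of Series

I would like to explain a bit of what happens behind the scenes.

Dataframes are a set of Series.

Series in turn wrap a numpy.array.

A Series (unlike the underlying array) has a property .name.

This is the name of the series. Pandas seldom respects this attribute, but it lingers in places and can be used to hack some Pandas behaviours.

Naming the list of columns

A lot of answers here talk about the df.columns attribute being a list, when in fact it is an Index. Like a Series, it has a .name attribute.

This is what happens if you decide to fill in the name of the columns Index: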

df.columns = ['column_one', 'column_two']
df.columns.names = ['name of the list of columns']
df.index.names = ['name of the index']

name of the list of columns     column_one  column_two
name of the index
0                                    4           1
1                                    5           2
2                                    6           3

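Note that the name of the index always comes one row lower.

Artifacts that linger

The .name attribute lingers on sometimes. If you set df.columns = ['one', 'two'] then df.one.name will be 'one'.

If you set df.one.name = 'three' then df.columns will still give you ['one', 'two'], and df.one.name will give you 'three'.

BUT

pd.DataFrame(df.one) will return a DataFrame whose single column is labelled 'three', because Pandas reuses the .name of the already defined Series.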
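The name reuse can be seen with a standalone Series. This sketch uses a freshly built Series rather than df.one, because whether an assignment to df.one.name sticks may depend on the pandas version; the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([4, 5, 6], name='three')

# The DataFrame constructor picks up the Series name as the column label.
frame = pd.DataFrame(s)

# to_frame lets you override the label explicitly.
relabelled = s.to_frame(name='one')
```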

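Multi-level column names

Pandas has ways of doing multi-layered column names. There is not much magic involved, but I wanted to cover this in my answer too since I don't see anyone picking up on it here.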

    |one            |
    |one      |two  |
0   |  4      |  1  |
1   |  5      |  2  |
2   |  6      |  3  |

This is easily achievable by setting columns to lists, like this:

df.columns = [['one', 'one'], ['one', 'two']]
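As a quick check of what that assignment produces, the following sketch compares it with an explicitly built MultiIndex. The starting column names 'a' and 'b' are assumptions chosen for the example.

```python
import pandas as pd

df = pd.DataFrame({'a': [4, 5, 6], 'b': [1, 2, 3]})

# A list of equal-length lists is interpreted as one list per level.
df.columns = [['one', 'one'], ['one', 'two']]

# The same structure built explicitly, for comparison.
explicit = pd.MultiIndex.from_arrays([['one', 'one'], ['one', 'two']])
```

Columns are then addressed by tuple, for example df[('one', 'two')].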

Answer 11

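One line or Pipeline solutions

I'll focus on two things:

OP clearly states

I have the edited column names stored in a list, but I don't know how to replace the column names.

I do not want to solve the problem of how to replace '$' or strip the first character off each column header. OP has already done this step. Instead I want to focus on replacing the existing columns object with a new one, given a list of replacement column names.
df.columns = new, where new is the list of new column names, is as simple as it gets. The drawback of this approach is that it requires editing the existing dataframe's columns attribute and it isn't done inline. I'll show a few ways to perform this via pipelining without editing the existing dataframe.

Setup 1
To focus on the need to rename or replace column names with a pre-existing list, I'll create a new sample dataframe df with initial column names and unrelated new column names.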

df = pd.DataFrame({'Jack': [1, 2], 'Mahesh': [3, 4], 'Xin': [5, 6]})
new = ['x098', 'y765', 'z432']

df

   Jack  Mahesh  Xin
0     1       3    5
1     2       4    6

Solution 1
pd.DataFrame.rename

It has been said already that if you have a dictionary mapping the old column names to new column names, you can use pd.DataFrame.rename.

d = {'Jack': 'x098', 'Mahesh': 'y765', 'Xin': 'z432'}
df.rename(columns=d)

   x098  y765  z432
0     1     3     5
1     2     4     6

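However, you can easily create that dictionary and include it in the call to rename. The following takes advantage of the fact that when iterating over df, we iterate over each column name.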

# Given just a list of new column names
df.rename(columns=dict(zip(df, new)))

   x098  y765  z432
0     1     3     5
1     2     4     6

This works great if your original column names are unique. But if they are not, then this breaks down.

Setup 2
Non-unique columns

df = pd.DataFrame(
    [[1, 3, 5], [2, 4, 6]],
    columns=['Mahesh', 'Mahesh', 'Xin']
)
new = ['x098', 'y765', 'z432']

df

   Mahesh  Mahesh  Xin
0       1       3    5
1       2       4    6

Solution 2
pd.concat using the keys argument

First, notice what happens when we attempt to use solution 1:

df.rename(columns=dict(zip(df, new)))

   y765  y765  z432
0     1     3     5
1     2     4     6

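We didn't map the new list as the column names. We ended up repeating y765. Instead, we can use the keys argument of the pd.concat function while iterating through the columns of df.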

pd.concat([c for _, c in df.items()], axis=1, keys=new) 

   x098  y765  z432
0     1     3     5
1     2     4     6

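Solution 3
Reconstruct. This should only be used if you have a single dtype for all columns. Otherwise, you'll end up with dtype object for all columns, and converting them back requires more dictionary work.

Single dtype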

pd.DataFrame(df.values, df.index, new)

   x098  y765  z432
0     1     3     5
1     2     4     6

Mixed dtype

pd.DataFrame(df.values, df.index, new).astype(dict(zip(new, df.dtypes)))

   x098  y765  z432
0     1     3     5
1     2     4     6
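To see why the astype step matters, here is a small sketch with mixed numeric columns. The float values in 'Mahesh' are an assumption added for the example; with int and float columns the shared dtype is float64, and it would be object if strings were involved.

```python
import pandas as pd

df = pd.DataFrame({'Jack': [1, 2], 'Mahesh': [3.5, 4.5], 'Xin': [5, 6]})
new = ['x098', 'y765', 'z432']

# df.values is a single 2-D array, so every column shares one dtype.
rebuilt = pd.DataFrame(df.values, df.index, new)

# Map each new name to the original dtype and convert back.
restored = rebuilt.astype(dict(zip(new, df.dtypes)))
```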

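Solution 4
This is a gimmicky trick with transpose and set_index. pd.DataFrame.set_index allows us to set an index inline, but there is no corresponding set_columns. So we can transpose, then set_index, and transpose back. However, the same single dtype versus mixed dtype caveat from solution 3 applies here.

Single dtype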

df.T.set_index(np.asarray(new)).T

   x098  y765  z432
0     1     3     5
1     2     4     6

Mixed dtype

df.T.set_index(np.asarray(new)).T.astype(dict(zip(new, df.dtypes)))

   x098  y765  z432
0     1     3     5
1     2     4     6

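Solution 5
Use a lambda in pd.DataFrame.rename that cycles through each element of new.
In this solution, we pass a lambda that takes x but then ignores it. It also takes a y but doesn't expect it. Instead, an iterator is given as a default value, and I can then use that to cycle through one at a time without regard to what the value of x is.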

df.rename(columns=lambda x, y=iter(new): next(y))

   x098  y765  z432
0     1     3     5
1     2     4     6

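As pointed out to me by the folks in sopython chat, if I add a * in between x and y, I can protect my y variable. Though, in this context I don't believe it needs protecting. It is still worth mentioning.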

df.rename(columns=lambda x, *, y=iter(new): next(y))

   x098  y765  z432
0     1     3     5
1     2     4     6
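On pandas releases where set_axis returns a new DataFrame by default, the same inline, list-based replacement can probably be written more directly. The sketch below is an alternative to the solutions above rather than part of the original answer, and it reuses the non-unique columns from Setup 2.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 3, 5], [2, 4, 6]],
    columns=['Mahesh', 'Mahesh', 'Xin']
)
new = ['x098', 'y765', 'z432']

# Labels are replaced by position, so duplicate names are not a problem.
renamed = df.set_axis(new, axis=1)
```

Because no values are rebuilt, the dtypes of the columns are left as they were.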

Answer 12

Let's understand renaming with a small example...

Renaming columns using a mapping:

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})  # Creating a df with column names A and B
df.rename({"A": "new_a", "B": "new_b"}, axis='columns', inplace=True)  # Renaming column A to 'new_a' and B to 'new_b'

Output:

   new_a  new_b
0      1      4
1      2      5
2      3      6

Renaming index/row names using a mapping:

df.rename({0: "x", 1: "y", 2: "z"}, axis='index', inplace=True)  # Row names are replaced by 'x', 'y', and 'z'

Output:

   new_a  new_b
x      1      4
y      2      5
z      3      6

Answer 13

Assume your dataset is named df, and df has these columns:

['$a', '$b', '$c', '$d', '$e']

So, to rename these, we would simply do:

df.columns = ['a','b','c','d','e']

Answer 14

Assume this is your dataframe.

You can rename the columns using two methods.

Using dataframe.columns=[#list]
```
 df.columns=['a','b','c','d','e']
```
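The limitation of this method is that if one column has to be changed, the full column list has to be passed. Also, this method is not applicable to index labels. For example, if you passed this: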
```
 df.columns = ['a','b','c','d']
```
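This will throw an error. Length mismatch: Expected axis has 5 elements, new values have 4 elements.
Another method is the Pandas rename() method, which is used to rename any index, column or row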
```
df = df.rename(columns={'$a':'a'})
```

Similarly, you can change any rows or columns.
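The two methods can be compared side by side. This is a minimal sketch that assumes a five-column frame like the one in the question; the variable mismatch_raised exists only for the demonstration.

```python
import pandas as pd

df = pd.DataFrame({'$a': [1], '$b': [2], '$c': [3], '$d': [4], '$e': [5]})

# Assigning a list that is one name short fails with a ValueError.
try:
    df.columns = ['a', 'b', 'c', 'd']
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# rename() only needs the columns you actually want to change.
df = df.rename(columns={'$a': 'a'})
```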

Answer 15

Many pandas functions have an inplace parameter. When it is set to True, the transformation is applied directly to the dataframe you call it on. For example:

df = pd.DataFrame({'$a':[1,2], '$b': [3,4]})
df.rename(columns={'$a': 'a'}, inplace=True)
df.columns

>>> Index(['a', '$b'], dtype='object')

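Alternatively, there are cases where you want to preserve the original dataframe. I have often seen people fall into this case when creating the dataframe is an expensive task, for example if it requires querying a Snowflake database. In this case, just make sure the inplace parameter is set to False.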

df = pd.DataFrame({'$a':[1,2], '$b': [3,4]})
df2 = df.rename(columns={'$a': 'a'}, inplace=False)
df.columns
    
>>> Index(['$a', '$b'], dtype='object')

df2.columns

>>> Index(['a', '$b'], dtype='object')

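If these types of transformations are something you do often, you could also look into a number of different pandas GUI tools. I'm the creator of one called Mito. It's a spreadsheet that automatically converts your edits to Python code.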

Answer 16

df.rename(index=str, columns={'A':'a', 'B':'b'})

pandas.DataFrame.rename

Answer 17

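If you've got the dataframe, df.columns dumps everything into a list you can manipulate and then reassign to your dataframe as the column names...

old_columns = df.columns
new_columns = [col.replace("$", "") for col in old_columns]
df.rename(columns=dict(zip(old_columns, new_columns)), inplace=True)
df.head()  # To validate the output

Best way? I don't know. A way? Yes.

A better way of evaluating all the main techniques put forward in the answers to the question is below, using cProfile to gauge execution time. @kadee, @kaitlyn, and @eumiro had the functions with the fastest execution times, though these functions are so fast that we are comparing the rounding of 0.000 and 0.001 seconds for all the answers. Moral: my answer above likely isn't the "best" way.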

import pandas as pd
import cProfile, pstats, re

old_names = ['$a', '$b', '$c', '$d', '$e']
new_names = ['a', 'b', 'c', 'd', 'e']
col_dict = {'$a': 'a', '$b': 'b', '$c': 'c', '$d': 'd', '$e': 'e'}

df = pd.DataFrame({'$a':[1, 2], '$b': [10, 20], '$c': ['bleep', 'blorp'], '$d': [1, 2], '$e': ['texa$', '']})

df.head()

def eumiro(df, nn):
    df.columns = nn
    # This direct renaming approach is duplicated in methodology in several other answers:
    return df

def lexual1(df):
    return df.rename(columns=col_dict)

def lexual2(df, col_dict):
    return df.rename(columns=col_dict, inplace=True)

def Panda_Master_Hayden(df):
    return df.rename(columns=lambda x: x[1:], inplace=True)

def paulo1(df):
    return df.rename(columns=lambda x: x.replace('$', ''))

def paulo2(df):
    return df.rename(columns=lambda x: x.replace('$', ''), inplace=True)

def migloo(df, on, nn):
    return df.rename(columns=dict(zip(on, nn)), inplace=True)

def kadee(df):
    return df.columns.str.replace('$', '')

def awo(df):
    old_columns = df.columns
    new_columns = [col.replace("$", "") for col in old_columns]
    return df.rename(columns=dict(zip(old_columns, new_columns)), inplace=True)

def kaitlyn(df):
    df.columns = [col.strip('$') for col in df.columns]
    return df

print('eumiro')
cProfile.run('eumiro(df, new_names)')
print('lexual1')
cProfile.run('lexual1(df)')
print('lexual2')
cProfile.run('lexual2(df, col_dict)')
print('andy hayden')
cProfile.run('Panda_Master_Hayden(df)')
print('paulo1')
cProfile.run('paulo1(df)')
print('paulo2')
cProfile.run('paulo2(df)')
print('migloo')
cProfile.run('migloo(df, old_names, new_names)')
print('kadee')
cProfile.run('kadee(df)')
print('awo')
cProfile.run('awo(df)')
print('kaitlyn')
cProfile.run('kaitlyn(df)')

Answer 18

df = pd.DataFrame({'$a': [1], '$b': [1], '$c': [1], '$d': [1], '$e': [1]})

If your new list of columns is in the same order as the existing columns, the assignment is simple:

new_cols = ['a', 'b', 'c', 'd', 'e']
df.columns = new_cols
>>> df
   a  b  c  d  e
0  1  1  1  1  1

If you have a dictionary keyed on old column names that maps to new column names, you could do the following:

d = {'$a': 'a', '$b': 'b', '$c': 'c', '$d': 'd', '$e': 'e'}
df.columns = df.columns.map(lambda col: d[col])  # Or `.map(d.get)` as pointed out by @PiRSquared.
>>> df
   a  b  c  d  e
0  1  1  1  1  1
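One thing to watch with the mapping approach is a column that is missing from the dictionary: d[col] raises a KeyError, and d.get alone returns None for it. A possible variant, sketched here with a made-up 'keep' column, falls back to the original label.

```python
import pandas as pd

df = pd.DataFrame({'$a': [1], '$b': [1], 'keep': [1]})
d = {'$a': 'a', '$b': 'b'}

# d.get(col, col) leaves labels that are not in the dictionary untouched.
df.columns = df.columns.map(lambda col: d.get(col, col))
```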

If you don't have a list or dictionary mapping, you could strip the leading $ symbol via a list comprehension:

df.columns = [col[1:] if col[0] == '$' else col for col in df]

Answer 19

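Another way we could replace the original column labels is by stripping the unwanted characters (here '$') from the original column labels.

This could have been done by running a for loop over df.columns and appending the stripped columns to df.columns.

Instead, we can do this neatly in a single statement by using a list comprehension like below: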

df.columns = [col.strip('$') for col in df.columns]

(The strip method in Python strips the given character from the beginning and end of the string.)
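Since strip works on both ends, it is worth knowing how it differs from lstrip and from slicing off the first character. The labels below are invented to show the edge cases; this is plain Python string behaviour.

```python
cols = ['$a', 'b$', '$$c', 'd']

# strip removes every leading and trailing '$'.
stripped = [c.strip('$') for c in cols]

# lstrip removes only leading '$' characters.
lstripped = [c.lstrip('$') for c in cols]

# Slicing drops the first character whatever it is.
sliced = [c[1:] for c in cols]
```

Slicing is only safe when every label is known to start with exactly one '$'.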

Answer 20

It is really simple. Just use:

df.columns = ['Name1', 'Name2', 'Name3'...]

It will assign the column names in the order you put them in.

Answer 21

If you already have a list of the new column names, you can try this:

new_cols = ['a', 'b', 'c', 'd', 'e']
new_names_map = {df.columns[i]:new_cols[i] for i in range(len(new_cols))}

df.rename(new_names_map, axis=1, inplace=True)

Answer 22

# This way it will work
import pandas as pd

# Define a dictionary 
rankings = {'test': ['a'],
        'odi': ['E'],
        't20': ['P']}

# Convert the dictionary into DataFrame
rankings_pd = pd.DataFrame(rankings)

# Before renaming the columns
print(rankings_pd)

rankings_pd.rename(columns = {'test':'TEST'}, inplace = True)

Answer 23

You could use str.slice for that:

df.columns = df.columns.str.slice(1)

Answer 24

Another option is to rename using a regular expression:

import pandas as pd
import re

df = pd.DataFrame({'$a':[1,2], '$b':[3,4], '$c':[5,6]})

df = df.rename(columns=lambda x: re.sub(r'\$', '', x))
>>> df
   a  b  c
0  1  3  5
1  2  4  6

Answer 25

My method is generic: you can add further delimiters to the delimiters variable, which helps future-proof it.

Working code:

import pandas as pd
import re


df = pd.DataFrame({'$a':[1,2], '$b': [3,4],'$c':[5,6], '$d': [7,8], '$e': [9,10]})

delimiters = '$'
matchPattern = '|'.join(map(re.escape, delimiters))
df.columns = [re.split(matchPattern, i)[1] for i in df.columns ]

Output:

>>> df
   $a  $b  $c  $d  $e
0   1   3   5   7   9
1   2   4   6   8  10

>>> df
   a  b  c  d   e
0  1  3  5  7   9
1  2  4  6  8  10

Answer 26

Note that the approaches in previous answers do not work for a MultiIndex. For a MultiIndex, you need to do something like the following:

>>> df = pd.DataFrame({('$a','$x'):[1,2], ('$b','$y'): [3,4], ('e','f'):[5,6]})
>>> df
   $a $b  e
   $x $y  f
0  1  3  5
1  2  4  6
>>> rename = {('$a','$x'):('a','x'), ('$b','$y'):('b','y')}
>>> df.columns = pd.MultiIndex.from_tuples([
        rename.get(item, item) for item in df.columns.tolist()])
>>> df
   a  b  e
   x  y  f
0  1  3  5
1  2  4  6
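As far as I can tell, recent pandas versions also let rename() work on MultiIndex columns label by label, optionally restricted to one level. This is a hedged sketch of that alternative, not the approach of the answer above, and the level keyword is assumed to be available in your pandas version.

```python
import pandas as pd

df = pd.DataFrame({('$a', '$x'): [1, 2], ('$b', '$y'): [3, 4], ('e', 'f'): [5, 6]})

# A function is applied to the individual labels of every level.
renamed = df.rename(columns=lambda label: label.lstrip('$'))

# A mapping can be limited to a single level with the level argument.
top_only = df.rename(columns={'$a': 'a'}, level=0)
```

The tuple-to-tuple mapping shown above is still the way to go when a rename depends on the whole column tuple.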

Answer 27

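If you have to deal with loads of columns named by a providing system that is out of your control, I came up with the following approach, which combines a general approach and specific replacements in one go.

First create a dictionary from the dataframe column names using regular expressions in order to throw away certain suffixes of column names, and then add specific replacements to the dictionary to name core columns as expected later in the receiving database.

This is then applied to the dataframe in one go.

col_map = dict(zip(df.columns, df.columns.str.replace(r'(:S$|:C1$|:L$|:D$|\.Serial:L$)', '', regex=True)))
col_map['brand_timeseries:C1'] = 'BTS'
col_map['respid:L'] = 'RespID'
col_map['country:C1'] = 'CountryID'
col_map['pim1:D'] = 'pim_actual'
df.rename(columns=col_map, inplace=True)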

Answer 28

If you just want to remove the '$' sign, then use the code below

df.columns = pd.Series(df.columns.str.replace("$", "", regex=False))

Answer 29

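In addition to the solutions already provided, you can replace all the columns while you are reading the file. We can use names and header=0 to do that.

First, we create a list of the names that we would like to use as our column names:

import pandas as pd

ufo_cols = ['city', 'color reported', 'shape reported', 'state', 'time']

ufo = pd.read_csv('link to the file you are using', names=ufo_cols, header=0)

In this case, all the column names will be replaced with the names you have in your list.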
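Here is a self-contained sketch of the same idea that reads from an in-memory string instead of a file. The CSV text is invented for the example; header=0 tells read_csv that the first line is a header to discard, and names supplies the replacements.

```python
import io
import pandas as pd

csv_text = "$a,$b\n1,10\n2,20\n"

# The original header row is consumed and replaced by the given names.
df = pd.read_csv(io.StringIO(csv_text), names=['a', 'b'], header=0)
```

Without header=0, the original header line would be read as the first data row.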

Answer 30

Here's a nifty little function I like to use to cut down on typing:

def rename(data, oldnames, newname):
    if type(oldnames) == str: # Input can be a string or list of strings
        oldnames = [oldnames] # When renaming multiple columns
        newname = [newname] # Make sure you pass the corresponding list of new names
    i = 0
    for name in oldnames:
        oldvar = [c for c in data.columns if name in c]
        if len(oldvar) == 0:
            raise ValueError("Sorry, couldn't find that column in the dataset")
        if len(oldvar) > 1: # Doesn't have to be an exact match
            print("Found multiple columns that matched " + str(name) + ": ")
            for c in oldvar:
                print(str(oldvar.index(c)) + ": " + str(c))
            ind = input('Please enter the index of the column you would like to rename: ')
            oldvar = oldvar[int(ind)]
        else:  # Exactly one match
            oldvar = oldvar[0]
        data = data.rename(columns = {oldvar : newname[i]})
        i += 1
    return data

Here is an example of how it works:

In [2]: df = pd.DataFrame(np.random.randint(0, 10, size=(10, 4)), columns = ['col1', 'col2', 'omg', 'idk'])
# First list = existing variables
# Second list = new names for those variables
In [3]: df = rename(df, ['col', 'omg'],['first', 'ohmy'])
Found multiple columns that matched col:
0: col1
1: col2

Please enter the index of the column you would like to rename: 0

In [4]: df.columns
Out[5]: Index(['first', 'col2', 'ohmy', 'idk'], dtype='object')

Answer 31

Assuming you can use regular expressions, this solution removes the need for manual encoding by using a regular expression:

import pandas as pd
import re

srch = re.compile(r"\w+")

data = pd.read_csv("CSV_FILE.csv")
cols = data.columns
new_cols = list(map(lambda v:v.group(), (list(map(srch.search, cols)))))
data.columns = new_cols

Answer 32

I needed to rename features for XGBoost, which didn't like any of these characters:

import re
regex = r"[!\"#$%&'()*+,\-.\/:;<=>?@[\\\]^_`{|}~ ]+"
X_trn.columns = X_trn.columns.str.replace(regex, '_', regex=True)
X_tst.columns = X_tst.columns.str.replace(regex, '_', regex=True)
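To show what that pattern does to concrete labels, here is a small sketch on a standalone Index. The sample labels are made up; note that each run of matching characters collapses to a single underscore.

```python
import pandas as pd

regex = r"[!\"#$%&'()*+,\-.\/:;<=>?@[\\\]^_`{|}~ ]+"

cols = pd.Index(['price [$]', 'a<b', 'plain'])

# Each run of punctuation or spaces becomes one '_'.
clean = cols.str.replace(regex, '_', regex=True)
```

Different original labels can end up identical after cleaning, so it may be worth checking clean.is_unique before assigning the result back.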

Answer 33

You can use the lstrip or strip methods with the index:

df.columns = df.columns.str.lstrip('$')

or

cols = ['$a', '$b', '$c', '$d', '$e']
pd.Series(cols).str.lstrip('$').tolist()

Output:

['a', 'b', 'c', 'd', 'e']

Answer 34

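My one-line answer, df.columns = df_new_cols, is the best one, taking about 1/3 of the processing time.

timeit comparison: df has 7 columns. I am trying to change a few of the names.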

%timeit df.rename(columns={old_col:new_col for (old_col,new_col) in zip(df_old_cols,df_new_cols)},inplace=True)
214 µs ± 10.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit df.rename(columns=dict(zip(df_old_cols,df_new_cols)),inplace=True)
212 µs ± 7.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit df.columns = df_new_cols
72.9 µs ± 17.2 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

34 answers

Answer 1 3943 2012-07-06 01:48:15
Answer 2 2432 (accepted) 2012-07-05 14:23:27
Answer 3 482 2013-05-21 09:58:59
Answer 4 244 2015-05-30 13:24:05
Answer 5 188 2017-10-24 13:39:15
Answer 6 148 2014-03-26 10:20:45
Answer 7 130 2020-05-08 12:34:49
Answer 8 90 2016-03-22 08:59:12
Answer 9 76 2015-05-21 17:48:33
Answer 10 41 2016-09-29 12:30:40
Answer 11 41 2017-09-13 08:09:23
Answer 12 31 2020-03-08 05:35:42
Answer 13 30 2021-05-10 08:17:06
Answer 14 25 2019-08-27 08:30:27
Answer 15 22 2021-06-15 00:38:13
Answer 16 21 2018-07-19 04:50:15
Answer 17 20 2015-09-01 02:24:17
Answer 18 20 2016-02-14 00:31:53
Answer 19 18 2015-11-23 13:56:10
Answer 20 18 2015-11-29 19:22:47
Answer 21 18 2021-06-10 03:46:32
Answer 22 15 2021-07-14 02:09:30
Answer 23 12 2016-01-28 17:31:39
Answer 24 12 2018-07-07 02:07:23
Answer 25 11 2016-08-04 20:26:50
Answer 26 10 2016-08-29 21:27:20
Answer 27 9 2017-06-16 08:27:37
Answer 28 9 2021-03-19 10:29:52
Answer 29 8 2020-03-08 15:43:28
Answer 30 6 2018-04-19 07:48:53
Answer 31 6 2019-04-11 15:08:57
Answer 32 6 2020-06-24 02:42:06
Answer 33 0 2022-07-17 09:23:08
Answer 34 0 2022-09-06 14:14:07
