How to add a suffix (or prefix) to each column name?

I want to add an _x suffix to each column name, like so:

featuresA = myPandasDataFrame.columns.values + '_x'

How do I do this? Additionally, if I wanted to add x_ as a prefix, how would the solution change?

In my opinion, the following is the nicest way to add a suffix:

df = df.add_suffix('_some_suffix')

Because it is a method that is called on a DataFrame and returns a DataFrame, you can use it in a chain of calls.
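A minimal sketch of such a chain, assuming a small sample frame and an arbitrary filter step (both are illustrative, not from the answer):

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"A": [9, 4, 2, 1], "B": [12, 7, 5, 4]})

# add_suffix returns a new DataFrame, so it can sit in the middle of a chain.
result = (
    df[df["A"] > 1]          # keep rows where A is greater than 1
    .add_suffix("_x")        # rename the columns to A_x and B_x
    .reset_index(drop=True)  # renumber the remaining rows from 0
)

print(list(result.columns))
print(list(df.columns))  # the original frame keeps its column names
```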

You can use a list comprehension:

df.columns = [str(col) + '_x' for col in df.columns]

There are also built-in methods, .add_suffix() and .add_prefix(), as mentioned in another answer.
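The str(col) call matters when the column labels are not strings. A small sketch, assuming a hypothetical frame with integer column labels:

```python
import pandas as pd

# Hypothetical frame whose column labels are integers rather than strings.
base = pd.DataFrame([[1, 2], [3, 4]], columns=[0, 1])

# str(col) converts each label first, so the concatenation works for any label type.
suffixed = base.copy()
suffixed.columns = [str(col) + "_x" for col in suffixed.columns]

# The prefix variant only swaps the operand order.
prefixed = base.copy()
prefixed.columns = ["x_" + str(col) for col in prefixed.columns]

print(list(suffixed.columns))
print(list(prefixed.columns))
```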

Elegant In-place Concatenation

If you're trying to modify df in-place, then the cheapest (and simplest) option is in-place addition directly on df.columns (i.e., using Index.__iadd__).

df = pd.DataFrame({"A": [9, 4, 2, 1], "B": [12, 7, 5, 4]})
df

   A   B
0  9  12
1  4   7
2  2   5
3  1   4

df.columns += '_some_suffix'
df

   A_some_suffix  B_some_suffix
0              9             12
1              4              7
2              2              5
3              1              4

To add a prefix, you would similarly use

df.columns = 'some_prefix_' + df.columns
df

   some_prefix_A  some_prefix_B
0              9             12
1              4              7
2              2              5
3              1              4

Another cheap option is using a list comprehension with f-string formatting (available on Python 3.6+).

df.columns = [f'{c}_some_suffix' for c in df]
df

   A_some_suffix  B_some_suffix
0              9             12
1              4              7
2              2              5
3              1              4

And for a prefix, similarly,

df.columns = [f'some_prefix_{c}' for c in df]

Method Chaining方法链

It is also possible to add *fixes while method chaining. To add a suffix, use DataFrame.add_suffix:

df.add_suffix('_some_suffix')

   A_some_suffix  B_some_suffix
0              9             12
1              4              7
2              2              5
3              1              4

This returns a copy of the data; in other words, df is not modified.

Adding prefixes is also done with DataFrame.add_prefix.

df.add_prefix('some_prefix_')

   some_prefix_A  some_prefix_B
0              9             12
1              4              7
2              2              5
3              1              4

This also does not modify df.


Critique of add_*fix

These are good methods if you're trying to perform method chaining:

df.some_method1().some_method2().add_*fix(...)

However, add_prefix (and add_suffix) creates a copy of the entire dataframe, just to modify the headers. If you believe this is wasteful, but still want to chain, you can call pipe:

def add_suffix(df):
    df.columns += '_some_suffix'
    return df

df.some_method1().some_method2().pipe(add_suffix)
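Since pipe forwards extra arguments to the function it calls, the suffix does not have to be hard-coded. A rough sketch, where with_suffix is a hypothetical helper name and sort_values stands in for the earlier steps of the chain:

```python
import pandas as pd


def with_suffix(df, suffix):
    """Hypothetical helper: rename the columns on df itself and return it for chaining."""
    df.columns = [f"{col}{suffix}" for col in df.columns]
    return df


df = pd.DataFrame({"A": [9, 4], "B": [12, 7]})

# sort_values returns a new frame; pipe then passes it, plus the suffix, to the helper.
result = df.sort_values("A").pipe(with_suffix, "_some_suffix")

print(list(result.columns))
```

The helper only renames the intermediate frame produced by the chain, so the original df keeps its column names here.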

I know 4 ways to add a suffix (or prefix) to your column names:

1- df.columns = [str(col) + '_some_suffix' for col in df.columns]

or

2- df.rename(columns=lambda col: col + '_some_suffix')

or

3- df.columns += '_some_suffix', which is much easier.

or, the nicest:

4- df.add_suffix('_some_suffix')
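A quick side-by-side check of the options listed above, as a sketch on a small made-up frame (the variable names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
suffix = "_some_suffix"

# List comprehension assigned to .columns (modifies the copy it is applied to).
one = df.copy()
one.columns = [str(col) + suffix for col in one.columns]

# rename with a callable (returns a new DataFrame).
two = df.rename(columns=lambda col: col + suffix)

# In-place addition on the column Index (modifies the copy it is applied to).
three = df.copy()
three.columns += suffix

# add_suffix (returns a new DataFrame).
four = df.add_suffix(suffix)

expected = ["A_some_suffix", "B_some_suffix"]
all_match = all(list(frame.columns) == expected for frame in (one, two, three, four))
print(all_match)
```

Note that rename and add_suffix return a new DataFrame, so the result has to be assigned back to df if the new names should stick.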

I haven't seen this solution proposed above, so I'm adding it to the list:

df.columns += '_x'

And you can easily adapt this for the prefix scenario.

Using DataFrame.rename

df = pd.DataFrame({'A': range(3), 'B': range(4, 7)})
print(df)
   A  B
0  0  4
1  1  5
2  2  6

Using rename with axis=1 and string formatting:

df.rename('col_{}'.format, axis=1)
# or df.rename(columns='col_{}'.format)

   col_A  col_B
0      0      4
1      1      5
2      2      6

To actually overwrite your column names, we can assign the returned value to our df:

df = df.rename('col_{}'.format, axis=1)

or use inplace=True:

df.rename('col_{}'.format, axis=1, inplace=True)
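The same pattern covers suffixes, since the format string decides where the original name goes. A short sketch; the template strings are arbitrary examples:

```python
import pandas as pd

df = pd.DataFrame({"A": range(3), "B": range(4, 7)})

# The bound str.format method is a callable, so rename applies it to every column label.
prefixed = df.rename(columns="col_{}".format)
suffixed = df.rename(columns="{}_x".format)
both = df.rename(columns="pre_{}_post".format)

print(list(prefixed.columns))
print(list(suffixed.columns))
print(list(both.columns))
```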

I figured that this is something I would use quite often, for example:

df = pd.DataFrame({'silverfish': range(3), 'silverspoon': range(4, 7),
                   'goldfish': range(10, 13),'goldilocks':range(17,20)})

My way of dynamically renaming:

color_list = ['gold','silver']

for i in color_list:
    df[f'color_{i}']=df.filter(like=i).sum(axis=1)

OUTPUT:

{'silverfish': {0: 0, 1: 1, 2: 2},
 'silverspoon': {0: 4, 1: 5, 2: 6},
 'goldfish': {0: 10, 1: 11, 2: 12},
 'goldilocks': {0: 17, 1: 18, 2: 19},
 'color_gold': {0: 27, 1: 29, 2: 31},
 'color_silver': {0: 4, 1: 6, 2: 8}}
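The loop above adds new, prefixed columns holding row sums; it leaves the existing names as they are. If the goal is instead to prefix only the existing columns that match a keyword, one possible sketch uses rename with a conditional callable (the 'gold' keyword and the gold_only name are just illustrative choices):

```python
import pandas as pd

df = pd.DataFrame({'silverfish': range(3), 'silverspoon': range(4, 7),
                   'goldfish': range(10, 13), 'goldilocks': range(17, 20)})

# Prefix only the column names that contain 'gold'; all other names pass through unchanged.
gold_only = df.rename(columns=lambda c: f'color_{c}' if 'gold' in c else c)

print(list(gold_only.columns))
```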

Pandas also has an add_prefix method and an add_suffix method to do this.
