Python Pandas DataFrame: how to perform operations on two columns with the same name

Question

Say you have a DataFrame like the following (note that some columns share the same name):

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(4,5), columns = list('abcab'))

If you want to perform some operation on the two columns named 'a', how do you do it, given that they have the same name? I tried using the replace() and rename() methods to rename one of the two columns and then operate on it, but I could not get the rename to apply to only one column.

Answer 1

You should be able to change the column labels like this:

df.columns = ['a', 'b', 'c', 'd', 'e']
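Once every label is unique, the two former 'a' columns can be addressed by name. A minimal sketch, assuming the second 'a' column has been relabelled 'd' as above; the sum is only an illustrative operation, and `a_plus_d` is a hypothetical name:

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.rand(4, 5), columns=list('abcab'))

# Assign a full list of unique labels; its length must match the number of columns.
df.columns = ['a', 'b', 'c', 'd', 'e']

# The former second 'a' column is now 'd', so each one can be used on its own.
a_plus_d = df['a'] + df['d']
print(a_plus_d)
```

Note that this relabels every column, so the second 'b' becomes 'e' as well.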

Answer 2

You can use iloc if you don't want to rename the columns:

import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.rand(4,5), columns = list('abcab'))
print(df)
          a         b         c         a         b
0  0.548814  0.715189  0.602763  0.544883  0.423655
1  0.645894  0.437587  0.891773  0.963663  0.383442
2  0.791725  0.528895  0.568045  0.925597  0.071036
3  0.087129  0.020218  0.832620  0.778157  0.870012

# select the first 'a' column by its integer position
print(df.iloc[:, 0])
0    0.548814
1    0.645894
2    0.791725
3    0.087129
Name: a, dtype: float64

# select the second 'a' column by its integer position
print(df.iloc[:, 3])
0    0.544883
1    0.963663
2    0.925597
3    0.778157
Name: a, dtype: float64

# df['a'] returns both 'a' columns as a DataFrame; take the first of them
print(df['a'].iloc[:, 0])
0    0.548814
1    0.645894
2    0.791725
3    0.087129
Name: a, dtype: float64

# df['a'] returns both 'a' columns as a DataFrame; take the second of them
print(df['a'].iloc[:, 1])
0    0.544883
1    0.963663
2    0.925597
3    0.778157
Name: a, dtype: float64
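The same positional idea extends to computing with both columns. A sketch, assuming you would rather look up the positions than hard-code 0 and 3; `a_positions`, `diff` and `row_sum` are hypothetical names, and the difference and sum are only example operations:

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.rand(4, 5), columns=list('abcab'))

# Integer positions of every column labelled 'a'.
a_positions = np.flatnonzero(df.columns == 'a')
print(a_positions)

# Operate on the two columns individually, by position.
first_a = df.iloc[:, a_positions[0]]
second_a = df.iloc[:, a_positions[1]]
diff = first_a - second_a

# df['a'] is a two-column DataFrame here, so row-wise reductions work directly.
row_sum = df['a'].sum(axis=1)
print(diff)
print(row_sum)
```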

EDIT: If you only need to rename the columns that share a name, look up the positions of each duplicated label with get_loc:

import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.rand(4,5), columns = list('abbab'))
print(df)
          a         b         b         a         b
0  0.548814  0.715189  0.602763  0.544883  0.423655
1  0.645894  0.437587  0.891773  0.963663  0.383442
2  0.791725  0.528895  0.568045  0.925597  0.071036
3  0.087129  0.020218  0.832620  0.778157  0.870012

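cols = pd.Series(df.columns)
# Index.get_duplicates() is no longer available in current pandas;
# duplicated() yields the same set of repeated labels
for dup in df.columns[df.columns.duplicated()].unique():
    # for this unsorted, non-unique index get_loc returns a boolean mask
    mask = df.columns.get_loc(dup)
    cols[mask] = [dup + '_' + str(d_idx) if d_idx != 0 else dup for d_idx in range(mask.sum())]
df.columns = cols
print(df)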
          a         b       b_1       a_1       b_2
0  0.548814  0.715189  0.602763  0.544883  0.423655
1  0.645894  0.437587  0.891773  0.963663  0.383442
2  0.791725  0.528895  0.568045  0.925597  0.071036
3  0.087129  0.020218  0.832620  0.778157  0.870012
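The renaming can also be written without an explicit loop over the duplicated labels. A sketch, assuming the same suffix scheme (`_1`, `_2`, and so on) is wanted; `dedupe_columns` is a hypothetical helper written for this example, not a pandas function:

```python
import numpy as np
import pandas as pd

def dedupe_columns(columns):
    """Return labels where the n-th repeat of a name gets the suffix '_n'."""
    cols = pd.Series(list(columns))
    # cumcount() numbers each occurrence of a label: 0 for the first, 1 for the next, ...
    occurrence = cols.groupby(cols).cumcount()
    return [name if n == 0 else f"{name}_{n}" for name, n in zip(cols, occurrence)]

np.random.seed(0)
df = pd.DataFrame(np.random.rand(4, 5), columns=list('abbab'))
df.columns = dedupe_columns(df.columns)
print(list(df.columns))  # ['a', 'b', 'b_1', 'a_1', 'b_2']
```

This sketch does not check whether a generated label such as 'b_1' already exists among the original columns, so it can still produce a clash in that case.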

2 answers

Answer 1
0 2016-04-01 08:01:11

Answer 2
0 Accepted 2016-04-01 08:05:10
