
Replace a column in Pandas dataframe with another that has same index but in a different order

I'm trying to re-insert into a pandas dataframe a column that I extracted and then reordered by sorting it.

Very simply, I have extracted a column from a pandas df:

col1 = df.col1

This column contains integers, and I used the .sort() method to order it from smallest to largest. I then did some operations on the data.

col1.sort()  # in-place sort in older pandas; newer versions use col1 = col1.sort_values()
# do stuff that changes the values of col1

Now the index of col1 has the same labels as the index of the overall df, but in a different order.

I was wondering how I can insert the column back into the original dataframe (replacing the col1 that is there at the moment).

I have tried both of the following methods:

1)

df.col1 = col1

2)

df.insert(column_index_of_col1, "col1", col1)

but both methods give me the following error:

ValueError: cannot reindex from a duplicate axis

Any help will be greatly appreciated. Thank you.

Consider this DataFrame:

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [6, 5, 4]}, index=[0, 0, 1])

df
Out: 
   A  B
0  1  6
0  2  5
1  3  4
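
The index of this frame repeats the label 0, and that repetition is what the error message is complaining about. As a minimal sketch (the variable names has_unique_index and duplicated_labels are only illustrative), the index can be checked for duplicates like this:

```python
import pandas as pd

# Same example frame as above: the label 0 appears twice in the index.
df = pd.DataFrame({'A': [1, 2, 3], 'B': [6, 5, 4]}, index=[0, 0, 1])

# False when at least one index label is repeated.
has_unique_index = df.index.is_unique

# Labels that occur more than once in the index.
duplicated_labels = df.index[df.index.duplicated()].unique().tolist()

print(has_unique_index)
print(duplicated_labels)
```

If the first value printed is False, assigning a reordered Series back by index is likely to run into the problem described in the question.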

Assign the second column to b, sort it, and take the square, for example:

b = df['B']
b = b.sort_values()
b = b**2

Now b is:

b
Out: 
1    16
0    25
0    36
Name: B, dtype: int64
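
Trying to put this b back into the frame should reproduce the error from the question. The sketch below catches the exception rather than letting it propagate; the exact wording of the message differs between pandas versions, so it is only printed here, and error_message is just an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [6, 5, 4]}, index=[0, 0, 1])

# Sort the column and square it, as in the example above.
b = df['B'].sort_values() ** 2

try:
    # pandas has to align b to df's index, which is ambiguous
    # because the label 0 occurs twice on both sides.
    df['B'] = b
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```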

Without knowing the exact operation you've done on the column, there is no way to know whether 25 corresponds to the first row in the original DataFrame or the second one, because both rows carry the index label 0. You could invert the operation (take the square root and match, for example), but I think that would be unnecessary. If you start with an index that has unique elements ( df = df.reset_index() ), it would be much easier. In that case,

df['B'] = b

should work just fine.
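
Putting the pieces together, here is a sketch of the whole round trip with a unique index. It assumes the old index labels are not needed afterwards, so it passes drop=True to reset_index; that argument is my choice, not part of the suggestion above, which would keep the old labels as an extra column named index.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [6, 5, 4]}, index=[0, 0, 1])

# Make the index unique first (0, 1, 2). drop=True discards the old labels.
df = df.reset_index(drop=True)

# Extract, sort, and transform the column; b keeps the index labels
# of the rows it came from, only in a different order.
b = df['B'].sort_values() ** 2

# Assignment aligns on the index, so each value returns to its own row.
df['B'] = b

print(df)
```

With this data, column B ends up as 36, 25, 16 in the original row order: the sort changes the order of b, but not which row each value belongs to.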
