Editing values in a pandas dataframe using data from another part of the dataframe

I was hoping to use indexing on one part of a Pandas DataFrame to edit values corresponding to another index. Here is an example:

>>> from pandas import DataFrame
>>> from numpy.random import randn
>>> x = DataFrame(randn(3, 3), columns=[1, 2, 3], index=['a', 'b', 'c'])
>>> print(x)

        1         2         3
a -1.007344  0.234990  0.772736
b  0.658360  1.330051 -0.269388
c  0.010871  1.035687  0.230169


>>> index1 = x.index[0:2]
>>> index2 = x.index[1:3]
>>> y = x.copy()  # an independent copy; plain `y = x` would only alias x
>>> x.loc[index1, 3] = x.loc[index2, 2]
>>> print(x)


        1         2         3
a -1.007344  0.234990       NaN
b  0.658360  1.330051  1.330051
c  0.010871  1.035687  0.230169

The latter output is rather unexpected. What does work is the following:

>>> y.loc[index1, 3] = y.loc[index2, 2].values
>>> print(y)

       1         2         3
a -1.007344  0.234990  1.330051
b  0.658360  1.330051  1.035687
c  0.010871  1.035687  0.230169

However, this latter solution is inconvenient for a number of things I would like to do. For example, I would like to write:

x.loc[index1, 3] = x.loc[index2, 2]+2 

or

x.loc[index1, 3] = x.loc[index1, 3] + x.loc[index2, 2]

etc.

Is there another way around this problem?

Thanks in advance!

Pandas is great for aligning based on index. The "unexpected" result is actually understandable if you think of

x.loc[index1, 3]

as a Series with index labels ['a', 'b'], and the assignment

x.loc[index1, 3] = x.loc[index2, 2]

takes its new values from x.loc[index2, 2], which is a Series with index labels ['b', 'c']. Since the data on the right-hand side only aligns with the Series on the left at the label 'b', that label gets a new value, while the label 'a' is set to NaN, since the right-hand side has no value for that index.
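To make the alignment step concrete, here is a small sketch. It assumes a deterministic float frame in place of the random data from the question, and the names `rhs` and `aligned` are only illustrative. Reindexing the right-hand side to the left-hand labels shows the values that the assignment ends up writing.

```python
import numpy as np
import pandas as pd

# Deterministic float data (an assumption; the question used randn).
x = pd.DataFrame(np.arange(9.0).reshape(3, 3),
                 columns=[1, 2, 3], index=["a", "b", "c"])
index1 = x.index[0:2]   # labels 'a', 'b'
index2 = x.index[1:3]   # labels 'b', 'c'

rhs = x.loc[index2, 2]  # Series labelled 'b', 'c'

# Looking up the target labels in the right-hand side: 'a' is missing
# there, so it becomes NaN; 'b' keeps its own value.
aligned = rhs.reindex(index1)
print(aligned)

# The assignment aligns on labels in the same way.
x.loc[index1, 3] = rhs
print(x)
```

Row 'c' of column 3 is untouched because it is not part of the selection on the left-hand side.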

When you want Pandas to disregard the index, you need to pass an object on the right-hand side that has no index. So, as you showed,

y.loc[index1, 3] = y.loc[index2, 2].values

produces the desired result.

Similarly, for your more complicated assignments, you could use

x.loc[index1, 3] = x.loc[index2, 2].values + 2

or

x.loc[index1, 3] += x.loc[index2, 2].values

(Note that the second assignment uses the in-place addition operator, +=.)
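For reference, the three positional assignments can be run end to end as in the sketch below. It assumes a deterministic float frame instead of the random one from the question, and it uses `to_numpy()`, the spelling the pandas documentation recommends over `.values` in recent versions. The `step1` to `step3` variables are only there to record intermediate results.

```python
import numpy as np
import pandas as pd

# Deterministic float data (an assumption; the question used randn).
x = pd.DataFrame(np.arange(9.0).reshape(3, 3),
                 columns=[1, 2, 3], index=["a", "b", "c"])
index1 = x.index[0:2]   # labels 'a', 'b'
index2 = x.index[1:3]   # labels 'b', 'c'

# Plain positional assignment: the array has no index, so nothing is aligned.
x.loc[index1, 3] = x.loc[index2, 2].to_numpy()
step1 = x[3].tolist()

# Positional assignment with arithmetic on the right-hand side.
x.loc[index1, 3] = x.loc[index2, 2].to_numpy() + 2
step2 = x[3].tolist()

# In-place addition: the left-hand Series keeps its own labels.
x.loc[index1, 3] += x.loc[index2, 2].to_numpy()
step3 = x[3].tolist()

print(step1, step2, step3)
```

The array on the right must have the same length as the selection on the left, since the values are matched purely by position.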

If you have a lot of assignments that ignore the index, then perhaps you should be using a NumPy array instead of a Pandas DataFrame.

import pandas as pd
import numpy as np

x = pd.DataFrame(np.arange(9).reshape((3, 3)), columns=[1, 2, 3], index=['a', 'b', 'c'])
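arr = x.to_numpy().copy()  # writable copy, independent of x (x.values may be a view, or read-only in recent pandas)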
print(arr)
# [[0 1 2]
#  [3 4 5]
#  [6 7 8]]

index1 = slice(0,2)
index2 = slice(1,3)
arr[index1, 2] = arr[index2, 1]
print(arr)
# [[0 1 4]
#  [3 4 7]
#  [6 7 8]]

# Instead of x.loc[index1, 3] = x.loc[index2, 2]+2 
arr[index1, 2] = arr[index2, 1] + 2
print(arr)
# [[0 1 6]
#  [3 4 9]
#  [6 7 8]]

# Instead of x.loc[index1, 3] = x.loc[index1, 3] + x.loc[index2, 2]
arr[index1, 2] += arr[index2, 1]
print(arr)
# [[ 0  1 10]
#  [ 3  4 16]
#  [ 6  7  8]]

x.loc[:,:] = arr
print(x)
#    1  2   3
# a  0  1  10
# b  3  4  16
# c  6  7   8
