Copying rows into a pandas DataFrame using logical indexing on a column

Question

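Hello fellow programmers,

I am trying to copy rows into a DataFrame using logical (boolean) indexing, but I get NaNs. The idea is to use logical indexing to quickly replace many rows in a DataFrame:

For example, this is how I expect it to work: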

# pseudo code: example to create logical vector
logical_vector = df.loc[:,'colname']==x
# pseudo code: example to use logical vector to index a dataframe
df.loc[logical_vector,:]=df1.loc[logical_vector2,:]

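If this is possible, it is one of the basic operations that enable fast matrix-style computation. How can I solve this, preferably without a loop?

I created an example to illustrate the problem: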

import numpy as np
import pandas as pd

# create example 10x5 dataframe containing random numbers
x=pd.DataFrame(np.random.randn(10,5),columns=['ac','bc','cc','dc','ec'],
index=['a','b','c','d','e','f','g','h','i','j'])

# add a column containing information for use in example of logical indexing
x['t']= [1,0,1,0,1,0,1,0,1,0]

x
Out[76]: 
         ac        bc        cc        dc        ec  t
a -1.029517  1.936904  1.143655  0.708996 -1.218484  1
b  1.836638 -0.723243 -0.501546 -2.046355  0.248156  0
c  2.369828  0.559880 -0.878904  0.673454 -0.630927  1
d -0.629210  1.261608 -0.190508 -0.582700  0.068166  0
e  1.500134  0.534379  0.375362  0.849761 -1.675824  1
f  1.399520  0.038366 -0.137986  0.156580 -0.674619  0
g -1.359863  0.433721 -0.625973 -0.477530 -0.542612  1
h -0.694573 -0.196907 -0.372210  0.464188 -1.217399  0
i  1.357809 -0.017611  0.539137 -1.016894  0.172672  1
j  0.366195  0.750404 -0.055895  0.358795  0.181593  0

Then, when I try to replace the rows using these indexes, I get this:

# Use logical indexes x.loc[:,'t']==0 and x.loc[:,'t']==1 to select rows and
# write data into x. This should replace every row that has '0' in column t
# with the values of a row that has '1' in column t
x.loc[x.loc[:,'t']==0,:]=x.loc[x.loc[:,'t']==1,:]

x
Out[78]: 
         ac        bc        cc        dc        ec    t
a -1.029517  1.936904  1.143655  0.708996 -1.218484  1.0
b       NaN       NaN       NaN       NaN       NaN  NaN
c  2.369828  0.559880 -0.878904  0.673454 -0.630927  1.0
d       NaN       NaN       NaN       NaN       NaN  NaN
e  1.500134  0.534379  0.375362  0.849761 -1.675824  1.0
f       NaN       NaN       NaN       NaN       NaN  NaN
g -1.359863  0.433721 -0.625973 -0.477530 -0.542612  1.0
h       NaN       NaN       NaN       NaN       NaN  NaN
i  1.357809 -0.017611  0.539137 -1.016894  0.172672  1.0
j       NaN       NaN       NaN       NaN       NaN  NaN

Whereas I was expecting this:

Out[76]: 
         ac        bc        cc        dc        ec  t
a -1.029517  1.936904  1.143655  0.708996 -1.218484  1
b -1.029517  1.936904  1.143655  0.708996 -1.218484  1
c  2.369828  0.559880 -0.878904  0.673454 -0.630927  1
d  2.369828  0.559880 -0.878904  0.673454 -0.630927  1
e  1.500134  0.534379  0.375362  0.849761 -1.675824  1
f  1.500134  0.534379  0.375362  0.849761 -1.675824  1
g -1.359863  0.433721 -0.625973 -0.477530 -0.542612  1
h -1.359863  0.433721 -0.625973 -0.477530 -0.542612  1
i  1.357809 -0.017611  0.539137 -1.016894  0.172672  1
j  1.357809 -0.017611  0.539137 -1.016894  0.172672  1

What am I missing?

Answer 1

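Good question. The problem is that the index of the right-hand side does not match the index of the left-hand side. When a DataFrame is assigned through .loc, pandas aligns it on index labels, and any label with no match is filled with NaN. Here is a simpler example that fixes it: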

df=pd.DataFrame({'a':[1,0,1,0],'b':[2,0,2,0]})

   a  b
0  1  2
1  0  0
2  1  2
3  0  0

df.loc[df['a']==0,:]=df.loc[df['a']==1,:].set_index(df.loc[df['a']==0,:].index)

   a  b
0  1  2
1  1  2
2  1  2
3  1  2
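
The mismatch described above can be checked directly. The following is a minimal sketch rather than part of the original answer: it reuses the small frame from the example but with float columns (an assumption, chosen so that NaN can be stored without a dtype change), and the names target, source, shared, misaligned and relabelled are illustrative only.

```python
import pandas as pd

# Float columns (an assumption) so NaN can be stored without changing dtypes.
df = pd.DataFrame({'a': [1.0, 0.0, 1.0, 0.0], 'b': [2.0, 0.0, 2.0, 0.0]})

target = df['a'] == 0   # selects the rows labelled 1 and 3
source = df['a'] == 1   # selects the rows labelled 0 and 2

# The two selections have no index labels in common.
shared = df.loc[target].index.intersection(df.loc[source].index)

# Assigning a DataFrame aligns on labels, so rows 1 and 3 find no partner.
misaligned = df.copy()
misaligned.loc[target, :] = misaligned.loc[source, :]

# Relabelling the source rows with the target labels lets alignment succeed.
relabelled = df.copy()
relabelled.loc[target, :] = (
    relabelled.loc[source, :].set_index(relabelled.loc[target, :].index)
)
```

Inspecting shared, misaligned and relabelled shows the three stages: no common labels, NaN in the target rows, and the filled-in result after relabelling.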

Alternatively, if you know the shapes are the same, you can simply take the values:

df=pd.DataFrame({'a':[1,0,1,0],'b':[2,0,2,0]})
df.loc[df['a']==0,:]=df.loc[df['a']==1,:].values
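
Applied to the frame from the question, the same idea might look like the sketch below. It is not taken from the original answer: the seeded generator, the names target and source, and the explicit length check are assumptions added for illustration, and .to_numpy() is used in place of .values as the equivalent call that the pandas documentation currently recommends.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
x = pd.DataFrame(rng.standard_normal((10, 5)),
                 columns=['ac', 'bc', 'cc', 'dc', 'ec'],
                 index=list('abcdefghij'))
x['t'] = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

target = x['t'] == 0   # rows to overwrite
source = x['t'] == 1   # rows to copy from

# A positional copy only makes sense if both selections have the same length.
if target.sum() != source.sum():
    raise ValueError("source and target selections differ in length")

# A NumPy array carries no index, so rows are copied by position:
# a -> b, c -> d, e -> f, g -> h, i -> j.
x.loc[target, :] = x.loc[source, :].to_numpy()
```

Because the array has no labels, the pairing depends entirely on row order, so this approach fits cases like the question's where the n-th source row is meant to overwrite the n-th target row.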

Answer 1
0 accepted 2017-10-27 09:33:08
