How to find row index of one column based on values of a different column where the values in the two distinct columns are equal in pandas?
I have some pandas output:
seq X1 X2
0 0.59 NaN
1 -1.28 NaN
2 -1.26 NaN
3 -0.79 NaN
4 1.03 NaN
5 -1.43 NaN
6 0.03 1.03
7 0.92 1.03
8 -2.21 1.03
How do I get a third column, X3, that takes the 1.03 in X2 and finds the seq number associated with the same number in column X1? In my example, starting from row 7 (row index 6), X3 should return 4, since seq = 4 when X1 is 1.03.
Desired output:
seq X1 X2 X3
0 0.59 NaN NaN
1 -1.28 NaN NaN
2 -1.26 NaN NaN
3 -0.79 NaN NaN
4 1.03 NaN NaN
5 -1.43 NaN NaN
6 0.03 1.03 4
7 0.92 1.03 4
8 -2.21 1.03 4
First ever Stack question. Pardon my folly!
Can you explain why you want the number 4 to appear in all rows of X3?
You can get the seq value (4) where X1 == 1.03 by typing:
df.loc[df['X1']==1.03, 'seq'].values[0]
But that only gives you a single 4. Note that I've taken the first seq value (by typing [0]) because if X1 == 1.03 in more than one place, several numbers will be returned (.values gives them as a NumPy array), and you haven't explained how you'd like to deal with multiple seq matches.
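To illustrate what the [0] does, here is a minimal sketch on a made-up frame (the name demo and its values are assumptions for illustration, not the data from the question) in which 1.03 occurs twice in X1:

```python
import pandas as pd

# Hypothetical data: 1.03 appears twice in X1 (at seq 1 and seq 3)
demo = pd.DataFrame({"seq": range(4), "X1": [0.59, 1.03, -0.79, 1.03]})

# Boolean-mask selection returns every matching seq value as a Series
matches = demo.loc[demo["X1"] == 1.03, "seq"]
all_matches = matches.tolist()

# Indexing the underlying array with [0] keeps only the first match
first_match = matches.values[0]
```

If nothing matches, matches is empty and .values[0] raises an IndexError, so it may be worth checking matches.empty first.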
The following code will run and return the data frame you requested, but I suggest you spend a little time thinking about whether you need X2 and X3 to be part of the data frame at all...
# Import what you need
import pandas as pd
import numpy as np
# Define the data (values taken from the table in the question)
x1 = np.array([0.59, -1.28, -1.26, -0.79,
               1.03, -1.43, 0.03, 0.92, -2.21])
x2 = np.array([np.nan, np.nan, np.nan, np.nan,
               np.nan, np.nan, 1.03, 1.03, 1.03])
# Create a pandas dataframe
df = pd.DataFrame({'seq': range(9),
                   'X1': x1,
                   'X2': x2})
# Figure out where the first instance of X1==1.03
# occurs and grab that seq value
s_first = df.loc[df['X1'] == 1.03, 'seq'].values[0]
# Fill in X3 according to the values in X2:
# NaN where X2 is missing, s_first everywhere else
df.loc[df['X2'].isnull(), 'X3'] = np.nan
df.loc[df['X2'].notnull(), 'X3'] = s_first
# Show the 9 rows in the data frame
df.head(9)
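The code above hard-codes 1.03. If X2 could hold different values in different rows, one possible generalisation (a sketch of a different technique, not part of the original answer) is to build a lookup from each X1 value to the seq of its first occurrence and map X2 through it. The name first_seq is an assumption made for this example, and the approach relies on exact floating-point equality between X1 and X2, which holds here because both columns contain the same literal 1.03.

```python
import numpy as np
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    "seq": range(9),
    "X1": [0.59, -1.28, -1.26, -0.79, 1.03, -1.43, 0.03, 0.92, -2.21],
    "X2": [np.nan] * 6 + [1.03] * 3,
})

# Lookup table: X1 value -> seq of its first occurrence.
# Dropping duplicates keeps the index unique, which map() requires.
first_seq = df.drop_duplicates(subset="X1").set_index("X1")["seq"]

# Look up every X2 value; NaN and unmatched values come back as NaN
df["X3"] = df["X2"].map(first_seq)
```

Because the missing rows are NaN, X3 ends up as a float column, so the matches appear as 4.0 rather than 4. If X1 and X2 came from separate calculations, rounding both columns before matching would probably be needed.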