
How do I find the iloc of a row in a pandas DataFrame?

I have an indexed pandas DataFrame. By searching through its index, I find a row of interest. How do I find out the iloc (integer position) of this row?

Example:

import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df
                   A         B         C         D
2000-01-01 -0.077564  0.310565  1.112333  1.023472
2000-01-02 -0.377221 -0.303613 -1.593735  1.354357
2000-01-03  1.023574 -0.139773  0.736999  1.417595
2000-01-04 -0.191934  0.319612  0.606402  0.392500
2000-01-05 -0.281087 -0.273864  0.154266  0.374022
2000-01-06 -1.953963  1.429507  1.730493  0.109981
2000-01-07  0.894756 -0.315175 -0.028260 -1.232693
2000-01-08 -0.032872 -0.237807  0.705088  0.978011

window_stop_row = df[df.index < '2000-01-04'].iloc[-1]
window_stop_row.name
Timestamp('2000-01-03 00:00:00', offset='D')
# What is the iloc of window_stop_row?

Take the row's .name attribute and pass it to get_loc:

In [131]:
dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df

Out[131]:
                   A         B         C         D
2000-01-01  0.095234 -1.000863  0.899732 -1.742152
2000-01-02 -0.517544 -1.274137  1.734024 -1.369487
2000-01-03  0.134112  1.964386 -0.120282  0.573676
2000-01-04 -0.737499 -0.581444  0.528500 -0.737697
2000-01-05 -1.777800  0.795093  0.120681  0.524045
2000-01-06 -0.048432 -0.751365 -0.760417 -0.181658
2000-01-07 -0.570800  0.248608 -1.428998 -0.662014
2000-01-08 -0.147326  0.717392  3.138620  1.208639

In [133]:    
window_stop_row = df[df.index < '2000-01-04'].iloc[-1]
window_stop_row.name

Out[133]:
Timestamp('2000-01-03 00:00:00', offset='D')

In [134]:
df.index.get_loc(window_stop_row.name)

Out[134]:
2

get_loc returns the ordinal position of the label in your index, which is what you want:

In [135]:    
df.iloc[df.index.get_loc(window_stop_row.name)]

Out[135]:
A    0.134112
B    1.964386
C   -0.120282
D    0.573676
Name: 2000-01-03 00:00:00, dtype: float64
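The steps above can be condensed into one self-contained script. This is only a sketch: it swaps the random values for fixed ones (np.arange) so that the result is reproducible, and the variable names position and same_row are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Fixed values instead of random ones so the result is reproducible (an assumption
# made for this sketch; the original example uses np.random.randn).
dates = pd.date_range("2000-01-01", periods=8)
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=dates, columns=list("ABCD"))

# Last row strictly before 2000-01-04, selected by filtering on the index.
window_stop_row = df[df.index < "2000-01-04"].iloc[-1]

# .name holds the row's index label; get_loc maps that label to its position.
position = df.index.get_loc(window_stop_row.name)

# The positional lookup returns the same row as the label-based selection.
same_row = df.iloc[position].equals(window_stop_row)
print(position, same_row)
```

Note that get_loc returns a single integer only when the label is unique in the index. With duplicate labels it can return a slice or a boolean mask instead, so it is worth checking df.index.is_unique first if that is a possibility.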

If you just want to search the index, then as long as it is sorted you can use searchsorted:

In [142]:
df.index.searchsorted('2000-01-04') - 1

Out[142]:
2
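A small sketch of how the side argument of searchsorted changes the result, assuming a sorted DatetimeIndex like the one in the question. The names cutoff, last_before and last_at_or_before are made up for this illustration, and the cutoff is passed as a pd.Timestamp rather than a string to keep the types explicit.

```python
import pandas as pd

dates = pd.date_range("2000-01-01", periods=8)

# searchsorted assumes a sorted index, so check rather than assume.
assert dates.is_monotonic_increasing

cutoff = pd.Timestamp("2000-01-04")

# side="left" (the default) gives the position of the first entry >= cutoff,
# so subtracting 1 yields the last entry strictly before the cutoff.
last_before = dates.searchsorted(cutoff, side="left") - 1

# side="right" gives the position just past any entry equal to the cutoff,
# so subtracting 1 yields the last entry at or before the cutoff.
last_at_or_before = dates.searchsorted(cutoff, side="right") - 1

print(last_before, last_at_or_before)
```

If the cutoff precedes every entry in the index, the subtraction produces -1, which iloc would interpret as the last row, so that case probably needs an explicit check.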

While pandas.Index.get_loc() only works with a single key, the following pattern also gets the iloc of multiple elements:

np.argwhere(condition).flatten()   # array of all iloc where condition is True

In your case, picking the last element where df.index < '2000-01-04':

np.argwhere(df.index < '2000-01-04').flatten()[-1]  # returns 2
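Two related ways to get several positions at once, shown as a sketch rather than as part of the answer above: np.flatnonzero, which for a one-dimensional mask gives the same positions as np.argwhere(...).flatten(), and Index.get_indexer, which maps a list of labels to positions. The labels chosen here are arbitrary and include one that is deliberately missing from the index.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2000-01-01", periods=8)

# Positions of all entries that satisfy a boolean condition.
mask = dates < pd.Timestamp("2000-01-04")
positions_from_mask = np.flatnonzero(mask)

# Positions of several specific labels; a label that is not in the index maps to -1.
labels = pd.to_datetime(["2000-01-02", "2000-01-05", "2001-01-01"])
positions_from_labels = dates.get_indexer(labels)

print(positions_from_mask.tolist(), positions_from_labels.tolist())
```

get_indexer expects a unique index, which holds for the date range used here.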

You could try looping through each row in the dataframe:

    for row_number, (label, row) in enumerate(dataframe.iterrows()):
        # iterrows() yields (index label, row); enumerate supplies the position
        if row['column_header'] == YourValue:
            print(row_number)

This gives you the row number to use with iloc.
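For comparison, here is a sketch of the loop next to a vectorized equivalent, which tends to be faster on larger frames. The column name "A", the values, and the target are hypothetical and chosen only to make the example runnable.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "A" and target value 20 are assumptions for this sketch.
dates = pd.date_range("2000-01-01", periods=4)
df = pd.DataFrame({"A": [10, 20, 30, 20]}, index=dates)
target = 20

# Loop version: enumerate supplies the integer position alongside the index label.
loop_positions = [
    pos for pos, (_, row) in enumerate(df.iterrows()) if row["A"] == target
]

# Vectorized version: compare the whole column, then take the positions of True.
vector_positions = np.flatnonzero(df["A"].to_numpy() == target).tolist()

print(loop_positions, vector_positions)
```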

IIUC, you could call index in your case:

In [53]: df[df.index < '2000-01-04'].index[-1]
Out[53]: Timestamp('2000-01-03 00:00:00', offset='D') 

EDIT

I think @EdChum's answer is what you want. Alternatively, you could compare your dataframe against the row's values, use all to find the row where every value matches, and then pass the result to the index:

In [67]: df == window_stop_row
Out[67]:
                A      B      C      D
2000-01-01  False  False  False  False
2000-01-02  False  False  False  False
2000-01-03   True   True   True   True
2000-01-04  False  False  False  False
2000-01-05  False  False  False  False
2000-01-06  False  False  False  False
2000-01-07  False  False  False  False
2000-01-08  False  False  False  False

In [68]: (df == window_stop_row).all(axis=1)
Out[68]:
2000-01-01    False
2000-01-02    False
2000-01-03     True
2000-01-04    False
2000-01-05    False
2000-01-06    False
2000-01-07    False
2000-01-08    False
Freq: D, dtype: bool

In [69]: df.index[(df == window_stop_row).all(axis=1)]
Out[69]: DatetimeIndex(['2000-01-03'], dtype='datetime64[ns]', freq='D')
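The result above is an index label rather than an iloc. One way to finish the step, sketched here with fixed data in place of the random values, is to convert the boolean mask into integer positions with np.flatnonzero. The names row_mask and matching_positions are introduced for this illustration.

```python
import numpy as np
import pandas as pd

# Fixed values (an assumption for this sketch) so every row is distinct.
dates = pd.date_range("2000-01-01", periods=8)
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=dates, columns=list("ABCD"))
window_stop_row = df.iloc[2]

# True for rows whose values all equal those of window_stop_row.
row_mask = (df == window_stop_row).all(axis=1)

# Convert the boolean mask into integer positions usable with iloc.
matching_positions = np.flatnonzero(row_mask.to_numpy())
print(matching_positions.tolist())
```

Because NaN never compares equal to itself, a row containing NaN will not match under this comparison, and identical rows will all be reported.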
