How do I find the iloc of a row in a pandas DataFrame?

Question    Votes: 12    Answers: 4

I have an indexed pandas DataFrame. By searching its index, I found a row of interest. How do I find the iloc (integer position) of that row?

Example:

import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df
                   A         B         C         D
2000-01-01 -0.077564  0.310565  1.112333  1.023472
2000-01-02 -0.377221 -0.303613 -1.593735  1.354357
2000-01-03  1.023574 -0.139773  0.736999  1.417595
2000-01-04 -0.191934  0.319612  0.606402  0.392500
2000-01-05 -0.281087 -0.273864  0.154266  0.374022
2000-01-06 -1.953963  1.429507  1.730493  0.109981
2000-01-07  0.894756 -0.315175 -0.028260 -1.232693
2000-01-08 -0.032872 -0.237807  0.705088  0.978011

window_stop_row = df[df.index < '2000-01-04'].iloc[-1]
window_stop_row.name
Timestamp('2000-01-03 00:00:00', offset='D')
# what is the iloc of window_stop_row?
python pandas dataframe
4 Answers
16
votes

You want the row's .name attribute (its index label), and you pass that to get_loc:

In [131]:
dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df

Out[131]:
                   A         B         C         D
2000-01-01  0.095234 -1.000863  0.899732 -1.742152
2000-01-02 -0.517544 -1.274137  1.734024 -1.369487
2000-01-03  0.134112  1.964386 -0.120282  0.573676
2000-01-04 -0.737499 -0.581444  0.528500 -0.737697
2000-01-05 -1.777800  0.795093  0.120681  0.524045
2000-01-06 -0.048432 -0.751365 -0.760417 -0.181658
2000-01-07 -0.570800  0.248608 -1.428998 -0.662014
2000-01-08 -0.147326  0.717392  3.138620  1.208639

In [133]:    
window_stop_row = df[df.index < '2000-01-04'].iloc[-1]
window_stop_row.name

Out[133]:
Timestamp('2000-01-03 00:00:00', offset='D')

In [134]:
df.index.get_loc(window_stop_row.name)

Out[134]:
2

get_loc returns the ordinal position of the label in the index, which is what you want:

In [135]:    
df.iloc[df.index.get_loc(window_stop_row.name)]

Out[135]:
A    0.134112
B    1.964386
C   -0.120282
D    0.573676
Name: 2000-01-03 00:00:00, dtype: float64
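
The steps above can be reproduced as a minimal sketch with fixed values in place of random data (the contents of `df` here are made up for illustration). It also shows a caveat: `get_loc` returns a plain integer only when the label is unique in the index.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('2000-01-01', periods=8)
# Illustrative, deterministic data (not the random frame used above)
df = pd.DataFrame(np.arange(32).reshape(8, 4), index=dates, columns=list('ABCD'))

# Last row strictly before 2000-01-04; its .name is its index label
window_stop_row = df[df.index < pd.Timestamp('2000-01-04')].iloc[-1]
pos = df.index.get_loc(window_stop_row.name)

# Round trip: the position selects the same row through iloc
same_row = df.iloc[pos].equals(window_stop_row)

# With duplicate labels, get_loc returns a slice or a boolean mask instead
dup_sorted = pd.Index(['a', 'a', 'b']).get_loc('a')    # a slice
dup_unsorted = pd.Index(['a', 'b', 'a']).get_loc('a')  # a boolean mask
```

If the index may contain duplicates, checking `df.index.is_unique` first avoids having to handle the slice and mask cases.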

If you only want to search the index, then as long as it is sorted you can use searchsorted:

In [142]:
df.index.searchsorted('2000-01-04') - 1

Out[142]:
2
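
One edge case with `searchsorted(...) - 1`: when no label precedes the cutoff, the result is -1, which `iloc` would silently treat as the last row. A small sketch of a guard, assuming an ascending index; `last_pos_before` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

def last_pos_before(index, cutoff):
    """Position of the last label strictly before `cutoff`, or None.

    Assumes `index` is sorted ascending. Hypothetical helper, not a pandas API.
    """
    # side='left' returns the first position whose label is >= cutoff
    pos = index.searchsorted(pd.Timestamp(cutoff), side='left') - 1
    # -1 means "no earlier label"; iloc[-1] would wrongly pick the last row
    return int(pos) if pos >= 0 else None

dates = pd.date_range('2000-01-01', periods=8)
inside = last_pos_before(dates, '2000-01-04')         # cutoff is a label in the index
between = last_pos_before(dates, '2000-01-04 12:00')  # cutoff falls between labels
none_before = last_pos_before(dates, '2000-01-01')    # nothing precedes the cutoff
```

Unlike `get_loc`, this also works when the cutoff itself is not a label in the index.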

2
votes

While pandas.Index.get_loc() only works when you have a single key, the following pattern also works for getting the iloc of several elements:

np.argwhere(condition).flatten()   # array of all iloc where condition is True

In your case, to select the latest element for which df.index < '2000-01-04':

np.argwhere(df.index < '2000-01-04').flatten()[-1]  # returns 2
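
A self-contained sketch of the same idea. For a one-dimensional mask, `np.flatnonzero` gives the same positions as `np.argwhere(...).flatten()`. When you start from a list of labels rather than a condition, `Index.get_indexer` is another option. The label list `wanted` is made up for illustration.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('2000-01-01', periods=8)

# Boolean mask over the index -> all matching integer positions
mask = dates < pd.Timestamp('2000-01-04')
positions = np.flatnonzero(mask)      # same result as np.argwhere(mask).flatten()
last_position = int(positions[-1])

# Explicit labels -> positions; labels missing from the index come back as -1
wanted = pd.to_datetime(['2000-01-02', '2000-01-05', '1999-12-31'])
label_positions = dates.get_indexer(wanted)
```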

1
votes

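You could try looping over each row of the dataframe:

    for row_number, (label, row) in enumerate(dataframe.iterrows()):
        # label is the index label; row_number is the 0-based position
        if row['column_header'] == YourValue:
            print(row_number)

This gives you the row position to use with iloc. The enumerate is needed because iterrows() itself yields index labels, which equal positions only for a default 0..n-1 index.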
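
For larger frames, the same positions can be found without a Python loop. This is a minimal sketch on a hypothetical frame with a non-default index, so that labels and positions differ; `column_header` and `your_value` are placeholders, as in the loop above.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-default index, so labels != positions
dataframe = pd.DataFrame({'column_header': [5, 7, 5]}, index=[10, 20, 30])
your_value = 5

matches = (dataframe['column_header'] == your_value).to_numpy()

# Index labels of the matching rows (usable with loc)
labels = dataframe.index[matches].tolist()

# Integer positions of the matching rows (usable with iloc)
positions = np.flatnonzero(matches)
matched = dataframe.iloc[positions]
```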


1
votes

IIUC, you could call index for your case:

In [53]: df[df.index < '2000-01-04'].index[-1]
Out[53]: Timestamp('2000-01-03 00:00:00', offset='D') 

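EDIT

I think @EdChum's answer is what you want. Alternatively, you could compare the dataframe against the row you obtained, use all to find the rows where every column matches, and then pass the result to the index: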

In [67]: df == window_stop_row
Out[67]:
                A      B      C      D
2000-01-01  False  False  False  False
2000-01-02  False  False  False  False
2000-01-03   True   True   True   True
2000-01-04  False  False  False  False
2000-01-05  False  False  False  False
2000-01-06  False  False  False  False
2000-01-07  False  False  False  False
2000-01-08  False  False  False  False

In [68]: (df == window_stop_row).all(axis=1)
Out[68]:
2000-01-01    False
2000-01-02    False
2000-01-03     True
2000-01-04    False
2000-01-05    False
2000-01-06    False
2000-01-07    False
2000-01-08    False
Freq: D, dtype: bool

In [69]: df.index[(df == window_stop_row).all(axis=1)]
Out[69]: DatetimeIndex(['2000-01-03'], dtype='datetime64[ns]', freq='D')
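
Note that this yields index labels rather than integer positions. A sketch of converting the same row mask into positions follows, using fixed illustrative data. It relies on exact equality, so rows containing NaN will never match, and duplicate rows produce several positions.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('2000-01-01', periods=8)
# Illustrative, deterministic data (not the random frame used above)
df = pd.DataFrame(np.arange(32, dtype=float).reshape(8, 4),
                  index=dates, columns=list('ABCD'))
window_stop_row = df.iloc[2]

# Compare every row against the target row, column by column
row_mask = df.eq(window_stop_row, axis='columns').all(axis=1).to_numpy()

labels = df.index[row_mask]            # index labels of the matching rows
positions = np.flatnonzero(row_mask)   # integer positions, usable with iloc
```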