Python pandas dataframe - find the first occurrence that is greater than a specific value
I have a dataframe that looks like this:
print(df.head(20))
date_final Close* year
18399 1949-08-08 15.51 1949
18398 1949-08-09 15.37 1949
18397 1949-08-10 15.44 1949
18396 1949-08-11 15.39 1949
18395 1949-08-12 15.32 1949
18394 1949-08-15 15.25 1949
18393 1949-08-16 15.29 1949
18392 1949-08-17 15.46 1949
18391 1949-08-18 15.50 1949
18390 1949-08-19 15.41 1949
18389 1949-08-22 15.37 1949
18388 1949-08-23 15.17 1949
18387 1949-08-24 15.18 1949
18386 1949-08-25 15.22 1949
18385 1949-08-26 15.28 1949
18384 1949-08-29 15.12 1949
18383 1949-08-30 15.21 1949
18382 1949-08-31 15.22 1949
18381 1949-09-01 15.31 1949
18380 1949-09-02 15.29 1949
I have daily data for many years.
I want to find the min and the max (max1) per year. Then I want to find when a new max occurs, i.e. when max2 becomes max1. I want to find the row where this happens.
I am able to find the max and the min using code like the following:
df_1974.loc[[df_1974["Close*"].idxmax()]]['date_final']
df_1974.loc[[df_1974["Close*"].idxmax()]]['Close*']
but I need some help to proceed. Thanks.
I would suggest creating a DataFrame with the yearly max values, then finding the rows that match each of those maxima:
import numpy as np
import pandas as pd

df = pd.DataFrame(
data=(
('1949-08-08', 15.51),
('1949-08-09', 15.37),
('1949-08-10', 15.44),
('1949-08-11', 15.39),
('1949-08-12', 15.32),
('1949-08-15', 15.25),
('1949-08-16', 15.29),
('1949-08-17', 15.46),
('1949-08-18', 15.5),
('1949-08-19', 15.41),
('1949-08-22', 15.37),
('1949-08-23', 15.17),
('1949-08-24', 15.18),
('1949-08-25', 15.22),
('1949-08-26', 15.28),
('1949-08-29', 15.12),
('1949-08-30', 15.21),
('1949-08-31', 15.22),
('1949-09-01', 15.31),
('1949-09-02', 15.29),
),
columns=('date_final', 'Close*'),
index=range(18399, 18379, -1),
)
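# Use the parsed dates as the index and drop the original date column.
df.set_index(pd.to_datetime(df.iloc[:, 0], format='%Y-%m-%d'), inplace=True)
df.drop('date_final', axis=1, inplace=True)
# Maximum close per calendar year.
maxes = df.groupby(df.index.year).max()
# For each year, print the row(s) where the close equals that year's maximum.
for i in range(maxes.shape[0]):
    print(df[np.logical_and(
        df.index.year == maxes.index[i],
        df.iloc[:, 0] == maxes.iloc[i, 0])])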
Instead of calling print, you can use pd.concat to collect the results into a single DataFrame. Hope that helps.
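As a rough sketch of the pd.concat idea, the snippet below collects the yearly max rows into one DataFrame rather than printing them. It also shows one possible reading of the title: picking the first row whose close is strictly greater than a given value. The sample is a small slice of the data above, and the names pieces, max_rows, threshold and first_above are illustrative assumptions, not part of the original code.

```python
import pandas as pd

# A small slice of the data shown above (not the full history).
df = pd.DataFrame(
    {
        "date_final": [
            "1949-08-09", "1949-08-10", "1949-08-11", "1949-08-12",
            "1949-08-15", "1949-08-16", "1949-08-17", "1949-08-18",
        ],
        "Close*": [15.37, 15.44, 15.39, 15.32, 15.25, 15.29, 15.46, 15.50],
    }
)
df["date_final"] = pd.to_datetime(df["date_final"], format="%Y-%m-%d")
df = df.set_index("date_final")

# Yearly maximum of the close, as a Series indexed by year.
maxes = df.groupby(df.index.year)["Close*"].max()

# Collect the matching row(s) for each year instead of printing them.
pieces = [
    df[(df["Close*"] == value) & (df.index.year == year)]
    for year, value in maxes.items()
]
max_rows = pd.concat(pieces)

# First row whose close is strictly greater than a chosen value.
threshold = 15.45  # hypothetical value, for illustration only
above = df[df["Close*"] > threshold]
first_above = above.iloc[[0]] if not above.empty else above

print(max_rows)
print(first_above)
```

With this sample, max_rows holds the 1949-08-18 row and first_above holds the 1949-08-17 row. If a year's maximum occurs on several days, max_rows keeps all of them; whether that is desired depends on what "a new max" should mean for your data.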