How to return the relative index of the maximum value in a time series in pandas

Question

I have a DataFrame that records the trading volume of different stocks over a period of time. (The real dataset covers thousands of stocks and an arbitrary time period; the df here is just a simplified example.)

            d-1     d-2    d-3     d-4
00001.SH    5000    4600    4893   2321
00002.SH    2134    3456    6433   2131
00003.SH    3543    3128    5423   9642
00032.RS    3234    6432    2234   3213
00006.RS    3435    3452    1231   1229
00004.LH    3213    4232    3652   1233

I am trying to find, for each stock, the relative index of the maximum volume along the time series, so that I can rank the indices I find. To make ranking convenient, I want the index along the time series to be an integer: "d-1" is 1, "d-2" is 2, "d-3" is 3, and so on.

For example, for 00001.SH it should return 1 (d-1), and the final result should look like this:

00001.SH    1
00002.SH    3
00003.SH    4
00032.RS    2
00006.RS    2
00004.LH    2

I know this can be done with loops, but is there a more efficient way? The dataset is large, so running loops wastes a lot of time. Any help is welcome, thanks a lot!

Answer 1

Use DataFrame.idxmax with axis=1 to get the column label of each row's maximum, then extract the digits with Series.str.extract:

s = df.idxmax(axis=1).str.extract(r'(\d+)', expand=False)
print (s)
00001.SH    1
00002.SH    3
00003.SH    4
00032.RS    2
00006.RS    2
00004.LH    2
dtype: object
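
Note that the result above has dtype object, so the values are strings rather than the integers the question asks for. A minimal self-contained sketch of the same approach with a cast to int added; the DataFrame is rebuilt from the sample data in the question, and the name day_index is only illustrative:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "d-1": [5000, 2134, 3543, 3234, 3435, 3213],
        "d-2": [4600, 3456, 3128, 6432, 3452, 4232],
        "d-3": [4893, 6433, 5423, 2234, 1231, 3652],
        "d-4": [2321, 2131, 9642, 3213, 1229, 1233],
    },
    index=["00001.SH", "00002.SH", "00003.SH", "00032.RS", "00006.RS", "00004.LH"],
)

# Column label of each row's maximum, e.g. "d-1".
labels = df.idxmax(axis=1)

# Pull out the digits and convert them to integers so they can be ranked numerically.
day_index = labels.str.extract(r"(\d+)", expand=False).astype(int)

print(day_index.tolist())  # [1, 3, 4, 2, 2, 2]
```

With integer values, something like day_index.rank() should then sort numerically rather than lexically.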

Or you can first extract the digits from the column names and then use idxmax:

df.columns = df.columns.str.extract(r'(\d+)', expand=False)

# alternatively, if the columns are already in time order, number them by position
#df.columns = range(1, len(df.columns) + 1)
print (df)
             1     2     3     4
00001.SH  5000  4600  4893  2321
00002.SH  2134  3456  6433  2131
00003.SH  3543  3128  5423  9642
00032.RS  3234  6432  2234  3213
00006.RS  3435  3452  1231  1229
00004.LH  3213  4232  3652  1233

s = df.idxmax(axis=1)
print (s)
00001.SH    1
00002.SH    3
00003.SH    4
00032.RS    2
00006.RS    2
00004.LH    2
dtype: object
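
If the columns are already in time order, the position of the maximum can also be computed directly, without parsing the column names. This is a different technique from the answer above: a sketch using NumPy's argmax, assuming the column order is d-1, d-2, ... and that there are no missing values. The names positions and result are only illustrative.

```python
import pandas as pd

# Sample data copied from the question; columns are assumed to be in time order.
df = pd.DataFrame(
    {
        "d-1": [5000, 2134, 3543, 3234, 3435, 3213],
        "d-2": [4600, 3456, 3128, 6432, 3452, 4232],
        "d-3": [4893, 6433, 5423, 2234, 1231, 3652],
        "d-4": [2321, 2131, 9642, 3213, 1229, 1233],
    },
    index=["00001.SH", "00002.SH", "00003.SH", "00032.RS", "00006.RS", "00004.LH"],
)

# argmax returns the 0-based column position of each row's maximum;
# adding 1 turns it into the 1-based day index.
positions = df.to_numpy().argmax(axis=1) + 1

# Wrap the positions in a Series keyed by the stock codes.
result = pd.Series(positions, index=df.index)

print(result.tolist())  # [1, 3, 4, 2, 2, 2]
```

Because this relies on column position only, it gives the wrong day numbers if the columns are not sorted by time, so the label-based version is the safer choice when the order is not guaranteed.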

Answer 1: 2, accepted, 2020-12-21 09:32:56
