How to select the oldest records of each group in a dataframe using Python?
As an example, take a dataframe that looks like this:
date price ticker volume
0 2018-01-01 1.323 AI 2000
1 2018-01-02 1.525 AI 1500
2 2018-01-03 1.045 AI 500
3 2018-01-04 1.845 AI 600
4 2018-01-05 1.045 AI 500
5 2018-01-02 1.446 BOC 550
6 2018-01-03 2.110 BOC 3201
7 2018-01-04 2.150 BOC 5200
8 2018-01-05 2.810 BOC 1980
9 2018-01-03 5.199 CAT 2000
10 2018-01-06 4.980 CAT 450
11 2018-01-07 4.990 CAT 3000
This is a very basic question, so please bear with me: how can I select, for each ticker, the two rows with the oldest dates, to get a dataframe like the one below?
date price ticker volume
0 2018-01-01 1.323 AI 2000
1 2018-01-02 1.525 AI 1500
5 2018-01-02 1.446 BOC 550
6 2018-01-03 2.110 BOC 3201
9 2018-01-03 5.199 CAT 2000
10 2018-01-06 4.980 CAT 450
In pandas, you can use groupby to group values. Chaining head onto the groupby selects the first rows of each group. So, in your case, to get the two oldest rows for each ticker, sort by date first and then take the first two rows of each group:
df.sort_values('date').groupby('ticker').head(2)
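For reference, here is a minimal, self-contained sketch of that approach using the sample data from the question. It assumes the date column should be parsed with pd.to_datetime so that the sort is chronological. The variable name oldest_two and the final sort_index() call, which restores the original row order, are illustrative additions and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": [
        "2018-01-01", "2018-01-02", "2018-01-03", "2018-01-04", "2018-01-05",
        "2018-01-02", "2018-01-03", "2018-01-04", "2018-01-05",
        "2018-01-03", "2018-01-06", "2018-01-07",
    ],
    "price": [
        1.323, 1.525, 1.045, 1.845, 1.045,
        1.446, 2.110, 2.150, 2.810,
        5.199, 4.980, 4.990,
    ],
    "ticker": ["AI"] * 5 + ["BOC"] * 4 + ["CAT"] * 3,
    "volume": [
        2000, 1500, 500, 600, 500,
        550, 3201, 5200, 1980,
        2000, 450, 3000,
    ],
})

# Assumption: parse the strings as real dates so sorting is chronological.
df["date"] = pd.to_datetime(df["date"])

oldest_two = (
    df.sort_values("date")   # oldest rows first
      .groupby("ticker")     # one group per ticker
      .head(2)               # keep the first two rows of each group
      .sort_index()          # optional: restore the original row order
)

print(oldest_two)
```

Without the sort_index() step the result comes back ordered by date rather than grouped by ticker, which may or may not matter for your use case.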