
Adding a column after creating a new grouped dataframe

I have a large dataframe (sample printed below). It has Date, Time, High, and Low columns, with one row for every 5 minutes.

What I'm trying to do is find the max of the High column for each day and return the Date, Time, and High for that row. The sample below shows just a single day. The first problem I had to solve was finding the 'High' for each day, since there are multiple rows with an identical 'Date' but different 'Time' and 'High' values. The solution I came to was to create another dataframe (more below).

        Date   Time   Ticker     Open     High      Low    Close
0     6/3/19   7:05  USD/JPY  108.370  108.370  108.345  108.345
1     6/3/19   7:10  USD/JPY  108.345  108.345  108.325  108.325
2     6/3/19   7:15  USD/JPY  108.330  108.360  108.330  108.340
3     6/3/19   7:20  USD/JPY  108.335  108.335  108.295  108.305
4     6/3/19   7:25  USD/JPY  108.305  108.305  108.270  108.305
5     6/3/19   7:30  USD/JPY  108.300  108.300  108.250  108.260
6     6/3/19   7:35  USD/JPY  108.265  108.295  108.265  108.290
7     6/3/19   7:40  USD/JPY  108.275  108.290  108.250  108.290
8     6/3/19   7:45  USD/JPY  108.285  108.290  108.275  108.290
9     6/3/19   7:50  USD/JPY  108.295  108.350  108.295  108.350
10    6/3/19   7:55  USD/JPY  108.355  108.355  108.325  108.330
11    6/3/19   8:00  USD/JPY  108.335  108.360  108.325  108.350

I tried the groupby function to write to a new dataframe. First I grouped by Date and applied max. This gave me the max for each date:

       Date     High
0   6/10/19  108.670
1   6/11/19  108.800
2   6/12/19  108.545
3   6/13/19  108.535
4   6/14/19  108.500
5   6/17/19  108.690
6   6/18/19  108.675
7   6/19/19  108.495
8   6/20/19  107.760
9   6/21/19  107.735
10  6/24/19  107.530
11   6/3/19  108.445
12   6/4/19  108.355
13   6/5/19  108.340
14   6/6/19  108.330
15   6/7/19  108.500

But I also want to see the 'Time' value for when that max occurred on that date. How can I pass this in?

Example of desired output

Date       Time     High
6/10/19    9:05     108.670
6/11/19    11:35    108.800

import pandas as pd

df = pd.read_csv("~/Downloads/file.csv", encoding="ISO-8859-1")

# High grouped by Date
df2 = df.groupby('Date', as_index=False)['High'].max()

I've tried

df2 = df.groupby('Date','Time' as_index= False)['High'].max()

But I receive this error:

df2 = df.groupby('Date','Time' as_index= False)['High'].max()
                                      ^

SyntaxError: invalid syntax

I would just like a dataframe that shows the Date, Time, and High for the row where the High column was at its maximum on each day.

      Date     High   TIME????????????????????
0   6/10/19  108.670
1   6/11/19  108.800
2   6/12/19  108.545
3   6/13/19  108.535
4   6/14/19  108.500
5   6/17/19  108.690
6   6/18/19  108.675
7   6/19/19  108.495
8   6/20/19  107.760
9   6/21/19  107.735
10  6/24/19  107.530
11   6/3/19  108.445
12   6/4/19  108.355
13   6/5/19  108.340
14   6/6/19  108.330
15   6/7/19  108.500

To illustrate the groupby, I changed the Date column slightly, as follows:

      Date  Time   Ticker     Open     High      Low    Close
0   6/3/19  7:05  USD/JPY  108.370  108.370  108.345  108.345
1   6/3/19  7:10  USD/JPY  108.345  108.345  108.325  108.325
2   6/3/19  7:15  USD/JPY  108.330  108.360  108.330  108.340
3   6/4/19  7:20  USD/JPY  108.335  108.335  108.295  108.305
4   6/4/19  7:25  USD/JPY  108.305  108.305  108.270  108.305
5   6/4/19  7:30  USD/JPY  108.300  108.300  108.250  108.260
6   6/5/19  7:35  USD/JPY  108.265  108.295  108.265  108.290
7   6/5/19  7:40  USD/JPY  108.275  108.290  108.250  108.290
8   6/5/19  7:45  USD/JPY  108.285  108.290  108.275  108.290
9   6/6/19  7:50  USD/JPY  108.295  108.350  108.295  108.350
10  6/6/19  7:55  USD/JPY  108.355  108.355  108.325  108.330
11  6/6/19  8:00  USD/JPY  108.335  108.360  108.325  108.350

You could try:

df.loc[df.groupby('Date')['High'].idxmax()]

which will give you:

      Date  Time   Ticker     Open     High      Low    Close
0   6/3/19  7:05  USD/JPY  108.370  108.370  108.345  108.345
3   6/4/19  7:20  USD/JPY  108.335  108.335  108.295  108.305
6   6/5/19  7:35  USD/JPY  108.265  108.295  108.265  108.290
11  6/6/19  8:00  USD/JPY  108.335  108.360  108.325  108.350

Then drop any columns you don't want.
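
As a minimal sketch of the whole approach, the snippet below rebuilds the illustration data (Date, Time, and High only) and keeps just those three columns. The names `idx` and `daily_high` are made up for this example. Note that `idxmax` returns the first matching row when a day has ties.

```python
import pandas as pd

# Sample rows copied from the illustration above (Date, Time, High only).
df = pd.DataFrame({
    "Date": ["6/3/19"] * 3 + ["6/4/19"] * 3 + ["6/5/19"] * 3 + ["6/6/19"] * 3,
    "Time": ["7:05", "7:10", "7:15", "7:20", "7:25", "7:30",
             "7:35", "7:40", "7:45", "7:50", "7:55", "8:00"],
    "High": [108.370, 108.345, 108.360, 108.335, 108.305, 108.300,
             108.295, 108.290, 108.290, 108.350, 108.355, 108.360],
})

# For each Date, get the index label of the first row with the largest High.
idx = df.groupby("Date")["High"].idxmax()

# Look those rows up and keep only the columns of interest.
daily_high = df.loc[idx, ["Date", "Time", "High"]].reset_index(drop=True)
print(daily_high)
```

Selecting the columns inside `.loc` avoids a separate drop step. `reset_index(drop=True)` is optional and only renumbers the rows from 0.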
