
I would like to return the maximum value of a column

My data has three columns: wind speed, wind direction, and a time index. I want to return the daily maximum wind speed, together with the time at which it occurred and the wind direction at that time. I used the command below:

df['max_day']=df.wind.resample('1D').max()

but the result is always timestamped at 00:00.

Here's a sample of the data:

time                vento10m_azul  dir
2019-01-01 1:00:00  7.4527917      84.17657707
2019-01-01 2:00:00  7.571505       82.76253884
2019-01-01 3:00:00  7.529691       78.80457605
2019-01-01 4:00:00  7.2273316      76.08609884
2019-01-01 5:00:00  6.985468       75.99220721
2019-01-02 0:00:00  5.5748515      76.23670838
2019-01-02 1:00:00  5.1289306      66.44264187
2019-01-02 2:00:00  4.63257        57.76554662
2019-01-02 3:00:00  4.036444       48.3211454
2019-01-02 4:00:00  3.26109        47.26135372
2019-01-02 5:00:00  2.6211443      53.60521783

A fuller one-month sample is available at this link:

https://drive.google.com/open?id=133E7xA3h5StVjlgVqqnfwFmRTFR2HcUE

First, load the CSV file and convert the time field to a datetime format:

import pandas as pd
df = pd.read_csv("my_date.csv")
df["time"] = pd.to_datetime(df.time)
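
As a rough illustration of the loading step, the sketch below parses a few rows from the question's sample out of an in-memory string instead of a file. The comma-separated layout and the names csv_text and sample are assumptions made for the example; the real file may use a different delimiter.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for the CSV file; the comma delimiter is an assumption.
csv_text = """time,vento10m_azul,dir
2019-01-01 1:00:00,7.4527917,84.17657707
2019-01-01 2:00:00,7.571505,82.76253884
2019-01-02 0:00:00,5.5748515,76.23670838
"""

sample = pd.read_csv(io.StringIO(csv_text))
sample["time"] = pd.to_datetime(sample["time"])

# After conversion the .dt accessor is available, e.g. to extract the calendar date.
print(sample["time"].dt.date.unique())
```

Once the column has a datetime dtype, grouping by df.time.dt.date works as in the next step.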

Next, calculate the maximum speed for each day by grouping the data by date, taking the max, and renaming columns appropriately:

max_speed = df.groupby(df.time.dt.date)["vento10m_azul"].max().reset_index().rename(columns={"time": "date", "vento10m_azul": "max_vento10m_azul"})

Finally, merge the dataframe containing maximum speed information with the original dataframe containing all the wind speed data. Keep only the rows with values equal to the maximum, and drop other unnecessary columns.

df["date"] = df.time.dt.date
df_x = df.merge(max_speed, on="date")
df_x = df_x[df_x["vento10m_azul"] == df_x["max_vento10m_azul"]]
df_x = df_x[["time", "vento10m_azul", "dir"]]
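
The group, merge, and filter steps can be checked end to end on a handful of rows. The following is a minimal sketch, assuming the column names from the question and using six rows copied from its sample; daily_max, merged, and result are names chosen for the example. Unlike the snippet above, it keeps the dir column so the direction at the time of the maximum is reported.

```python
import pandas as pd

# Rows copied from the question's sample; the full file is assumed to have the same columns.
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2019-01-01 1:00:00", "2019-01-01 2:00:00", "2019-01-01 3:00:00",
        "2019-01-02 0:00:00", "2019-01-02 1:00:00", "2019-01-02 2:00:00",
    ]),
    "vento10m_azul": [7.4527917, 7.571505, 7.529691, 5.5748515, 5.1289306, 4.63257],
    "dir": [84.17657707, 82.76253884, 78.80457605, 76.23670838, 66.44264187, 57.76554662],
})

# Daily maximum speed, one row per calendar date.
df["date"] = df["time"].dt.date
daily_max = (
    df.groupby("date")["vento10m_azul"].max()
      .rename("max_vento10m_azul")
      .reset_index()
)

# Attach each day's maximum to every row, then keep only the rows that reach it.
merged = df.merge(daily_max, on="date")
result = merged.loc[
    merged["vento10m_azul"] == merged["max_vento10m_azul"],
    ["time", "vento10m_azul", "dir"],
].reset_index(drop=True)
print(result)
```

Because the filter is an equality test, a day on which two readings tie for the maximum yields two rows.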

Try doing this:

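df['time'] = pd.to_datetime(df['time'])
# idxmax returns index labels, so select with .loc rather than .iloc
df = df.loc[df.groupby(pd.Grouper(key='time', freq='1D'))['vento10m_azul'].idxmax()]
# keep only the date (this discards the hour of the maximum)
df['time'] = df['time'].dt.date
df = df.reset_index(drop=True)
print(df)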

Output:

         time  vento10m_azul         dir
0   2019-01-01       7.571505   82.762539
1   2019-01-02       6.582745   43.261218
2   2019-01-03       7.914436   26.962216
3   2019-01-04       8.309497  354.637982
4   2019-01-05       9.034869  143.472224
5   2019-01-06       6.909633  113.542660
6   2019-01-07       8.210649   23.854406
7   2019-01-08       8.628985   29.572357
8   2019-01-09       9.898343   64.477980
9   2019-01-10      10.570002   49.819634
10  2019-01-11       5.311725   27.333261
11  2019-01-12       4.922985   79.928011
12  2019-01-13       7.385470   63.877019
13  2019-01-14       8.799546   40.721517
14  2019-01-15       7.766147   51.942357
15  2019-01-16       8.430967  295.331752
16  2019-01-17       7.590732    4.340045
17  2019-01-18       5.254148   96.465752
18  2019-01-19       4.975754   13.093988
19  2019-01-20       8.721619  178.418132
20  2019-01-21       2.412958   78.999404
21  2019-01-22       7.567795  127.181465
22  2019-01-23       6.668825  106.142476
23  2019-01-24       7.524504  142.564668
24  2019-01-25       7.676533   52.388050
25  2019-01-26       7.374160   47.992977
26  2019-01-27      10.085866   45.983522
27  2019-01-28       8.340270   50.408780
28  2019-01-29       6.613598   61.931717
29  2019-01-30       6.229586   58.925196
30  2019-01-31       5.741903   47.251849
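
The output above shows only the date, whereas the question also asks for the time of the maximum. Here is a minimal sketch of a variant that keeps the full timestamp, again using six rows copied from the question's sample; idx, daily, and resampled are names chosen for the example. It also shows why the original attempt came back at 00:00: resample labels each daily bin with the start of the day, not with the time of the maximum.

```python
import pandas as pd

# Rows copied from the question's sample; the full file is assumed to have the same columns.
df = pd.DataFrame({
    "time": pd.to_datetime([
        "2019-01-01 1:00:00", "2019-01-01 2:00:00", "2019-01-01 3:00:00",
        "2019-01-02 0:00:00", "2019-01-02 1:00:00", "2019-01-02 2:00:00",
    ]),
    "vento10m_azul": [7.4527917, 7.571505, 7.529691, 5.5748515, 5.1289306, 4.63257],
    "dir": [84.17657707, 82.76253884, 78.80457605, 76.23670838, 66.44264187, 57.76554662],
})

# resample(...).max() returns one value per day, labelled with the bin start (midnight).
resampled = df.set_index("time")["vento10m_azul"].resample("1D").max()
print(resampled)

# idxmax gives the row label of each day's maximum; .loc then returns the whole row,
# including the original timestamp and the direction.
idx = df.groupby(df["time"].dt.date)["vento10m_azul"].idxmax()
daily = df.loc[idx].reset_index(drop=True)
print(daily)
```

When several readings in a day share the maximum, idxmax reports only the first of them.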

