
Pandas: How to sort dataframe rows by date of one column

I have two different dataframes that I concatenated. All columns are the same; however, the date column contains many different dates in M/D/YYYY format.

[image] The dates in the dataframe get shuffled around later in the sequence.

Is there a way to keep the whole dataframe and just sort the rows by the dates in the date column? I also want to keep the format the dates are in.

So basically:

date        people
6/8/2015    1
7/10/2018   2
6/5/2015    0

gets converted into:

date          people
6/5/2015      0
6/8/2015      1
7/10/2018     2

Thank you!

PS: I've tried the options in the other post on this, but they do not work.

To elaborate on what can be done: initialize (or merge) the dataframe and convert the column to datetime type.

import pandas as pd

df = pd.DataFrame({'date': ['6/8/2015', '7/10/2018', '6/5/2015'], 'people': [1, 2, 0]})
# Parse the M/D/YYYY strings into datetime64 values
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y')
print(df)

Output:

        date  people
0 2015-06-08       1
1 2018-07-10       2
2 2015-06-05       0

Sort by date:

df = df.sort_values('date')
print(df)

Output:

        date  people
2 2015-06-05       0
0 2015-06-08       1
1 2018-07-10       2
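
The sorted rows keep their original index labels (2, 0, 1). If a fresh 0..n-1 index is wanted after sorting, a minimal sketch (reusing the sample data from this answer; the name `df_sorted` is only illustrative) could be:

```python
import pandas as pd

df = pd.DataFrame({'date': ['6/8/2015', '7/10/2018', '6/5/2015'], 'people': [1, 2, 0]})
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y')

# Sort by date, then discard the old index labels and renumber from 0
df_sorted = df.sort_values('date').reset_index(drop=True)
print(df_sorted)
```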

Convert the dates back to strings in the original month/day/year order:

df['date'] = df['date'].dt.strftime('%m/%d/%Y')
print(df)

Output:

         date  people
2  06/05/2015       0
0  06/08/2015       1
1  07/10/2018       2
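
Note that the round trip through strftime zero-pads the month and day (6/5/2015 becomes 06/05/2015). If the original strings have to survive untouched, one option is to parse them only for the comparison. This is a sketch that assumes pandas 1.1 or later, where `sort_values` accepts a `key` argument; `sorted_df` is just an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({'date': ['6/8/2015', '7/10/2018', '6/5/2015'], 'people': [1, 2, 0]})

# The key function is applied to the column only to decide the order;
# the stored strings in 'date' are left exactly as they were.
sorted_df = df.sort_values(
    'date',
    key=lambda col: pd.to_datetime(col, format='%m/%d/%Y'),
)
print(sorted_df)
```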

Try converting the 'date' column to pandas datetime and then sort:

import pandas as pd

df = pd.DataFrame({'date': ['4/12/1961', '5/5/1961', '7/21/1961', '8/6/1961'],
                   'people': [1, 1, 1, 2]})
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y')
# sort_values returns a new dataframe, so assign the result
df = df.sort_values(by='date')
print(df)

Output:

        date  people
0 1961-04-12       1
1 1961-05-05       1
2 1961-07-21       1
3 1961-08-06       2

To get back the initial format with a four-digit year (use %Y; %y would give a two-digit year such as 61):

df['date'] = df['date'].dt.strftime('%m/%d/%Y')

Output:

         date  people
0  04/12/1961       1
1  05/05/1961       1
2  07/21/1961       1
3  08/06/1961       2
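
strftime always zero-pads the month and day, so the result is not quite the M/D/YYYY style from the question. A portable sketch for rebuilding unpadded strings from the datetime components (the names `parsed` and `unpadded` are only illustrative) might look like this:

```python
import pandas as pd

df = pd.DataFrame({'date': ['4/12/1961', '5/5/1961', '7/21/1961', '8/6/1961'],
                   'people': [1, 1, 1, 2]})
parsed = pd.to_datetime(df['date'], format='%m/%d/%Y')

# Build "M/D/YYYY" from the integer components, which carry no zero padding
unpadded = (parsed.dt.month.astype(str) + '/'
            + parsed.dt.day.astype(str) + '/'
            + parsed.dt.year.astype(str))
print(unpadded.tolist())
```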

I'm not sure exactly what you want, but if you just want the people that belong to each date, simply use groupby.

df = df.groupby('date').sum()

Or a different groupby that collects the values for each date into a list:

df = df.groupby('date').agg(lambda col: col.tolist()).reset_index()

Then you can sort it as you want. This may be what you're looking for: Sort Pandas Dataframe by Date
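
As a rough sketch of the groupby idea, assuming the date column is converted to datetime first so that groups are ordered chronologically rather than as strings (the extra duplicate row and the name `totals` are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({'date': ['6/8/2015', '7/10/2018', '6/5/2015', '6/8/2015'],
                   'people': [1, 2, 0, 3]})
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y')

# groupby sorts the group keys by default, so the totals come out in date order
totals = df.groupby('date', as_index=False)['people'].sum()
print(totals)
```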

Why not simply:

dataset[SortBy["date"]]

Can you provide what you tried, or what your structure looks like?

In case you need to sort in reverse order, do:

dataset[SortBy["date"]][Reverse]
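
`SortBy` and `Reverse` are not pandas functions (the syntax resembles a Wolfram Language Dataset query, though that is only a guess). In pandas, a reverse-order sort could be sketched as follows, with `newest_first` as an illustrative name:

```python
import pandas as pd

df = pd.DataFrame({'date': ['6/8/2015', '7/10/2018', '6/5/2015'], 'people': [1, 2, 0]})
df['date'] = pd.to_datetime(df['date'], format='%m/%d/%Y')

# ascending=False puts the most recent date first
newest_first = df.sort_values('date', ascending=False)
print(newest_first)
```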
