I am seeing seemingly random sort results for a dataframe that I intend to sort by date in ascending order. Across multiple runs, most runs return the correct result, but a small number of runs return an incorrect result.
records_df = records_df.groupby(['YEAR','QUARTER','SUPPLIER_ID']).TRANSACTION_DATES.agg({'TRANSACTION_DATES' : lambda x: list(x.unique())}).reset_index()
# This now sorts in date order
records_df.sort_values(by=['TRANSACTION_DATES'])
For most runs: TRANSACTION_DATES: [05-Sep-17, 06-Sep-17, 07-Sep-17]
For some runs, incorrect results are seen:
TRANSACTION_DATES: [06-Sep-17, 07-Sep-17, 05-Sep-17]
Why does this happen when I am already enforcing a sort with sort_values?
I think your problem is that you are calling sort_values without assigning the result or using the inplace argument. By default, sort_values returns a new, sorted dataframe and leaves the original untouched, so your sorted dataframe is discarded and is not stored anywhere.
So try:
records_df = records_df.sort_values(by=['TRANSACTION_DATES'])
or
records_df.sort_values(by=['TRANSACTION_DATES'], inplace=True)
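To see the difference, here is a minimal sketch using made-up sample data (the small `df` below is hypothetical; only the column name is taken from the question). It shows that a bare call leaves the dataframe unchanged, while assigning the result keeps the sorted order:

```python
import pandas as pd

# Hypothetical sample data; the column name mirrors the question.
df = pd.DataFrame({
    "TRANSACTION_DATES": pd.to_datetime(["2017-09-06", "2017-09-07", "2017-09-05"])
})

# A bare call returns a sorted copy that is thrown away; df is unchanged.
df.sort_values(by=["TRANSACTION_DATES"])
unchanged = df["TRANSACTION_DATES"].tolist()

# Assigning the result keeps the sorted dataframe.
sorted_df = df.sort_values(by=["TRANSACTION_DATES"])
```

After this runs, `unchanged` still starts with 6 September, while `sorted_df` is in ascending date order.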
For reference, the sort_values docs:
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sort_values.html
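One more thing that may be worth checking, and this is an assumption on my part about what you want: the output in the question shows a list of dates inside a single cell. sort_values orders the rows of the dataframe, not the items inside each list, and unique() keeps values in their order of appearance. If the goal is for each list to be in date order, a possible sketch is to parse the strings as dates and sort within each group before building the list. The sample rows and the `%d-%b-%y` format below are guesses based on the values shown in the question:

```python
import pandas as pd

# Hypothetical sample rows; column names mirror the question.
raw = pd.DataFrame({
    "YEAR": [2017, 2017, 2017],
    "QUARTER": [3, 3, 3],
    "SUPPLIER_ID": [1, 1, 1],
    "TRANSACTION_DATES": ["06-Sep-17", "07-Sep-17", "05-Sep-17"],
})

# Parse the strings so they compare as dates rather than as text.
# The format is an assumption based on values like "05-Sep-17".
raw["TRANSACTION_DATES"] = pd.to_datetime(raw["TRANSACTION_DATES"], format="%d-%b-%y")

# Sort the unique dates inside each group before building the list.
grouped = (
    raw.groupby(["YEAR", "QUARTER", "SUPPLIER_ID"])["TRANSACTION_DATES"]
    .agg(lambda x: sorted(x.unique()))
    .reset_index()
)
dates = grouped.loc[0, "TRANSACTION_DATES"]
```

Here `dates` holds the three dates for the single group in ascending order, regardless of the order in which the rows arrived.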