
How to sort a single line of a Pandas data frame

I have the following data frame:

Client  Date        Value_1  Value_2  Value_3   Apple     Pear    Kiwi    Banana
ABC     2016-02-16  94       373      183       1739      38      19      73
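
For anyone who wants to try the answers, the frame above can be rebuilt roughly as follows. This is only a sketch: it assumes that Date is stored as a plain string and that all other values are integers, neither of which the question states.

```python
import pandas as pd

# Rebuild the single-row frame from the question.
# Assumption: Date is kept as a string and the remaining values are integers.
df = pd.DataFrame(
    [["ABC", "2016-02-16", 94, 373, 183, 1739, 38, 19, 73]],
    columns=["Client", "Date", "Value_1", "Value_2", "Value_3",
             "Apple", "Pear", "Kiwi", "Banana"],
)
print(df)
```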

The Client, Date, Value_1 and Value_2 column headers are static, but the values in these columns can change.

The Apple, Pear, Kiwi and Banana column headers are dynamic, and the values in these columns can change as well.

I'd like to be able to order the data frame such that the dynamic columns (to the right of the "Value" columns) are sorted by their values, highest to lowest, as follows:

Client  Date        Value_1  Value_2  Value_3   Apple     Banana   Pear     Kiwi
ABC     2016-02-16  94       373      183       1739      73       38       19

I tried the following code:

new_df = df.columns[5:].sort_values(ascending=False)

But that just sorts the column headers themselves, not the values in those columns.
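
A small sketch of what that attempt returns (the frame is rebuilt here so the snippet runs on its own, assuming integer values and a string Date): the result is an Index of labels in reverse alphabetical order, and the data itself is never consulted.

```python
import pandas as pd

# Assumed reconstruction of the frame from the question.
df = pd.DataFrame(
    [["ABC", "2016-02-16", 94, 373, 183, 1739, 38, 19, 73]],
    columns=["Client", "Date", "Value_1", "Value_2", "Value_3",
             "Apple", "Pear", "Kiwi", "Banana"],
)

# df.columns[5:] is an Index of labels, so sort_values() orders the labels only.
new_df = df.columns[5:].sort_values(ascending=False)
print(type(new_df).__name__)  # Index
print(list(new_df))           # ['Pear', 'Kiwi', 'Banana', 'Apple']
```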

Does anyone know how to accomplish this?

Thanks!

You can use a custom function:

cols = [col for col in df.columns if not col.startswith('Color')]
print (cols)
['Client', 'Date', 'Value_1', 'Value_2', 'Value_3']

def f(x):
    # sort the row in descending order; each label stays attached to its value
    # note: with several rows pandas realigns the columns, so this is meant for a single-row frame
    return x.sort_values(ascending=False)

df = df.set_index(cols).apply(f, axis=1).reset_index()
print (df)
  Client        Date  Value_1  Value_2  Value_3  Color_1  Color_4  Color_2  \
0    ABC  2016-02-16       94      373      183     1739       73       38   

   Color_3  
0       19  

Another solution:

#select all values from position 5 onward as a Series (.ix was removed in pandas 1.0, so use .iloc)
x = df.iloc[0, 5:]
print (x)
Color_1    1739
Color_2      38
Color_3      19
Color_4      73
Name: 0, dtype: object

#create a DataFrame from the sorted values, using the labels in the same sorted order as columns
s = x.sort_values(ascending=False)
a = pd.DataFrame([s.values], columns=s.index)
print (a)
   Color_1  Color_4  Color_2  Color_3
0     1739       73       38       19

#concat to original
df = pd.concat([df[df.columns[:5]], a], axis=1)
print (df)
  Client        Date  Value_1  Value_2  Value_3  Color_1  Color_4  Color_2  \
0    ABC  2016-02-16       94      373      183     1739       73       38   

   Color_3  
0       19  

EDIT for the changed question:

x = df.iloc[:, 5:].sort_values(by=0, ascending=False, axis=1)
print (x)
   Apple  Banana  Pear  Kiwi
0   1739      73    38    19

df = pd.concat([df.iloc[:, :5], x], axis=1)
print (df)
  Client        Date  Value_1  Value_2  Value_3  Apple  Banana  Pear  Kiwi
0    ABC  2016-02-16       94      373      183   1739      73    38    19
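
The same idea can be wrapped in a small helper. This is a sketch, not part of the original answer: the function name `sort_trailing_columns` and its parameters `n_static` and `row` are hypothetical, and it uses `.iloc` because `.ix` was removed in pandas 1.0.

```python
import pandas as pd


def sort_trailing_columns(df, n_static=5, row=0):
    """Reorder the columns after the first n_static by one row's values, descending.

    Hypothetical helper; the name and parameters are illustrative only.
    """
    static = df.iloc[:, :n_static]
    # axis=1 sorts the columns; `by` is the index label of the row to sort on
    dynamic = df.iloc[:, n_static:].sort_values(
        by=df.index[row], axis=1, ascending=False
    )
    return pd.concat([static, dynamic], axis=1)


# Assumed reconstruction of the frame from the question.
df = pd.DataFrame(
    [["ABC", "2016-02-16", 94, 373, 183, 1739, 38, 19, 73]],
    columns=["Client", "Date", "Value_1", "Value_2", "Value_3",
             "Apple", "Pear", "Kiwi", "Banana"],
)
result = sort_trailing_columns(df)
print(result)
```

With more than one row, only the chosen row decides the column order; the other rows are carried along unchanged.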

You can also use numpy to sort them.

import pandas as pd, numpy as np

# Set up the test data
df = pd.DataFrame(np.ceil(np.random.rand(1,10)*1000))
values = ["Value_"+str(i) for i in range(5)] 
colors = ["Color_"+str(i) for i in range(5)]
df.columns = values + colors

# Order
idx = np.argsort(df[df.columns[5:]].values)[0]
# Reverse (descending order)
ridx = idx[::-1]

df[df.columns[5:][ridx]]
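
The snippet above returns only the sorted columns. A possible way to put the full frame back together is sketched below, using the fruit columns from the question instead of random data. It assumes five static columns and integer values, and it sorts the negated values rather than reversing the ascending order, which keeps tied columns in their original relative order.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the frame from the question.
df = pd.DataFrame(
    [["ABC", "2016-02-16", 94, 373, 183, 1739, 38, 19, 73]],
    columns=["Client", "Date", "Value_1", "Value_2", "Value_3",
             "Apple", "Pear", "Kiwi", "Banana"],
)

n_static = 5  # assumption: the first five columns never move
dynamic_cols = df.columns[n_static:]

# Positions of the first row's values in descending order
order = np.argsort(-df[dynamic_cols].to_numpy()[0], kind="stable")

# Static columns first, then the dynamic ones in their new order
new_columns = list(df.columns[:n_static]) + list(dynamic_cols[order])
result = df[new_columns]
print(result)
```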

You need to create a new order for your columns:

# pair each dynamic column name with its (single) value, then sort by value, descending
pairs = sorted([(i, int(df[i].iloc[0])) for i in df.columns[5:]], key=lambda x: x[1], reverse=True)
# zip(*pairs) unzips the pairs; the first element holds the column names
order = list(df.columns[:5]) + list(list(zip(*pairs))[0])

Here the column names are paired with the column values, and a sort is then applied. The zip(*pairs) call unzips the sorted list so that only the column names are kept. Then apply this to your dataframe:

print(df[order])

  Client        Date  Value_1  Value_2  Value_3  Apple  Banana  Pear  Kiwi
0    ABC  2016-02-16       94      373      183   1739      73    38    19
