I have the following data frame:
Client Date Value_1 Value_2 Value_3 Apple Pear Kiwi Banana
ABC 2016-02-16 94 373 183 1739 38 19 73
The Client, Date, Value_1 and Value_2 column headers are static, but the values in these columns can change.
The Apple, Pear, Kiwi and Banana column headers are dynamic. The values in these columns can change.
I'd like to be able to order the data frame such that the "color" columns (to the right of the "value" columns) are sorted highest to lowest, as follows:
Client Date Value_1 Value_2 Value_3 Apple Banana Pear Kiwi
ABC 2016-02-16 94 373 183 1739 73 38 19
I tried the following code:
new_df = df.columns[5:].sort_values(ascending=False)
But that just sorts the column headers themselves, not the values in those columns.
Does anyone know how to accomplish this?
Thanks!
You can use a custom function:
cols = [col for col in df.columns if not col.startswith('Color')]
print (cols)
['Client', 'Date', 'Value_1', 'Value_2', 'Value_3']
def f(x):
    return pd.Series(x.sort_values(ascending=False).values, index=x.sort_values().index)
df = df.set_index(cols).apply(f, axis=1).reset_index()
print (df)
Client Date Value_1 Value_2 Value_3 Color_3 Color_2 Color_4 Color_1
0 ABC 2016-02-16 94 373 183 1739 73 38 19
Another solution:
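# select all values of the first row from position 5 onward as a Series
# (.ix has been removed from pandas; .iloc is the positional equivalent)
x = df.iloc[0, 5:]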
print (x)
Color_1 1739
Color_2 38
Color_3 19
Color_4 73
Name: 0, dtype: object
# create a DataFrame from the sorted values and the sorted index of Series x
a = pd.DataFrame([x.sort_values(ascending=False).values], columns=x.sort_values().index)
print (a)
Color_3 Color_2 Color_4 Color_1
0 1739 73 38 19
#concat to original
df = pd.concat([df[df.columns[:5]], a], axis=1)
print (df)
Client Date Value_1 Value_2 Value_3 Color_3 Color_2 Color_4 Color_1
0 ABC 2016-02-16 94 373 183 1739 73 38 19
EDIT for the changed question:
x = df.iloc[:, 5:].sort_values(by=0, ascending=False, axis=1)
print (x)
Apple Banana Pear Kiwi
0 1739 73 38 19
df = pd.concat([df.iloc[:, :5], x], axis=1)
print (df)
Client Date Value_1 Value_2 Value_3 Apple Banana Pear Kiwi
0 ABC 2016-02-16 94 373 183 1739 73 38 19
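For reference, the approach from the edit can be put together as one self-contained sketch. It rebuilds the sample row from the question and assumes the five static columns always come first; the names n_static, dynamic and result are only illustrative.

```python
import pandas as pd

# Sample frame taken from the question: five static columns, four dynamic ones.
df = pd.DataFrame(
    [["ABC", "2016-02-16", 94, 373, 183, 1739, 38, 19, 73]],
    columns=["Client", "Date", "Value_1", "Value_2", "Value_3",
             "Apple", "Pear", "Kiwi", "Banana"],
)

n_static = 5  # assumption: the static columns always occupy the first five positions

# Reorder the dynamic columns by the values in the row labelled 0, largest first.
dynamic = df.iloc[:, n_static:].sort_values(by=0, ascending=False, axis=1)

# Put the untouched static block and the reordered dynamic block side by side.
result = pd.concat([df.iloc[:, :n_static], dynamic], axis=1)
print(result.columns.tolist())
```

Because sort_values moves whole columns when axis=1 is given, each header stays attached to its own value.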
You can also use numpy to sort them.
import pandas as pd, numpy as np
# Set up the test data
df = pd.DataFrame(np.ceil(np.random.rand(1,10)*1000))
values = ["Value_"+str(i) for i in range(5)]
colors = ["Color_"+str(i) for i in range(5)]
df.columns = values + colors
# Order
idx = np.argsort(df[df.columns[5:]].values)[0]
# Reverse (descending order)
ridx = idx[::-1]
df[df.columns[5:][ridx]]
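The last line above returns only the reordered columns. A minimal sketch of the same argsort idea applied to the sample row from the question, keeping the static columns in front, might look like this. It assumes a single-row frame with five leading static columns; n_static, dynamic_cols and result are illustrative names.

```python
import numpy as np
import pandas as pd

# Sample frame taken from the question.
df = pd.DataFrame(
    [["ABC", "2016-02-16", 94, 373, 183, 1739, 38, 19, 73]],
    columns=["Client", "Date", "Value_1", "Value_2", "Value_3",
             "Apple", "Pear", "Kiwi", "Banana"],
)

n_static = 5  # assumption: static columns come first
dynamic_cols = df.columns[n_static:]

# argsort gives the positions that sort the first row ascending;
# reversing that array yields descending order.
ridx = np.argsort(df[dynamic_cols].to_numpy()[0])[::-1]

# Static names first, then the dynamic names in their new order.
new_columns = list(df.columns[:n_static]) + list(dynamic_cols[ridx])
result = df[new_columns]
print(result.columns.tolist())
```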
You need to create a new order for your columns:
# In Python 3, zip() returns an iterator, so wrap it in list() before indexing
order = list(df.columns[:4]) + \
    list(list(zip(*sorted([(i, int(df[i].iloc[0])) for i in df.columns[4:]], key=lambda x: x[1], reverse=True)))[0])
Here the column names are zipped with the column values, and a sort is then applied. The zip(*[]) unpacks the sorted list, and the column names are kept. Then apply this to your dataframe:
print(df[order])
>>> Date Value_1 Value_2 Value_3 Color_2 Color_1 Color_3 Color_4
0 ABC 2016-02-16 94 373 1739 183 38 19
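The same idea of building a new column order can also be sketched without the zip step, by sorting the column names directly with a key function. This is only an illustration on the sample row from the question; it assumes five static columns rather than the four used above, and static, dynamic and result are illustrative names.

```python
import pandas as pd

# Sample frame taken from the question.
df = pd.DataFrame(
    [["ABC", "2016-02-16", 94, 373, 183, 1739, 38, 19, 73]],
    columns=["Client", "Date", "Value_1", "Value_2", "Value_3",
             "Apple", "Pear", "Kiwi", "Banana"],
)

n_static = 5  # assumption: static columns come first
static = list(df.columns[:n_static])

# Sort the dynamic column names by the value each column holds in the first row.
dynamic = sorted(df.columns[n_static:],
                 key=lambda col: df[col].iloc[0],
                 reverse=True)

order = static + dynamic
result = df[order]
print(order)
```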