For example, I have a DataFrame built like this:
import pandas as pd

lst3 = [
    ['it store', ['asus', 'acer', 'hp', 'dell'], [50000, 30000, 20000, 10000]],
    ['mz store', ['acer', 'dell'], [60000, 75000]],
    ['bm shop', ['hp', 'acer', 'asus'], [45000, 15000, 30000]]
]
# dtype=float is dropped: the columns hold strings and lists, which cannot be cast to float
df3 = pd.DataFrame(lst3, columns=['store_name', 'item', 'price'])
print(df3)
And the result is:
   store_name                    item                         price
0    it store  [asus, acer, hp, dell]  [50000, 30000, 20000, 10000]
1    mz store            [acer, dell]                [60000, 75000]
2     bm shop        [hp, acer, asus]         [45000, 15000, 30000]
The 'item' and 'price' columns both hold lists.
For example, I want to sort the DataFrame by the price of the item 'acer', lowest first. The expected result is:
   store_name                    item                         price
2     bm shop        [hp, acer, asus]         [45000, 15000, 30000]
0    it store  [asus, acer, hp, dell]  [50000, 30000, 20000, 10000]
1    mz store            [acer, dell]                [60000, 75000]
Edit: if I sort the DataFrame by the price of the item 'hp' instead, the expected result is as follows (the store that does not sell 'hp' is left out):
   store_name                    item                         price
0    it store  [asus, acer, hp, dell]  [50000, 30000, 20000, 10000]
2     bm shop        [hp, acer, asus]         [45000, 15000, 30000]
Could you help me with a Python script that produces the results above?
One solution is to convert the DataFrame to records using the to_records() method, sort them with Python's built-in sorted() function, and then convert the result back to a DataFrame using from_records().
For your current DataFrame, to sort by the minimum price in each list, you can do the following:
sorted_records = sorted(df3.to_records(), key=lambda x: min(x[3]))
df3 = pd.DataFrame.from_records(sorted_records)
Keep track of the position of the column you are sorting on once the frame is converted to records: to_records() puts the DataFrame index at position 0, so 'price' ends up at position 3.
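As a hedged sketch of the same records-based idea aimed at a single item rather than the overall minimum: record fields can be read by name, which avoids the positional bookkeeping. The helper name price_of is made up for illustration, the frame is rebuilt without dtype=float, and every store is assumed to sell the item.

```python
import pandas as pd

lst3 = [
    ['it store', ['asus', 'acer', 'hp', 'dell'], [50000, 30000, 20000, 10000]],
    ['mz store', ['acer', 'dell'], [60000, 75000]],
    ['bm shop', ['hp', 'acer', 'asus'], [45000, 15000, 30000]],
]
df3 = pd.DataFrame(lst3, columns=['store_name', 'item', 'price'])


def price_of(record, product):
    # Hypothetical helper: look up the price stored at the same position as `product`.
    return record['price'][record['item'].index(product)]


# index=False keeps only the three data columns in each record.
records = df3.to_records(index=False)

# Sort row positions by the 'acer' price, then reorder the original frame.
order = sorted(range(len(records)), key=lambda i: price_of(records[i], 'acer'))
result = df3.iloc[order]
print(result)
```

Reordering with iloc instead of rebuilding the frame keeps the original index labels and column names.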
It seems that DataFrame does not offer an easy way to sort by arbitrary user-defined keys, so you can convert it to a list and sort that however you like:
def sort_by_product(df3, product):
    def get_product_price(current_store):
        # current_store is [store_name, item, price]
        return current_store[2][current_store[1].index(product)]

    sorted_list = sorted(df3.values.tolist(), key=get_product_price)
    return pd.DataFrame(sorted_list, columns=['store_name', 'item', 'price'])
Usage example:
sort_by_product(df3, "acer")
Which outputs:
   store_name                    item                         price
0     bm shop        [hp, acer, asus]         [45000, 15000, 30000]
1    it store  [asus, acer, hp, dell]  [50000, 30000, 20000, 10000]
2    mz store            [acer, dell]                [60000, 75000]
Hope that helped
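The function above raises a ValueError when a store does not stock the product, which is what happens for 'hp' and 'mz store'. Below is a minimal sketch of one way to handle that case, assuming such stores should simply be dropped, as in the question's second expected result. The name sort_by_item_price is illustrative and not part of pandas.

```python
import pandas as pd

lst3 = [
    ['it store', ['asus', 'acer', 'hp', 'dell'], [50000, 30000, 20000, 10000]],
    ['mz store', ['acer', 'dell'], [60000, 75000]],
    ['bm shop', ['hp', 'acer', 'asus'], [45000, 15000, 30000]],
]
df3 = pd.DataFrame(lst3, columns=['store_name', 'item', 'price'])


def sort_by_item_price(df, product):
    # Pair each item with its price; .get() yields None when the store lacks the product.
    found = [
        dict(zip(items, prices)).get(product)
        for items, prices in zip(df['item'], df['price'])
    ]
    # None becomes NaN in a float Series, so dropna() removes stores without the product.
    key = pd.Series(found, index=df.index, dtype='float64').dropna()
    # mergesort is stable, so stores with equal prices keep their original order.
    return df.loc[key.sort_values(kind='mergesort').index]


by_acer = sort_by_item_price(df3, 'acer')
by_hp = sort_by_item_price(df3, 'hp')
print(by_acer)
print(by_hp)
```

Because the rows are selected with loc, the original index labels are preserved, matching the layout of the expected results in the question.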
This will work only if every list in the 'item' column contains the string 'acer':
import pandas as pd

lst3 = [
    ['it store', ['asus', 'acer', 'hp', 'dell'], [50000, 30000, 20000, 10000]],
    ['mz store', ['acer', 'dell'], [60000, 75000]],
    ['bm shop', ['hp', 'acer', 'asus'], [45000, 15000, 30000]]
]
df3 = pd.DataFrame(lst3, columns=['store_name', 'item', 'price'])

# Position of 'acer' in each store's item list
df3['new'] = df3['item'].apply(lambda x: x.index('acer'))

def f(row):
    # Price at the same position as 'acer' (label access instead of positional x[2][x[3]])
    return row['price'][row['new']]

df3['new'] = df3.apply(f, axis=1)
df3.sort_values(by=['new'], inplace=True)
df3.drop(['new'], axis=1, inplace=True)
df3.reset_index(drop=True, inplace=True)
df3
The output is as follows:
   store_name                    item                         price
0     bm shop        [hp, acer, asus]         [45000, 15000, 30000]
1    it store  [asus, acer, hp, dell]  [50000, 30000, 20000, 10000]
2    mz store            [acer, dell]                [60000, 75000]
I hope this does the work!
You can replace 'acer' with whichever computer brand you want:
from more_itertools import roundrobin as rb

lst3 = [
    ['it store', ['asus', 'acer', 'hp', 'dell'], [50000, 30000, 20000, 10000]],
    ['mz store', ['acer', 'dell'], [60000, 75000]],
    ['bm shop', ['hp', 'acer', 'asus'], [45000, 15000, 30000]]
]

# Interleave items and prices per store: ['asus', 50000, 'acer', 30000, ...]
d2 = {}
for k, v in {e[0]: list(rb(e[1], e[2])) for e in lst3}.items():
    try:
        # The price sits right after the item name
        d2[k] = v[v.index('acer') + 1]
    except ValueError:
        # The store does not sell the item; skip it
        continue

ord_lst3 = []
# Order the shops by the price found above (plain sorted(d2) would order by shop name)
for shop in sorted(d2, key=d2.get):
    ord_lst3 += list(filter(lambda e: e[0] == shop, lst3))

print(ord_lst3)
# [['bm shop', ['hp', 'acer', 'asus'], [45000, 15000, 30000]],
#  ['it store', ['asus', 'acer', 'hp', 'dell'], [50000, 30000, 20000, 10000]],
#  ['mz store', ['acer', 'dell'], [60000, 75000]]]
Summary: item and price are related. The position of acer in the item list is the position of its price in the price list, so we need a way to pair them.
Get the index of acer in the item column, get the corresponding price from the price column, sort from smallest to largest, collect the row indices, and use them to reindex the DataFrame:
from operator import itemgetter

# Use enumerate to attach row numbers
# (we could also zip the index instead)
sorter = sorted(
    [(num, price[item.index('acer')])
     for num, (item, price) in enumerate(zip(df3.item, df3.price))],
    key=itemgetter(1),
)
# Extract only the first item from each tuple in the sorter list
new_index = [first for first, last in sorter]
# Reindex the DataFrame to get the sorted form
df3.reindex(new_index)
   store_name                    item                         price
2     bm shop        [hp, acer, asus]         [45000, 15000, 30000]
0    it store  [asus, acer, hp, dell]  [50000, 30000, 20000, 10000]
1    mz store            [acer, dell]                [60000, 75000]
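IIUC, you can use Series.str.index and DataFrame.lookup (note that DataFrame.lookup was deprecated in pandas 1.2 and removed in 2.0, so this first variant needs an older pandas):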
indexes = df3['item'].str.index('acer')
df = pd.DataFrame(df3['price'].tolist())
(df3.assign(acer_value=df.lookup(df.index, indexes))
    .sort_values('acer_value')
    .drop(columns='acer_value'))
   store_name                    item                         price
2     bm shop        [hp, acer, asus]         [45000, 15000, 30000]
0    it store  [asus, acer, hp, dell]  [50000, 30000, 20000, 10000]
1    mz store            [acer, dell]                [60000, 75000]
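A minimal sketch of the same row-by-row lookup done with NumPy integer indexing, which does not rely on DataFrame.lookup. The variable names are illustrative, and every store is assumed to stock 'acer'.

```python
import numpy as np
import pandas as pd

lst3 = [
    ['it store', ['asus', 'acer', 'hp', 'dell'], [50000, 30000, 20000, 10000]],
    ['mz store', ['acer', 'dell'], [60000, 75000]],
    ['bm shop', ['hp', 'acer', 'asus'], [45000, 15000, 30000]],
]
df3 = pd.DataFrame(lst3, columns=['store_name', 'item', 'price'])

# Position of 'acer' inside each row's item list.
positions = np.array([items.index('acer') for items in df3['item']])

# Expand the ragged price lists into a rectangular array (short rows are padded with NaN).
price_table = pd.DataFrame(df3['price'].tolist()).to_numpy()

# Pick one value per row: row i, column positions[i].
acer_prices = price_table[np.arange(len(df3)), positions]

# Reorder the rows by that price; a stable sort keeps ties in their original order.
result = df3.iloc[np.argsort(acer_prices, kind='stable')]
print(result)
```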
Or:
order = (df3.assign(indexes=df3['item'].str.index('acer'))
            .apply(lambda x: x['price'][x['indexes']], axis=1)
            .sort_values().index)
df3.loc[order]
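For comparison, a sketch that keeps the whole sort inside a single sort_values call: since pandas 1.1, sort_values accepts a key callable that receives the column as a Series and must return sort values of the same length. Here the key maps each price list to its 'acer' price. The variable names are illustrative, and every store is assumed to stock the item.

```python
import pandas as pd

lst3 = [
    ['it store', ['asus', 'acer', 'hp', 'dell'], [50000, 30000, 20000, 10000]],
    ['mz store', ['acer', 'dell'], [60000, 75000]],
    ['bm shop', ['hp', 'acer', 'asus'], [45000, 15000, 30000]],
]
df3 = pd.DataFrame(lst3, columns=['store_name', 'item', 'price'])


def acer_price_key(price_col):
    # price_col is the 'price' column; pair it row by row with the 'item' column.
    values = [
        prices[items.index('acer')]
        for items, prices in zip(df3['item'], price_col)
    ]
    # Return a Series of the same length so pandas can sort on it.
    return pd.Series(values, index=price_col.index)


result = df3.sort_values(by='price', key=acer_price_key)
print(result)
```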