I have the following CSV file:
id;price;editor
k1;10,00;ed1
k1;8,00;ed2
k3;10,00;ed1
k3;11,00;ed2
k2;10,50;ed1
k1;9,50;ed3
If I do the following
import pandas as pd
df = pd.read_csv('Testing.csv', delimiter=';')
df_reduced = df.groupby(['id', 'editor'])['price'].min()
Instead of getting
k1;8,00;ed2
k2;10,50;ed1
k3;10,00;ed1
I get
k1;10,00;ed1
8,00;ed2
9,50;ed3
k2;10,50;ed1
k3;10,00;ed1
11,00;ed2
So how can I get the three ids, each with its minimum price?
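Group the data by id only and find the minimum price for each group. Then select the rows of the original dataframe whose price equals that group minimum, so that the editor column is kept.
Note: I am assuming that the comma in the price column is a typo. If it is a decimal separator, the column has to be converted to numeric first.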
df.loc[df['price'] == df.groupby('id')['price'].transform('min')]
   id  price editor
1  k1    8.0    ed2
2  k3   10.0    ed1
4  k2   10.5    ed1
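A minimal, self-contained sketch of the transform approach, assuming the comma is a decimal separator rather than a typo. The name csv_text and the use of StringIO in place of Testing.csv are assumptions made only so the snippet runs on its own.

```python
from io import StringIO

import pandas as pd

# Sample data copied from the question; prices use a decimal comma.
csv_text = """id;price;editor
k1;10,00;ed1
k1;8,00;ed2
k3;10,00;ed1
k3;11,00;ed2
k2;10,50;ed1
k1;9,50;ed3"""

# decimal=',' makes pandas parse "8,00" as the float 8.0.
df = pd.read_csv(StringIO(csv_text), delimiter=';', decimal=',')

# transform('min') returns a Series aligned with df that holds,
# for every row, the minimum price of that row's id group.
group_min = df.groupby('id')['price'].transform('min')

# Keep the rows whose price equals the minimum of their group.
cheapest = df.loc[df['price'] == group_min]
print(cheapest)
```

Because the filter is an equality test, an id with two rows tied at the minimum price would keep both rows; the original row order and index labels are preserved.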
drop_duplicates + sort_values
#df['price'] = pd.to_numeric(df['price'].str.replace(",", "."))
df.sort_values('price').drop_duplicates(['id'])
Out[423]:
   id  price editor
1  k1    8.0    ed2
2  k3   10.0    ed1
4  k2   10.5    ed1
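For reference, a runnable sketch of the same idea with the commented-out conversion enabled, assuming the file is read without the decimal option so that price arrives as strings. The csv_text variable and StringIO stand in for the real file and are not part of the original answer.

```python
from io import StringIO

import pandas as pd

# Sample data copied from the question.
csv_text = """id;price;editor
k1;10,00;ed1
k1;8,00;ed2
k3;10,00;ed1
k3;11,00;ed2
k2;10,50;ed1
k1;9,50;ed3"""

# Without decimal=',', the price column is read as strings such as "10,00".
df = pd.read_csv(StringIO(csv_text), delimiter=';')

# Swap the decimal comma for a dot, then convert the strings to floats.
df['price'] = pd.to_numeric(df['price'].str.replace(',', '.', regex=False))

# Sort by price ascending, then keep the first (cheapest) row per id.
result = df.sort_values('price').drop_duplicates(['id'])
print(result)
```

Unlike the equality filter, drop_duplicates keeps exactly one row per id, even when several rows share the minimum price.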
Much like @Wen-Ben, I chose to use sort_values and drop_duplicates; however, I converted the values while reading the file, using the decimal parameter of pd.read_csv.
import pandas as pd
from io import StringIO
csvfile = StringIO("""id;price;editor
k1;10,00;ed1
k1;8,00;ed2
k3;10,00;ed1
k3;11,00;ed2
k2;10,50;ed1
k1;9,50;ed3""")
df = pd.read_csv(csvfile, delimiter=';', decimal=',')
df.sort_values(['id','price']).drop_duplicates(['id'])
Output:
   id  price editor
1  k1    8.0    ed2
4  k2   10.5    ed1
2  k3   10.0    ed1
The instruction
df_reduced = df.groupby(['id', 'editor'])['price'].min()
gives you the minimum price for each unique id-editor pair, whereas you want the minimum per id. In addition, since your price field is read as strings, you first need to convert it to numeric before running the groupby:
df['price'] = pd.to_numeric(df['price'].str.replace(",", "."))
df.loc[df.groupby('id')['price'].idxmin()]
Output
   id  price editor
1  k1    8.0    ed2
4  k2   10.5    ed1
2  k3   10.0    ed1
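The difference between the two groupings can be checked directly. The sketch below is only an illustration: it parses the prices with decimal=',' instead of the string replacement, and the names csv_text, per_pair and min_labels are made up for the example.

```python
from io import StringIO

import pandas as pd

# Sample data copied from the question.
csv_text = """id;price;editor
k1;10,00;ed1
k1;8,00;ed2
k3;10,00;ed1
k3;11,00;ed2
k2;10,50;ed1
k1;9,50;ed3"""

df = pd.read_csv(StringIO(csv_text), delimiter=';', decimal=',')

# Every (id, editor) pair occurs only once in this data, so each group
# holds a single row and min() just returns that row's price.
per_pair = df.groupby(['id', 'editor'])['price'].min()
print(len(per_pair))  # 6 groups, one per input row

# Grouping by id alone gives one group per id; idxmin() returns the
# index label of the cheapest row in each group.
min_labels = df.groupby('id')['price'].idxmin()
print(min_labels.tolist())  # [1, 4, 2]

# Use those labels to pull the full rows, editor column included.
result = df.loc[min_labels]
print(result)
```

idxmin returns a single label per group, so ties are resolved by taking the first occurrence of the minimum.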
Get rid of the editor part:
df_reduced = df.groupby(['id'])['price'].min()
There is no need to use transform, as somebody else suggested.
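One thing to keep in mind with this version is that the result is a Series of prices indexed by id, so the editor column is not part of it. The sketch below shows that, plus one possible way to recover the editor by merging back onto the original frame; the merge step is an assumption added for illustration, not part of the answer above.

```python
from io import StringIO

import pandas as pd

# Sample data copied from the question.
csv_text = """id;price;editor
k1;10,00;ed1
k1;8,00;ed2
k3;10,00;ed1
k3;11,00;ed2
k2;10,50;ed1
k1;9,50;ed3"""

df = pd.read_csv(StringIO(csv_text), delimiter=';', decimal=',')

# Minimum price per id: a Series indexed by id, with no editor information.
mins = df.groupby('id')['price'].min()
print(mins)

# Turn the Series into a two-column frame (id, price) ...
mins_frame = mins.reset_index()

# ... and join it back to the original rows to pick up the editor.
with_editor = df.merge(mins_frame, on=['id', 'price'])
print(with_editor)
```

If only the minimum price per id is needed, the plain groupby is enough and the merge can be skipped.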