pandas.core.groupby.DataFrameGroupBy.idxmin() is very slow, how can I make my code faster?

I am trying to do the same thing as a SQL GROUP BY that takes the minimum value:

select id, min(value), other_fields...
from table
group by id

I tried:

dfg = df.groupby('id', sort=False)
idx = dfg['value'].idxmin()
df = df.loc[idx, list(df.columns.values)]

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.core.groupby.DataFrameGroupBy.idxmin.html

But line 2, the idxmin() call, takes more than half an hour on ~4M columns in df, whereas the groupby itself takes less than 1 second. What am I missing? Is it supposed to take that long? How can I make this process faster? Would it be faster in pure SQL?

Use an alternative with DataFrame.sort_values and DataFrame.drop_duplicates:

df1 = df.sort_values(by=['value']).drop_duplicates('id', keep='first')
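
To check that this one-liner selects the same rows as the idxmin() version, here is a minimal sketch on a small made-up frame. The sample data and the extra column name "other" are assumptions for illustration only; "id" and "value" follow the question.

```python
import pandas as pd

# Hypothetical sample data; only "id" and "value" come from the question.
df = pd.DataFrame({
    "id": [1, 1, 2, 2, 3],
    "value": [5, 3, 8, 9, 1],
    "other": ["a", "b", "c", "d", "e"],
})

# Approach from the question: row label of the minimum value per group.
idx = df.groupby("id", sort=False)["value"].idxmin()
via_idxmin = df.loc[idx]

# Approach from the answer: sort by value, keep the first row seen per id.
via_sort = df.sort_values(by=["value"]).drop_duplicates("id", keep="first")

# Row order differs between the two results, so compare after sorting by index.
same = via_idxmin.sort_index().equals(via_sort.sort_index())
print(same)
```

When several rows in a group share the same minimum value, the two approaches may pick different rows, because the default sort in sort_values is not guaranteed to be stable. Passing kind="stable" to sort_values should keep the earliest such row, which is likely the closest match to what idxmin() returns.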
