pandas nlargest is returning more than n rows
I have a DataFrame that looks like this:
name value
date
2016-05-01 kelly 20
2016-05-05 john 12
2016-05-05 sarah 25
2016-05-05 george 3
2016-05-05 tom 40
2016-05-07 kara 24
2016-05-07 jane 90
2016-05-07 sally 39
2016-05-07 sam 28
I would like to get the top 3 rows (by value), ideally per date. I am expecting something like this:
name value
date
2016-05-01 kelly 20
2016-05-05 john 12
2016-05-05 sarah 25
2016-05-05 tom 40
2016-05-07 jane 90
2016-05-07 sally 39
2016-05-07 sam 28
But I would also be fine with this:
name value
date
2016-05-05 tom 40
2016-05-07 jane 90
2016-05-07 sally 39
I tried df.nlargest(3, 'value') but got this strange result:
name value
date
2016-05-01 kelly 20
2016-05-01 kelly 20
2016-05-01 kelly 20
2016-05-05 tom 40
2016-05-05 tom 40
2016-05-05 tom 40
2016-05-05 sarah 25
2016-05-05 sarah 25
2016-05-05 sarah 25
2016-05-07 kara 24
2016-05-07 kara 24
...
2016-05-07 sally 39
2016-05-07 sally 39
2016-05-07 jane 90
2016-05-07 jane 90
2016-05-07 jane 90
I also tried running it day by day:
[df.ix[day].nlargest(3, 'value') for day in df.index.unique()]
But I ran into the same problem (each name is repeated 3 times).
First of all, this will do the job:
df.sort_values('value', ascending=False).groupby(level=0).head(3).sort_index()
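As a minimal, self-contained sketch of the one-liner above: the DataFrame below is rebuilt by hand from the sample data in the question, and the variable name top3_per_date is only illustrative. It sorts by value in descending order, keeps the first 3 rows of each date group, then sorts by the index so the dates are in order again.

```python
import pandas as pd

# Sample data from the question; the date column is used as a (non-unique) index.
df = pd.DataFrame(
    {
        "name": ["kelly", "john", "sarah", "george", "tom",
                 "kara", "jane", "sally", "sam"],
        "value": [20, 12, 25, 3, 40, 24, 90, 39, 28],
    },
    index=pd.Index(
        ["2016-05-01"] + ["2016-05-05"] * 4 + ["2016-05-07"] * 4,
        name="date",
    ),
)

# Highest values first, then keep at most 3 rows per date (index level 0),
# then put the dates back in ascending order.
top3_per_date = (
    df.sort_values("value", ascending=False)
    .groupby(level=0)
    .head(3)
    .sort_index()
)
print(top3_per_date)
```

The order of rows within a single date after sort_index() is not something this sketch relies on; only the set of rows kept per date matters here.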
Use sort_values() in descending order, take the first n results with a slice ([:n]), then use sort_index() so that the dates are monotonically increasing again.
import io

import pandas as pd

# Build the sample DataFrame from whitespace-separated text;
# the first column (date) becomes the index.
df = pd.read_csv(io.StringIO('''
date name value
2016-05-01 kelly 20
2016-05-05 john 12
2016-05-05 sarah 25
2016-05-05 george 3
2016-05-05 tom 40
2016-05-07 kara 24
2016-05-07 jane 90
2016-05-07 sally 39
2016-05-07 sam 28
'''), sep=r'\s+', index_col=0)
print('Original DataFrame:')
print(df)
print()
# Sort by value (descending), keep the first 3 rows, then restore date order.
df_top3 = df.sort_values('value', ascending=False)[:3].sort_index()
print('Top 3 Largest value DataFrame:')
print(df_top3)
print()
Original DataFrame:
name value
date
2016-05-01 kelly 20
2016-05-05 john 12
2016-05-05 sarah 25
2016-05-05 george 3
2016-05-05 tom 40
2016-05-07 kara 24
2016-05-07 jane 90
2016-05-07 sally 39
2016-05-07 sam 28
Top 3 Largest value DataFrame:
name value
date
2016-05-05 tom 40
2016-05-07 jane 90
2016-05-07 sally 39
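The repeated rows in the question show up together with a repeated date index. As a hedged sketch, assuming the duplicated index labels are what trips up nlargest in the asker's pandas version, one alternative is to move the dates into an ordinary column before calling nlargest and restore them afterwards. The DataFrame is rebuilt by hand from the question's sample data, and the name top3 is illustrative.

```python
import pandas as pd

# Sample data from the question, with the repeated dates as the index.
df = pd.DataFrame(
    {
        "name": ["kelly", "john", "sarah", "george", "tom",
                 "kara", "jane", "sally", "sam"],
        "value": [20, 12, 25, 3, 40, 24, 90, 39, 28],
    },
    index=pd.Index(
        ["2016-05-01"] + ["2016-05-05"] * 4 + ["2016-05-07"] * 4,
        name="date",
    ),
)

# reset_index() gives every row a unique integer label, so nlargest
# selects exactly 3 rows; set_index() then puts the dates back.
top3 = df.reset_index().nlargest(3, "value").set_index("date")
print(top3)
```

Unlike the slice-and-sort_index() version, this keeps the rows ordered by value (largest first) rather than by date.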