pandas operations inside a for-loop

Here is a sample of my data:

threats =
                                                binomial_name
    continent      threat_type
    Africa         Agriculture & Aquaculture              143
                   Biological Resource Use                102
                   Climate Change                           3
                   Commercial Development                  36
                   Energy Production & Mining              30
    ...            ...                                    ...
    South America  Human Intrusions                         1
                   Invasive Species                         3
                   Natural System Modifications             1
                   Transportation Corridor                  2
                   Unknown                                 38

I want to use a for loop to get the top 5 values for each continent and append them together into one data frame. Here is my code:

continents = threats.continent.unique()

for i in continents:
    continen = (threats
           .query('continent == i')
           .groupby(['continent','threat_type'])
           .sort_values(by=('binomial_name'), ascending=False).
           .head())

    top5 = appended_data.append(continen)    

However, I am getting the error: KeyError: 'i'

Where am I going wrong?

The canonical way to do this is:

threats.groupby('continent', as_index=False).apply(
    lambda grp: grp.nlargest(5, 'binomial_name'))

If you want to do this in a loop, replace this part:

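import pandas as pd

appended_data = []  # collect one frame per continent
for i in continents:
    continen = threats[threats['continent'] == i].nlargest(5, 'binomial_name')
    appended_data.append(continen)
top5 = pd.concat(appended_data)  # combine the pieces into a single data frame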
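
As for the KeyError in the question: the likely cause is that `query('continent == i')` looks for a column named `i`. Inside a query string, a Python variable has to be prefixed with `@`. The sketch below shows that fix next to a loop-free alternative. The sample frame is hypothetical (made-up counts, with `continent` as an ordinary column), and `top_n`, `pieces`, `loop_result` and `grouped_result` are illustrative names.

```python
import pandas as pd

# Hypothetical sample data; the counts are invented for illustration.
threats = pd.DataFrame({
    "continent": ["Africa"] * 3 + ["Asia"] * 3,
    "threat_type": ["Agriculture", "Climate Change", "Mining",
                    "Agriculture", "Invasive Species", "Unknown"],
    "binomial_name": [143, 3, 30, 50, 7, 12],
})

top_n = 2  # use 5 on the real data

# Loop version: "@i" makes query() read the local Python variable i
# instead of looking for a column called "i".
pieces = []
for i in threats["continent"].unique():
    pieces.append(threats.query("continent == @i").nlargest(top_n, "binomial_name"))
loop_result = pd.concat(pieces)

# Loop-free version: sort once, then keep the first top_n rows of each group.
grouped_result = (threats.sort_values("binomial_name", ascending=False)
                  .groupby("continent")
                  .head(top_n))
```

If `continent` is an index level rather than a column, as the printed sample suggests, calling `threats.reset_index()` first should make both versions apply.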
