
Pandas - Groupby and create new DataFrame?

This is my situation:

In[1]: data
Out[1]: 
     Item                    Type
0  Orange           Edible, Fruit
1  Banana           Edible, Fruit
2  Tomato       Edible, Vegetable
3  Laptop  Non Edible, Electronic

In[2]: type(data)
Out[2]: pandas.core.frame.DataFrame

What I want to do is create a DataFrame of only Fruits, so I need to group in such a way that Fruit exists in Type.

I've tried doing this:

grouped = data.groupby(lambda x: "Fruit" in x, axis=1)

I don't know if that's the right way of doing it; I'm having a tough time understanding groupby. How do I get a new DataFrame of only Fruits?

You could use

data[data['Type'].str.contains('Fruit')]

import pandas as pd

data = pd.DataFrame({'Item':['Orange', 'Banana', 'Tomato', 'Laptop'],
                     'Type':['Edible, Fruit', 'Edible, Fruit', 'Edible, Vegetable', 'Non Edible, Electronic']})
print(data[data['Type'].str.contains('Fruit')])

yields

     Item           Type
0  Orange  Edible, Fruit
1  Banana  Edible, Fruit
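If the Type column might contain missing values, the same filter can be made more defensive. The sketch below is an illustration only: the 'Mystery' row with a missing Type is a made-up addition, na=False treats missing entries as non-matches, and regex=False asks for a plain substring test rather than a regular expression.

```python
import pandas as pd

# Same data as above, plus one hypothetical row with a missing Type.
data = pd.DataFrame({
    'Item': ['Orange', 'Banana', 'Tomato', 'Laptop', 'Mystery'],
    'Type': ['Edible, Fruit', 'Edible, Fruit', 'Edible, Vegetable',
             'Non Edible, Electronic', None],
})

# na=False: rows with a missing Type count as "no match".
# regex=False: 'Fruit' is matched as a literal substring.
mask = data['Type'].str.contains('Fruit', na=False, regex=False)
fruits = data[mask]
print(fruits)
```

Without na=False, the missing entry would produce a missing value in the mask, and boolean indexing with such a mask can raise an error.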

groupby does something else entirely. It creates groups for aggregation. Basically, it goes from something like:

['a', 'b', 'a', 'c', 'b', 'b']

to something like:

[['a', 'a'], ['b', 'b', 'b'], ['c']]
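As a rough sketch of that idea in pandas itself, the snippet below groups a Series by its own values and then aggregates each group. The variable names (letters, groups, counts) are chosen here for illustration.

```python
import pandas as pd

letters = pd.Series(['a', 'b', 'a', 'c', 'b', 'b'])

# Iterating over a groupby yields (key, sub-Series) pairs,
# which mirrors the list-of-lists picture above.
groups = {key: group.tolist() for key, group in letters.groupby(letters)}
print(groups)

# The usual next step is an aggregation per group, e.g. the group size.
counts = letters.groupby(letters).size()
print(counts)
```

The result of groupby is a set of groups to aggregate over, not a filtered copy of the rows, which is why it does not fit the "only Fruits" problem.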

What you want is df.apply.

In newer versions of pandas there's a query method that can make this kind of selection a bit more efficient and easier to write.
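A minimal sketch of what the query approach could look like for this data is shown here. It assumes a pandas version in which string methods can be called inside the query expression, and it passes engine='python' as a precaution, since the default expression engine may not support the .str accessor.

```python
import pandas as pd

df = pd.DataFrame({
    'Item': ['Orange', 'Banana', 'Tomato', 'Laptop'],
    'Type': ['Edible, Fruit', 'Edible, Fruit', 'Edible, Vegetable',
             'Non Edible, Electronic'],
})

# The column name is used directly inside the expression string.
# engine='python' is an assumption made here for compatibility.
fruits = df.query("Type.str.contains('Fruit')", engine='python')
print(fruits)
```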

However, one way of doing what you want is to make a boolean array by using

mask = df.Type.apply(lambda x: 'Fruit' in x)

And then selecting the relevant portions of the data frame with df[mask]. Or, as a one-liner:

df[df.Type.apply(lambda x: 'Fruit' in x)]

As a full example:

import pandas as pd
data = [['Orange', 'Edible, Fruit'],
        ['Banana', 'Edible, Fruit'],
        ['Tomato', 'Edible, Vegetable'],
        ['Laptop', 'Non Edible, Electronic']]
df = pd.DataFrame(data, columns=['Item', 'Type'])

# Keep only the rows whose Type string contains 'Fruit'
print(df[df.Type.apply(lambda x: 'Fruit' in x)])
