
Subset a pandas DataFrame and split it into 3 DataFrames

How can I subset a pandas DataFrame by the values in one column? For example, I want to separate the dataset below by the name of each Company.

So I want to split the keywords data frame into 3 different data frames. I tried to define a function that would split the dataset by the name value in the column, and then ran a for loop over the column to call the function. However, it doesn't seem to work. Does anyone know how I can accomplish this?


keywords = {'Company':['amazon', 'amazon', 'amazon', 'target', 'target', 'target', 'walmart', 'walmart', 'walmart'],
'keywords':['abc', 'def', 'ghi', 'jkl', 'mno', 'pqr', 'rst', 'uvw', 'xyz'], 
'type':['article', 'blog', 'news', 'article', 'blog', 'news', 'article', 'blog', 'news']}

def key(name):
    key = keywords.loc[name, :]
    return

for h in keywords['Company']:
    key(h)

The following assumes df is a DataFrame loaded with your keywords data:

amazon_df = df.query('Company == "amazon"')
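For a version that can be run on its own, here is a minimal sketch. It assumes df is built directly from the keywords dict in the question via pd.DataFrame (the question does not show how the data is loaded), with the missing commas in the 'Company' list restored so that all three columns have nine entries.

```python
import pandas as pd

# Data from the question; commas between the 'target' entries restored
# so that every column has the same length.
keywords = {
    'Company': ['amazon', 'amazon', 'amazon', 'target', 'target', 'target',
                'walmart', 'walmart', 'walmart'],
    'keywords': ['abc', 'def', 'ghi', 'jkl', 'mno', 'pqr', 'rst', 'uvw', 'xyz'],
    'type': ['article', 'blog', 'news', 'article', 'blog', 'news',
             'article', 'blog', 'news'],
}

# Assumption: the DataFrame is built straight from the dict.
df = pd.DataFrame(keywords)

# Keep only the rows whose Company column equals "amazon".
amazon_df = df.query('Company == "amazon"')
print(amazon_df)
```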

The query returns a new DataFrame containing only the rows whose Company column equals the string 'amazon'. To pass a variable into the df.query string, prepend an @ symbol to the variable name. See the pandas.DataFrame.query docs.

For example:

def get_subset_df(df, company_name):
    return df.query('Company == @company_name')
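To get all three DataFrames at once, one option is to call this function for each distinct company name. The sketch below assumes df is built from the question's data with pd.DataFrame, and the names frames and frames_by_group are only illustrative. The groupby variant at the end is an alternative technique, not part of the answer above.

```python
import pandas as pd

# Assumption: df is built from the question's data (commas restored).
df = pd.DataFrame({
    'Company': ['amazon', 'amazon', 'amazon', 'target', 'target', 'target',
                'walmart', 'walmart', 'walmart'],
    'keywords': ['abc', 'def', 'ghi', 'jkl', 'mno', 'pqr', 'rst', 'uvw', 'xyz'],
    'type': ['article', 'blog', 'news', 'article', 'blog', 'news',
             'article', 'blog', 'news'],
})


def get_subset_df(df, company_name):
    # @company_name refers to the local Python variable of that name.
    return df.query('Company == @company_name')


# One DataFrame per distinct company, keyed by the company name.
frames = {name: get_subset_df(df, name) for name in df['Company'].unique()}

amazon_df = frames['amazon']
target_df = frames['target']
walmart_df = frames['walmart']

# Alternative: let groupby do the splitting in a single pass.
frames_by_group = {name: group for name, group in df.groupby('Company')}
```

Keeping the pieces in a dict avoids creating one hand-named variable per company, which matters if the number of companies is not known in advance.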
