Subset pandas DataFrame and split into 3 DataFrames
How can I subset a pandas DataFrame by the values in one column? For example, I want to separate the dataset below by the name of each Company, so that the keywords data frame is split into 3 different data frames. I tried to define a function that splits the dataset by the name value in the column, and then ran a for loop over the column to call that function. However, it doesn't seem to work. Does anyone know how I can accomplish this?
keywords = {'Company':['amazon', 'amazon', 'amazon', 'target', 'target', 'target', 'walmart', 'walmart', 'walmart'],
            'keywords':['abc', 'def', 'ghi', 'jkl', 'mno', 'pqr', 'rst', 'uvw', 'xyz'],
            'type':['article', 'blog', 'news', 'article', 'blog', 'news', 'article', 'blog', 'news']}
def key(name):
    key = keywords.loc[name, :]
    return
for h in keywords['Company']:
    key(h)
The following assumes df is a DataFrame loaded with your keywords data:
amazon_df = df.query('Company == "amazon"')
This returns a new DataFrame containing only the rows where the Company column equals the string 'amazon'. To pass a variable into the df.query string, prepend an @ symbol to the variable name (see the pandas.DataFrame.query docs). For example:
def get_subset_df(df, company_name):
    return df.query('Company == @company_name')
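To tie this back to the question, here is a minimal end-to-end sketch that builds the DataFrame from the posted dictionary and produces one DataFrame per company. It assumes the missing commas in the 'Company' list were a typo; the names frames and frames_by_group are illustrative and not part of the original answer, and the groupby variant is offered as an alternative technique rather than something the answer itself proposed.

```python
import pandas as pd

# Data from the question, with the missing commas in 'Company' restored
# (assumption: they were a typo in the original post).
keywords = {
    'Company': ['amazon', 'amazon', 'amazon', 'target', 'target', 'target',
                'walmart', 'walmart', 'walmart'],
    'keywords': ['abc', 'def', 'ghi', 'jkl', 'mno', 'pqr', 'rst', 'uvw', 'xyz'],
    'type': ['article', 'blog', 'news', 'article', 'blog', 'news',
             'article', 'blog', 'news'],
}

# .loc and .query only exist on a DataFrame, so convert the dict first.
df = pd.DataFrame(keywords)


def get_subset_df(df, company_name):
    # '@company_name' refers to the local Python variable of that name.
    return df.query('Company == @company_name')


# One DataFrame per distinct company, keyed by company name.
frames = {name: get_subset_df(df, name) for name in df['Company'].unique()}

# Alternative (not from the original answer): groupby yields the same split.
frames_by_group = {name: group for name, group in df.groupby('Company')}
```

Each subset is then available by name, for example frames['amazon'] or frames_by_group['walmart']. Keeping the pieces in a dictionary avoids hard-coding three separate variable names and still works if more companies are added later.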