
Python Pandas: Is there a way to obtain a subset dataframe based on strings in a list?

I am looking to make a subset df based on the string values in a list.

A toy example:

import pandas as pd

data = {'month': ['January','February','March','April','May','June','July','August','September','October','November','December'],
        'days_in_month': [31,28,31,30,31,30,31,31,30,31,30,31]
        }

df = pd.DataFrame(data, columns=['month', 'days_in_month'])

summer_months = ['Dec', 'Jan', 'Feb']

# This line raises the TypeError: str.contains is given a list, not a pattern string
contain_values = df[df['month'].str.contains(summer_months)]
print(contain_values)

This fails on the line contain_values = df[df['month'].str.contains(summer_months)] with:

TypeError: unhashable type: 'list'

I know that contain_values = df[df['month'].str.contains('Dec')] works, but I would like to return a new dataframe containing the summer months, or even all the non-summer months using the ~ operator.

Thanks

>>> contain_values = df[df['month'].str.contains('|'.join(summer_months))]

>>> contain_values
       month  days_in_month
0    January             31
1   February             28
11  December             31
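
For completeness, here is a minimal sketch of the same idea that also covers the non-summer months asked about in the question. It joins the list into one regex alternation and inverts the boolean mask with ~. The use of re.escape and the names pattern, is_summer, summer_df and non_summer_df are assumptions added for illustration; escaping only matters if a list entry might contain regex metacharacters.

```python
import re

import pandas as pd

df = pd.DataFrame({
    'month': ['January', 'February', 'March', 'April', 'May', 'June',
              'July', 'August', 'September', 'October', 'November', 'December'],
    'days_in_month': [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31],
})

summer_months = ['Dec', 'Jan', 'Feb']

# Build one pattern such as 'Dec|Jan|Feb'; re.escape guards against
# regex metacharacters in the list entries (an added precaution).
pattern = '|'.join(re.escape(m) for m in summer_months)

# Boolean mask: True where the month name contains any of the entries.
is_summer = df['month'].str.contains(pattern)

summer_df = df[is_summer]        # rows matching the list
non_summer_df = df[~is_summer]   # all remaining rows, via ~

print(summer_df)
print(non_summer_df)
```

Note that str.contains matches the pattern anywhere in the string, not only at the start.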

You can also use what the .str accessor offers:

df[df["month"].str[:3].isin(summer_months)]

OUTPUT

       month  days_in_month
0    January             31
1   February             28
11  December             31

You can make it more robust with something like this (in case the names in the dataframe are not properly capitalized):

df[df["month"].str.capitalize().str[:3].isin(summer_months)]
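
A small sketch of how that normalization behaves on inconsistently capitalized input. The messy dataframe below is hypothetical test data, not from the question, and the names prefix and mask are chosen only for illustration. It assumes the list entries are three-letter, capitalized abbreviations, as in the question.

```python
import pandas as pd

# Hypothetical input with inconsistent capitalization.
messy = pd.DataFrame({
    'month': ['january', 'FEBRUARY', 'March', 'dEcember'],
    'days_in_month': [31, 28, 31, 31],
})

summer_months = ['Dec', 'Jan', 'Feb']

# capitalize() upper-cases the first letter and lower-cases the rest,
# then str[:3] keeps the three-letter prefix to compare with the list.
prefix = messy['month'].str.capitalize().str[:3]
mask = prefix.isin(summer_months)

print(messy[mask])    # summer months
print(messy[~mask])   # everything else
```

Because isin compares exact values, this only matches on the first three letters, unlike str.contains, which matches anywhere in the name.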

