Dropping Pandas Dataframe Columns based on column name
I have a Pandas dataframe df with extraneous information. The extraneous information is stored in the columns that have names containing "PM". I would like to remove these columns but I'm not sure how. Below is my attempt to do this. However, I received this error message: AttributeError: 'numpy.float64' object has no attribute 'PM'. I'm not sure how to interpret this error message. I also don't understand why numpy is mentioned in the message, since the dataframe df is a pandas object.
for j in range(0,len(df.columns)-1):
    if df.iloc[0,j].str.contains("PM"):
        df.drop(j, axis=1)
AttributeError: 'numpy.float64' object has no attribute 'PM'
If you need to remove the columns containing PM, that is the same as selecting the columns without PM, so use:
df.loc[:, ~df.columns.str.contains("PM")]
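As a minimal sketch of how that mask behaves, assuming a small made-up frame (the column names and values below are hypothetical, not taken from the question):

```python
import pandas as pd

# Hypothetical sample data; the column names are made up for illustration
df = pd.DataFrame({
    "Time AM": [1.0, 2.0],
    "Time PM": [3.0, 4.0],
    "Temp": [5.0, 6.0],
    "Temp PM": [7.0, 8.0],
})

# Boolean mask: True for every column name that contains "PM"
has_pm = df.columns.str.contains("PM")

# ~ inverts the mask, so .loc keeps only the columns without "PM"
kept = df.loc[:, ~has_pm]
print(list(kept.columns))
```

The selection returns a new object, so assign it back (df = df.loc[...]) if the original frame should be replaced.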
Out-of-the-box solution:
df.drop(df.filter(like='PM').columns, axis=1)
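A short sketch of the same idea split into two steps, again on a hypothetical frame, to show what each call contributes:

```python
import pandas as pd

# Hypothetical sample data for illustration only
df = pd.DataFrame({"A": [1, 2], "A PM": [3, 4], "B": [5, 6], "PM B": [7, 8]})

# filter(like=...) selects the columns whose label contains the substring
pm_cols = df.filter(like="PM").columns

# drop returns a new DataFrame; df itself is unchanged unless reassigned
result = df.drop(pm_cols, axis=1)
print(list(result.columns))
```

Note that drop does not modify df by default, which is also why calling df.drop(...) inside a loop without using its return value has no visible effect.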
Use filter with regex:
df.filter(regex='^((?!PM).)*$')
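The pattern relies on a negative lookahead: at every position it checks that "PM" does not start there before consuming one character. A small sketch, assuming hypothetical column names, that checks the pattern on plain strings first and then on a frame:

```python
import re
import pandas as pd

# Matches only strings that contain no "PM" anywhere
pattern = r"^((?!PM).)*$"

# Check the pattern on plain strings
matches_plain = bool(re.search(pattern, "Temp"))
matches_pm = bool(re.search(pattern, "Temp PM"))

# Hypothetical sample data for illustration only
df = pd.DataFrame({"Temp": [1.0], "Temp PM": [2.0], "PM Wind": [3.0]})

# filter(regex=...) keeps the columns whose label matches the pattern
no_pm = df.filter(regex=pattern)
print(list(no_pm.columns))
```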
This is the shortest solution here.
Based on what I understood, you want to delete columns, so first collect the names of the columns that contain 'PM' in a list, then drop those columns.
# Collect the names of the columns that contain 'PM'
columns = [col for col in df.columns if 'PM' in col]
# Drop those columns from df in place
df.drop(columns=columns, inplace=True)
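A minimal sketch of the list-based approach on a hypothetical frame. It builds the list of names containing 'PM' with a comprehension rather than removing items from a list while looping over it, since that can skip elements. The str(col) conversion is an added assumption to guard against non-string labels:

```python
import pandas as pd

# Hypothetical sample data for illustration only
df = pd.DataFrame({"x": [1.0], "x PM": [2.0], "y PM": [3.0], "z": [4.0]})

# Names of the columns to remove; str(col) guards against non-string labels
pm_columns = [col for col in df.columns if "PM" in str(col)]

# Without inplace=True, drop returns a new frame and leaves df untouched
trimmed = df.drop(columns=pm_columns)

# With inplace=True, df itself is modified and the call returns None
df.drop(columns=pm_columns, inplace=True)
print(list(df.columns))
```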