
How to delete all columns in DataFrame except certain ones?

Let's say I have a DataFrame that looks like this:

a  b  c  d  e  f  g  
1  2  3  4  5  6  7
4  3  7  1  6  9  4
8  9  0  2  4  2  1

How would I go about deleting every column besides a and b?

This would result in:

a  b
1  2
4  3
8  9

I would like a way to do this with a single, simple line of code that says "delete all columns besides a and b", because, hypothetically, I might have 1000 columns of data.

Thank you.

In [48]: df.drop(df.columns.difference(['a','b']), axis=1, inplace=True)

In [49]: df
Out[49]:
   a  b
0  1  2
1  4  3
2  8  9

or:

In [55]: df = df.loc[:, df.columns.intersection(['a','b'])]

In [56]: df
Out[56]:
   a  b
0  1  2
1  4  3
2  8  9
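
One practical difference between the intersection approach and plain column selection is how a missing label is handled. The sketch below rebuilds the question's data and uses a made-up label 'z' that is not in the frame; the variable names wanted and kept are only illustrative. It assumes a reasonably recent pandas (1.0 or later), where selecting a missing label with a list raises KeyError.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6, 7],
     [4, 3, 7, 1, 6, 9, 4],
     [8, 9, 0, 2, 4, 2, 1]],
    columns=list("abcdefg"),
)

wanted = ["a", "b", "z"]  # 'z' is a hypothetical label that does not exist

# intersection() keeps only the labels present in both, so 'z' is skipped.
kept = df.loc[:, df.columns.intersection(wanted)]
print(kept)

# Plain list selection is stricter and raises when a label is missing.
try:
    df[wanted]
    raised = False
except KeyError:
    raised = True
print("KeyError raised:", raised)
```

Which behaviour is preferable depends on whether a missing column should be treated as an error in your data.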

PS: please be aware that the most idiomatic Pandas way to do this was already proposed by @Wen:

df = df[['a','b']]

or

df = df.loc[:, ['a','b']]
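
For anyone who wants to try this end to end, here is a small self-contained sketch using the data from the question. The names keep, by_brackets and by_loc are just illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6, 7],
     [4, 3, 7, 1, 6, 9, 4],
     [8, 9, 0, 2, 4, 2, 1]],
    columns=list("abcdefg"),
)

keep = ["a", "b"]            # columns to keep

by_brackets = df[keep]       # list inside [] selects columns
by_loc = df.loc[:, keep]     # all rows, the listed columns

print(by_brackets)
print(by_brackets.equals(by_loc))
```

Both forms return a new DataFrame with only the listed columns, in the order given in the list.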

Another option to add to the mix. I prefer this approach for readability.

df = df.filter(['a', 'b'])

Here the first positional argument is items, the list of column labels to keep.


Bonus

You can also use the like or regex argument to filter.
This is helpful if you have a set of columns like ['a_1','a_2','b_1','b_2'].

You can do

df = df.filter(like='b_')

and end up with the columns ['b_1','b_2'].

See the Pandas documentation for filter.
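
A minimal sketch of the three filter modes, using the column names from the example above. The single row of values is made up for illustration. Note that items, like and regex are mutually exclusive, so only one of them can be passed per call.

```python
import pandas as pd

# One made-up row; only the column labels matter here.
df = pd.DataFrame([[1, 2, 3, 4]], columns=["a_1", "a_2", "b_1", "b_2"])

by_items = df.filter(items=["a_1", "b_1"])  # exact labels
by_like = df.filter(like="b_")              # labels containing the substring
by_regex = df.filter(regex=r"^a_\d+$")      # labels matching the pattern

print(list(by_items.columns))
print(list(by_like.columns))
print(list(by_regex.columns))
```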

There are multiple solutions.

df = df[['a','b']] #1

df = df[list('ab')] #2

df = df.loc[:,df.columns.isin(['a','b'])] #3

df = pd.DataFrame(data=df.eval('a,b').T,columns=['a','b']) #4 PS: I do not recommend this method, but it is still a way to achieve this

If you want to keep more than two columns, say 20 or 30, you can put them in a list as well. Make sure that you also specify the axis value.

keep_list = ["a","b"]
df = df.drop(df.columns.difference(keep_list), axis=1)
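
To illustrate the hypothetical 1000-column case from the question, here is a rough sketch. The column names col_0 ... col_999 and the variable names wide, keep_list, to_drop and trimmed are invented for the example. It uses the columns= keyword of drop, which is equivalent to passing the labels with axis=1.

```python
import pandas as pd

# Hypothetical wide frame: 3 rows of zeros, columns col_0 ... col_999.
columns = [f"col_{i}" for i in range(1000)]
wide = pd.DataFrame(0, index=range(3), columns=columns)

keep_list = ["col_0", "col_1"]                 # columns to keep

# Every column label that is NOT in keep_list.
to_drop = wide.columns.difference(keep_list)

# Drop them; columns=... is the same as passing the labels with axis=1.
trimmed = wide.drop(columns=to_drop)

print(len(to_drop))
print(list(trimmed.columns))
```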

Hey, what you are looking for is:

df = df[["a","b"]]

You will receive a DataFrame that contains only the columns a and b.

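If you want to keep more columns than you are dropping, put a "~" before the .isin call to select every column except the ones you list (here, every column except a and b):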

df = df.loc[:, ~df.columns.isin(['a','b'])]
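
To make the effect of "~" concrete, here is a short sketch on the question's data that builds the boolean mask once and uses it both ways. The names mask, only_ab and without_ab are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6, 7],
     [4, 3, 7, 1, 6, 9, 4],
     [8, 9, 0, 2, 4, 2, 1]],
    columns=list("abcdefg"),
)

# Boolean array: True where the column label is 'a' or 'b'.
mask = df.columns.isin(["a", "b"])

only_ab = df.loc[:, mask]       # keeps a and b (what the question asks for)
without_ab = df.loc[:, ~mask]   # "~" inverts the mask: drops a and b

print(list(only_ab.columns))
print(list(without_ab.columns))
```

So the version without "~" answers the original question, while the version with "~" is the one to use when the list names the columns to remove.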
