
Set order of columns in pandas dataframe

Is there a way to reorder the columns of a pandas dataframe based on my personal preference (i.e., not sorted alphabetically or numerically, but following certain conventions)?

Simple example:

import pandas as pd

frame = pd.DataFrame({
    'one thing': [1, 2, 3, 4],
    'second thing': [0.1, 0.2, 1, 2],
    'other thing': ['a', 'e', 'i', 'o']})

produces this:

   one thing other thing  second thing
0          1           a           0.1
1          2           e           0.2
2          3           i           1.0
3          4           o           2.0

But instead, I would like this:

   one thing second thing  other thing
0          1           0.1           a
1          2           0.2           e
2          3           1.0           i
3          4           2.0           o

(Please provide a generic solution rather than one specific to this case. Many thanks.)

Just select the order yourself by typing in the column names. Note the double brackets:

frame = frame[['column I want first', 'column I want second'...etc.]]
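As a concrete sketch of the same idea on the frame from the question (the name `reordered` is only for illustration):

```python
import pandas as pd

frame = pd.DataFrame({
    'one thing': [1, 2, 3, 4],
    'other thing': ['a', 'e', 'i', 'o'],
    'second thing': [0.1, 0.2, 1, 2],
})

# The inner list is the desired column order; the outer brackets select those columns.
reordered = frame[['one thing', 'second thing', 'other thing']]
print(reordered.columns.tolist())
```

Selecting with a list raises a KeyError if any name is missing from the frame, so typos surface immediately.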

You can use this:

columnsTitles = ['one thing', 'second thing', 'other thing']

frame = frame.reindex(columns=columnsTitles)
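One thing to watch with reindex: the labels have to match the existing column names exactly. A small sketch, assuming the frame from the question (`reordered` and `typo` are illustrative names):

```python
import pandas as pd

frame = pd.DataFrame({
    'one thing': [1, 2, 3, 4],
    'other thing': ['a', 'e', 'i', 'o'],
    'second thing': [0.1, 0.2, 1, 2],
})

column_titles = ['one thing', 'second thing', 'other thing']
reordered = frame.reindex(columns=column_titles)

# A misspelled label does not raise an error; reindex adds it as an all-NaN column.
typo = frame.reindex(columns=['one thing', 'secondthing'])
print(typo['secondthing'].isna().all())
```

If you would rather get an error for an unknown name, plain selection with frame[column_titles] raises a KeyError instead.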

Here is a solution I use very often. When you have a large data set with tons of columns, you definitely do not want to rearrange all of them by hand.

What you can do, and most likely want to do, is order just the first few columns that you use frequently and leave all the other columns where they are. This is a common approach in R: df %>% select(one, two, three, everything())

So first, manually type the columns that you want positioned before all the others into a list, cols_to_order.

Then construct the list of new columns by appending the rest of the columns:

new_columns = cols_to_order + (frame.columns.drop(cols_to_order).tolist())

After this, you can use new_columns as the other solutions suggest.

import pandas as pd
frame = pd.DataFrame({
    'one thing': [1, 2, 3, 4],
    'other thing': ['a', 'e', 'i', 'o'],
    'more things': ['a', 'e', 'i', 'o'],
    'second thing': [0.1, 0.2, 1, 2],
})

cols_to_order = ['one thing', 'second thing']
new_columns = cols_to_order + (frame.columns.drop(cols_to_order).tolist())
frame = frame[new_columns]

   one thing  second thing other thing more things
0          1           0.1           a           a
1          2           0.2           e           e
2          3           1.0           i           i
3          4           2.0           o           o
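If you do this often, the same steps fit in a small helper. This is only a sketch; `move_to_front` is a hypothetical name, not a pandas function:

```python
import pandas as pd

def move_to_front(df, first_cols):
    """Return df with first_cols leading and the remaining columns in their original order."""
    rest = [c for c in df.columns if c not in first_cols]
    return df[list(first_cols) + rest]

frame = pd.DataFrame({
    'one thing': [1, 2, 3, 4],
    'other thing': ['a', 'e', 'i', 'o'],
    'more things': ['a', 'e', 'i', 'o'],
    'second thing': [0.1, 0.2, 1, 2],
})

frame = move_to_front(frame, ['one thing', 'second thing'])
print(frame.columns.tolist())
```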

You could also do something like df = df[['x', 'y', 'a', 'b']]

import pandas as pd
frame = pd.DataFrame({
    'one thing': [1, 2, 3, 4],
    'second thing': [0.1, 0.2, 1, 2],
    'other thing': ['a', 'e', 'i', 'o']})
frame = frame[['second thing', 'other thing', 'one thing']]
print(frame)
   second thing other thing  one thing
0           0.1           a          1
1           0.2           e          2
2           1.0           i          3
3           2.0           o          4

Also, you can get the list of columns with:

cols = list(df.columns.values)

This produces something like:

['x', 'y', 'a', 'b']

This list is then easy to rearrange manually.
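For instance, a minimal sketch of moving one column to the front by editing that list (the column names x, y, a, b are just the placeholders used above):

```python
import pandas as pd

df = pd.DataFrame({'x': [1], 'y': [2], 'a': [3], 'b': [4]})

cols = list(df.columns)
# Take 'a' out of its current position and put it first.
cols.insert(0, cols.pop(cols.index('a')))
df = df[cols]
print(df.columns.tolist())  # ['a', 'x', 'y', 'b']
```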

Construct it with a list instead of a dictionary:

frame = pd.DataFrame([
        [1, .1, 'a'],
        [2, .2, 'e'],
        [3,  1, 'i'],
        [4,  4, 'o']
    ], columns=['one thing', 'second thing', 'other thing'])

frame

   one thing  second thing other thing
0          1           0.1           a
1          2           0.2           e
2          3           1.0           i
3          4           4.0           o

You can also use OrderedDict:

In [183]: from collections import OrderedDict

In [184]: data = OrderedDict()

In [185]: data['one thing'] = [1,2,3,4]

In [186]: data['second thing'] = [0.1,0.2,1,2]

In [187]: data['other thing'] = ['a','e','i','o']

In [188]: frame = pd.DataFrame(data)

In [189]: frame
Out[189]:
   one thing  second thing other thing
0          1           0.1           a
1          2           0.2           e
2          3           1.0           i
3          4           2.0           o

Add the 'columns' parameter:

frame = pd.DataFrame({
        'one thing':[1,2,3,4],
        'second thing':[0.1,0.2,1,2],
        'other thing':['a','e','i','o']},
        columns=['one thing', 'second thing', 'other thing']
)
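The columns argument also selects: when the data is a dict, only the listed keys are kept, in the listed order. A short sketch (the names `data` and `subset` are illustrative):

```python
import pandas as pd

data = {
    'one thing': [1, 2, 3, 4],
    'second thing': [0.1, 0.2, 1, 2],
    'other thing': ['a', 'e', 'i', 'o'],
}

# 'second thing' is not listed, so it is left out of the resulting frame.
subset = pd.DataFrame(data, columns=['other thing', 'one thing'])
print(subset.columns.tolist())
```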

Try indexing by position. Since you want a generic solution and not one tied to these column names, the index order can be whatever you like:

l=[0,2,1] # index order
frame=frame[[frame.columns[i] for i in l]]

Now:

print(frame)

prints:

   one thing second thing  other thing
0          1           0.1           a
1          2           0.2           e
2          3           1.0           i
3          4           2.0           o
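The same positional idea can be written without the list comprehension, since a pandas Index accepts a list of positions. A sketch, assuming the columns start out in the order one thing, other thing, second thing:

```python
import pandas as pd

frame = pd.DataFrame({
    'one thing': [1, 2, 3, 4],
    'other thing': ['a', 'e', 'i', 'o'],
    'second thing': [0.1, 0.2, 1, 2],
})

order = [0, 2, 1]  # positions of the existing columns, in the order wanted
frame = frame[frame.columns[order]]
print(frame.columns.tolist())
```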

Even though this is an old question, you can also use loc and iloc:

frame = frame.loc[:, ['column I want first', 'column I want second', "other thing"]]

frame = frame.iloc[:, [0, 2, 1]]  # positions are 0-based and must all exist in the frame
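To make the two forms concrete, here is a sketch on the three-column frame from the question, assuming the starting order one thing, other thing, second thing (`by_label` and `by_position` are illustrative names):

```python
import pandas as pd

frame = pd.DataFrame({
    'one thing': [1, 2, 3, 4],
    'other thing': ['a', 'e', 'i', 'o'],
    'second thing': [0.1, 0.2, 1, 2],
})

# loc reorders by column label, iloc by 0-based column position.
by_label = frame.loc[:, ['one thing', 'second thing', 'other thing']]
by_position = frame.iloc[:, [0, 2, 1]]
print(by_label.columns.equals(by_position.columns))
```

With iloc, every position must be smaller than the number of columns, otherwise pandas raises an IndexError.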

I find this to be the most straightforward approach that works:

df = pd.DataFrame({
        'one thing':[1,2,3,4],
        'second thing':[0.1,0.2,1,2],
        'other thing':['a','e','i','o']})

df = df[['one thing','second thing', 'other thing']]
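# reindex does the same job; labels that are not in the frame would come back as all-NaN columns
df = df.reindex(columns=['one thing', 'second thing', 'other thing'])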
