
pandas: how can I order a data frame by column name and add empty columns?

My data frame looks like this:

df =

        1324    1322    1323    1326    1327    1328    1329
278650  2.15    2.15    2.15    2.15    2.15            2.15
535947  2.15    2.15    2.15    2.15    2.15            2.15

I want to order the columns like this:

        1322    1323    1324    1326    1327    1328    1329
278650  2.15    2.15    2.15    2.15    2.15            2.15
535947  2.15    2.15    2.15    2.15    2.15            2.15

I tried pandas sort and sort_index:

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sort.html http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sort_index.html

but I couldn't figure out how they work.

Is there an efficient way to do this?

Also, some column labels are missing from the sequence:

1322, 1323, 1324, missing, 1326, 1327, 1328, 1329

so I want to add an empty column wherever a label is missing.

In this case:

        1322    1323    1324    1325    1326    1327    1328    1329
278650  2.15    2.15    2.15            2.15    2.15            2.15
535947  2.15    2.15    2.15            2.15    2.15            2.15

Note that the column labels range from 1322 to 1373.


I solved the first problem by doing this:

     weeks = range(1322,1374)
     df = df.loc[:,weeks]
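
A note on the snippet above: in recent pandas versions, passing `.loc` a list that contains labels absent from the frame raises a `KeyError`, so it may not run as written when columns such as 1325 are missing. A minimal sketch of an alternative, assuming `reindex` is acceptable here, which orders the columns and creates the missing ones in one step (the sample values are illustrative, copied from the question's table):

```python
import numpy as np
import pandas as pd

# Sample data shaped like the question's frame (values are illustrative).
df = pd.DataFrame(
    {1324: [2.15, 2.15], 1322: [2.15, 2.15], 1323: [2.15, 2.15],
     1326: [2.15, 2.15], 1327: [2.15, 2.15], 1328: [np.nan, np.nan],
     1329: [2.15, 2.15]},
    index=[278650, 535947],
)

weeks = list(range(1322, 1374))  # boundary stated in the question: 1322 to 1373

# reindex returns the columns in the listed order and fills
# labels that do not exist in df (e.g. 1325) with NaN.
result = df.reindex(columns=weeks)
```

Here `result` has one column per week from 1322 to 1373, and `df` itself is left untouched.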

For sorting:

http://pandas.pydata.org/pandas-docs/version/0.13.1/generated/pandas.DataFrame.sort.html

For adding a new column:

Use the index of the original df1 to create the Series (sLength is the number of rows in df1; np and pd are numpy and pandas):

df1['e'] = pd.Series(np.random.randn(sLength), index=df1.index)
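
The reason for passing the frame's index is that pandas aligns an assigned Series by label rather than by position. A small sketch of that behaviour, where the frame `df1`, its index values, and the column `f` are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-default index.
df1 = pd.DataFrame({"a": [1.0, 2.0, 3.0]}, index=[10, 20, 30])

# Built on df1's own index, so every row receives a value.
df1["e"] = pd.Series(np.random.randn(len(df1)), index=df1.index)

# A Series with a different index is matched by label;
# rows without a matching label (here 10) become NaN.
df1["f"] = pd.Series([100.0, 200.0], index=[20, 30])
```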

Try this:

df.sort_index(axis=1, inplace=True)  # sorts the DataFrame by column labels (axis=1) in place

to fix the sorting problem, and try this:

import numpy as np
desired_cols = range(1322, 1374)
for col in desired_cols:
    if col not in df.columns:
        df[col] = np.nan  # add the missing column, filled with NaN

to add the missing columns, filled with np.nan.
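
One detail worth checking: a column assigned this way is appended at the right edge of the frame, so the sort has to run after the loop, not before it. A minimal sketch, using a shortened range and a one-row frame purely for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({1324: [2.15], 1322: [2.15], 1326: [2.15]}, index=[278650])

desired_cols = range(1322, 1328)  # shortened range, for illustration only
for col in desired_cols:
    if col not in df.columns:
        df[col] = np.nan  # new columns are appended as the last column

order_after_loop = list(df.columns)  # existing columns first, then the added ones

# Sorting by column label afterwards puts everything in sequence.
df = df.sort_index(axis=1)
```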

DataFrame.sort is deprecated, so use sort_index instead.

Either sort in place:

df.sort_index(axis=1, inplace=True)

or reassign the result:

df = df.sort_index(axis=1)

In both cases,

print(df)

shows the columns in the desired order.
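
The two forms differ only in what they return. A short sketch of the difference, on a made-up one-row frame:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3]], columns=[1324, 1322, 1323])

# Without inplace, sort_index returns a new frame and leaves df as it was.
sorted_copy = df.sort_index(axis=1)
columns_before = list(df.columns)

# With inplace=True, df itself is reordered and the call returns None.
returned = df.sort_index(axis=1, inplace=True)
```

Because the in-place call returns `None`, writing `df = df.sort_index(axis=1, inplace=True)` would replace the frame with `None`; pick one form or the other.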

