My data frame looks like this:
df =
1324 1322 1323 1326 1327 1328 1329
278650 2.15 2.15 2.15 2.15 2.15 2.15
535947 2.15 2.15 2.15 2.15 2.15 2.15
I want to sort the columns in ascending order, like this:
1322 1323 1324 1326 1327 1328 1329
278650 2.15 2.15 2.15 2.15 2.15 2.15
535947 2.15 2.15 2.15 2.15 2.15 2.15
I tried to use pandas sort and sort_index:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sort.html
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sort_index.html
but I couldn't figure out how they work.
Is there an efficient way to do this?
The column labels also have gaps:
1322, 1323, 1324, missing, 1326, 1327, 1328, 1329
so I want to add an empty column wherever a label is missing.
In this case:
1322 1323 1324 1325 1326 1327 1328 1329
278650 2.15 2.15 2.15 2.15 2.15 2.15
535947 2.15 2.15 2.15 2.15 2.15 2.15
Note that the column labels range from 1322 to 1373.
I solved the first problem by doing this:
weeks = range(1322,1374)
df = df.loc[:,weeks]
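A note on the snippet above: in recent pandas versions, selecting with .loc and a list containing labels that are not present raises a KeyError, so it may fail once some weeks are missing. A minimal sketch of the same idea with reindex follows, which orders the columns and creates NaN columns for the gaps in one step. The toy frame and its values are made up for illustration; only the 1322 to 1373 range comes from the question.

```python
import pandas as pd

# Toy frame standing in for the one in the question (values are illustrative).
df = pd.DataFrame(
    [[2.15] * 7, [2.15] * 7],
    index=[278650, 535947],
    columns=[1324, 1322, 1323, 1326, 1327, 1328, 1329],
)

# 1322..1373 inclusive, the column range stated in the question.
weeks = list(range(1322, 1374))

# reindex returns a new frame whose columns follow `weeks`;
# labels absent from df (e.g. 1325) become all-NaN columns.
result = df.reindex(columns=weeks)

print(result.iloc[:, :5])
```

Since reindex returns a new frame, assign the result back to df if you want to keep it.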
For sorting, see:
http://pandas.pydata.org/pandas-docs/version/0.13.1/generated/pandas.DataFrame.sort.html
For adding a new column:
Use the index of the original DataFrame (df1 here) to create the Series, so that the values align row by row:
df1['e'] = pd.Series(np.random.randn(len(df1)), index=df1.index)  # assumes numpy imported as np, pandas as pd
Try this to fix the sorting problem:
df.sort_index(axis=1, inplace=True)  # sorts the DataFrame by column labels (axis=1) in place
and try this to add the missing columns, filled with np.nan:
import numpy as np
desired_cols = range(1322, 1374)
for col in desired_cols:
    if col not in df.columns:
        df[col] = np.nan  # new column, NaN in every row
New columns are appended at the end, so run the sort after adding them.
Since sort is deprecated, use sort_index instead, either in place:
df.sort_index(axis=1, inplace=True)
or by reassigning the result:
df = df.sort_index(axis=1)
In both cases,
print(df)
shows the columns in the wanted order.
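To make the difference between the two forms concrete, here is a small sketch on a made-up three-column frame (the data and the variable names sorted_copy, in_place and ret are illustrative, not from the question). It shows that the reassigning form leaves the original untouched, while inplace=True modifies the frame and returns None.

```python
import pandas as pd

# Illustrative frame with unsorted integer column labels.
df = pd.DataFrame(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    index=[278650, 535947],
    columns=[1324, 1322, 1323],
)

# Form 1: returns a new frame sorted by column label; df is unchanged.
sorted_copy = df.sort_index(axis=1)

# Form 2: sorts the frame itself; the call returns None.
in_place = df.copy()
ret = in_place.sort_index(axis=1, inplace=True)

print(list(sorted_copy.columns))
print(ret)
```

Because the in-place call returns None, writing df = df.sort_index(axis=1, inplace=True) would replace df with None; pick one form or the other.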