pandas: how can I order a data frame by column name and add empty columns?
My data frame looks like this:
df =
1324 1322 1323 1326 1327 1328 1329
278650 2.15 2.15 2.15 2.15 2.15 2.15
535947 2.15 2.15 2.15 2.15 2.15 2.15
I want to order the columns like this:
1322 1323 1324 1326 1327 1328 1329
278650 2.15 2.15 2.15 2.15 2.15 2.15
535947 2.15 2.15 2.15 2.15 2.15 2.15
I tried to use pandas sort and sort_index:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sort.html
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.sort_index.html
but couldn't figure out how they work.
Is there an efficient way to do this?
The columns also have gaps:
1322, 1323, 1324, missing, 1326, 1327, 1328, 1329
so I want to add an empty column wherever one is missing.
In this case:
1322 1323 1324 1325 1326 1327 1328 1329
278650 2.15 2.15 2.15 2.15 2.15 2.15
535947 2.15 2.15 2.15 2.15 2.15 2.15
Note that the columns range from 1322 to 1373.
I solved the first problem by doing this:
weeks = range(1322,1374)
df = df.loc[:,weeks]
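A note on the snippet above: in recent pandas versions (1.0 and later), passing labels that do not exist to .loc raises a KeyError, so it only works when every week is already a column. A minimal sketch of an alternative, assuming a small made-up frame and a shortened week range, is reindex, which orders the columns and creates the missing ones in one step:

```python
import pandas as pd

# Hypothetical sample data: columns are unsorted and 1325 is absent.
df = pd.DataFrame(
    [[2.15] * 7, [2.15] * 7],
    index=[278650, 535947],
    columns=[1324, 1322, 1323, 1326, 1327, 1328, 1329],
)

weeks = range(1322, 1330)  # shortened from 1322..1373 for readability

# reindex returns a new frame whose columns follow `weeks`;
# labels not present in df become columns filled with NaN.
result = df.reindex(columns=weeks)
print(result)
```

With the real data, weeks would be range(1322, 1374) as in the question.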
For sorting:
http://pandas.pydata.org/pandas-docs/version/0.13.1/generated/pandas.DataFrame.sort.html
For adding a new column:
Use the original df1 index to create the Series:
# assumes `import numpy as np` and `import pandas as pd`; sLength must equal the number of rows in df1
df1['e'] = pd.Series(np.random.randn(sLength), index=df1.index)
Try this:
df.sort_index(axis=1, inplace=True)  # sorts the DataFrame by column labels (axis=1) in place
to fix the sorting problem, and try this:
import numpy as np
import pandas as pd

desired_cols = range(1322, 1374)
for col in desired_cols:
    if col not in df.columns:
        df[col] = np.nan  # create the missing column, filled with NaN
to add the missing columns, filled with np.nan values. New columns are appended at the end of the frame, so sort the columns after adding them.
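If it helps to see which columns are absent before adding them, a rough sketch using Index.difference is shown below. The sample frame and the shortened range are made up for illustration; only the pandas calls themselves are real API.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame with column 1325 absent.
df = pd.DataFrame(
    [[2.15] * 4, [2.15] * 4],
    index=[278650, 535947],
    columns=[1324, 1322, 1323, 1326],
)

desired_cols = pd.Index(range(1322, 1327))

# Labels that are in desired_cols but not in df.columns.
missing = desired_cols.difference(df.columns)

for col in missing:
    df[col] = np.nan  # appended at the end, filled with NaN

df = df.sort_index(axis=1)  # restore ascending column order
print(df)
```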
Since sort is now deprecated, use sort_index:
Like:
df.sort_index(axis=1, inplace=True)
Or:
df = df.sort_index(axis=1)
In both cases:
print(df)
gives the desired result.
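The difference between the two forms can be seen on a tiny frame. This is only an illustrative sketch with made-up values: without inplace=True, sort_index returns a new frame and leaves the original untouched.

```python
import pandas as pd

# Hypothetical one-row frame with unsorted column labels.
df = pd.DataFrame([[1, 2, 3]], columns=[1324, 1322, 1323])

# Returns a new frame; df itself keeps its original column order.
sorted_df = df.sort_index(axis=1)

# Values move together with their labels, so 1322 still holds 2.
print(sorted_df)
```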