
Subsetting a pandas DataFrame while retaining the original size

I am trying to subset a DataFrame, but I want the new DataFrame to have the same size as the original one.
The input, the current output, and the expected output are shown below.

import pandas as pd

df_input = pd.DataFrame([[1,2,3,4,5], [2,1,4,7,6], [5,6,3,7,0]], columns=["A", "B","C","D","E"])

df_output = pd.DataFrame(df_input.iloc[1:2,:])

df_expected_output = pd.DataFrame([[0,0,0,0,0], [2,1,4,7,6], [0,0,0,0,0]], columns=["A", "B","C","D","E"])

Please suggest the way forward.

After subsetting, set the index back to the original one with reindex. This fills every value in the restored rows with NaN, which you can replace with 0 via fillna. Since NaN is a float, the columns are upcast to float, so convert everything back to int with astype.

df_input.iloc[1:2,:].reindex(df_input.index).fillna(0).astype(int)
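To make the intermediate states visible, here is a minimal sketch of the same chain split into steps. The names subset, restored, result and result_direct are only illustrative. The last line is a variant that is not part of the answer above: it assumes you are happy to pass fill_value to reindex, which fills the missing rows directly and so avoids the trip through float.

```python
import pandas as pd

df_input = pd.DataFrame(
    [[1, 2, 3, 4, 5], [2, 1, 4, 7, 6], [5, 6, 3, 7, 0]],
    columns=["A", "B", "C", "D", "E"],
)

subset = df_input.iloc[1:2, :]             # only the row labelled 1
restored = subset.reindex(df_input.index)  # rows 0 and 2 return as NaN, columns become float
result = restored.fillna(0).astype(int)    # NaN -> 0, then back to an integer dtype

# Variant (an assumption, not from the answer): fill the missing rows while reindexing.
result_direct = subset.reindex(df_input.index, fill_value=0)

print(restored)
print(result)
```

Both result and result_direct should hold the same values as df_expected_output from the question.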

Setup

import pandas as pd

df = pd.DataFrame([[1,2,3,4,5], [2,1,4,7,6], [5,6,3,7,0]], columns=["A", "B","C","D","E"])
output = df.iloc[1:2,:]

You can create a mask and use multiplication:

m = df.index.isin(output.index)
m[:, None] * df

   A  B  C  D  E
0  0  0  0  0  0
1  2  1  4  7  6
2  0  0  0  0  0
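Why the multiplication works may not be obvious, so here is a small sketch that names each piece. The variable column_mask is only illustrative. The trick relies on the columns being numeric: False behaves as 0 and True as 1, and the (3, 1) mask is broadcast across all five columns.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3, 4, 5], [2, 1, 4, 7, 6], [5, 6, 3, 7, 0]],
    columns=["A", "B", "C", "D", "E"],
)
output = df.iloc[1:2, :]

m = df.index.isin(output.index)  # 1-D boolean NumPy array, one entry per row of df
column_mask = m[:, None]         # reshaped to (3, 1): a single True/False per row
result = column_mask * df        # rows marked False become 0, rows marked True are kept

print(m)
print(column_mask.shape)
print(result)
```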

I would use where + between:

df_input.where(df_input.index.to_series().between(1,1),other=0)
Out[611]: 
   A  B  C  D  E
0  0  0  0  0  0
1  2  1  4  7  6
2  0  0  0  0  0

One more option is to create a DataFrame of zeros and then update it with the df_input slice:

df_output = pd.DataFrame(0, index=df_input.index, columns = df_input.columns)
df_output.update(df_input.iloc[1:2,:])
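A self-contained sketch of this approach follows. Note that update modifies df_output in place and returns None, so its result should not be assigned. The final astype(int) is an addition that is not in the answer above: it assumes you want integer columns regardless of how your pandas version handles dtypes during update.

```python
import pandas as pd

df_input = pd.DataFrame(
    [[1, 2, 3, 4, 5], [2, 1, 4, 7, 6], [5, 6, 3, 7, 0]],
    columns=["A", "B", "C", "D", "E"],
)

# Start from an all-zero frame with the same index and columns as the input.
df_output = pd.DataFrame(0, index=df_input.index, columns=df_input.columns)

# Overwrite the matching rows in place, aligned on index and column labels.
df_output.update(df_input.iloc[1:2, :])

# Assumption: force integer columns in case update left them as float.
df_output = df_output.astype(int)

print(df_output)
```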

