Subsetting a pandas DataFrame while retaining the original size
I am trying to subset a DataFrame, but I want the new DataFrame to have the same size as the original one.
The input, the current output and the expected output are shown below.
import pandas as pd

df_input = pd.DataFrame([[1,2,3,4,5], [2,1,4,7,6], [5,6,3,7,0]], columns=["A", "B","C","D","E"])
df_output = pd.DataFrame(df_input.iloc[1:2,:])
df_expected_output = pd.DataFrame([[0,0,0,0,0], [2,1,4,7,6], [0,0,0,0,0]], columns=["A", "B","C","D","E"])
Please suggest a way forward.
After you subset, set the index back to the original with reindex. This sets all the values in the new rows to NaN, which you can replace with 0 via fillna. Since NaN is a float, the columns become float, so you can convert everything back to int with astype.
df_input.iloc[1:2,:].reindex(df_input.index).fillna(0).astype(int)
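To see why each call in that chain is needed, here is a minimal sketch that runs the steps one at a time on the question's df_input. The intermediate names (subset, restored, result, one_step) are only illustrative and are not part of the original answer.

```python
import pandas as pd

df_input = pd.DataFrame(
    [[1, 2, 3, 4, 5], [2, 1, 4, 7, 6], [5, 6, 3, 7, 0]],
    columns=["A", "B", "C", "D", "E"],
)

# Step 1: take the subset; only the row labelled 1 survives.
subset = df_input.iloc[1:2, :]

# Step 2: reindex against the original index. Rows 0 and 2 come back
# filled with NaN, which turns every column into float.
restored = subset.reindex(df_input.index)

# Step 3: replace NaN with 0, then cast back to integers.
result = restored.fillna(0).astype(int)

# Possible shortcut (assumption: zero is the fill you want everywhere):
# reindex accepts fill_value, so the NaN detour can be skipped.
one_step = subset.reindex(df_input.index, fill_value=0)

print(restored)
print(result)
```

The fill_value variant should give the same values in a single call, so it may be worth trying first if you do not need the NaN markers for anything else.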
Setup
df = pd.DataFrame([[1,2,3,4,5], [2,1,4,7,6], [5,6,3,7,0]], columns=["A", "B","C","D","E"])
output = df.iloc[1:2,:]
You can create a mask and use multiplication:
m = df.index.isin(output.index)
m[:, None] * df
   A  B  C  D  E
0  0  0  0  0  0
1  2  1  4  7  6
2  0  0  0  0  0
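The mask trick relies on NumPy-style broadcasting, so a short sketch of the shapes involved may help. The names masked and masked_mul are made up for this illustration, and the df.mul(..., axis=0) line is an alternative spelling I am assuming you might prefer, not something from the answer above. Note that multiplying by a mask only makes sense when all columns are numeric.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3, 4, 5], [2, 1, 4, 7, 6], [5, 6, 3, 7, 0]],
    columns=["A", "B", "C", "D", "E"],
)
output = df.iloc[1:2, :]

# One boolean per row of df: True where the row label is in the subset.
m = df.index.isin(output.index)
print(m.shape)           # one entry per row

# m[:, None] adds a trailing axis, giving one column that is stretched
# across all five columns of df. False acts as 0 and True as 1.
print(m[:, None].shape)
masked = m[:, None] * df

# Alternative spelling (assumption: same intent): multiply row-wise.
masked_mul = df.mul(m, axis=0)

print(masked)
```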
I would use where + between:
df_input.where(df_input.index.to_series().between(1,1),other=0)
Out[611]:
   A  B  C  D  E
0  0  0  0  0  0
1  2  1  4  7  6
2  0  0  0  0  0
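For anyone unsure what the condition looks like, here is a small sketch that builds it separately before passing it to where. The names keep, result and result_isin are illustrative, and the isin variant is an assumption about a likely follow-up need (keeping rows that are not one contiguous range), not part of the answer above.

```python
import pandas as pd

df_input = pd.DataFrame(
    [[1, 2, 3, 4, 5], [2, 1, 4, 7, 6], [5, 6, 3, 7, 0]],
    columns=["A", "B", "C", "D", "E"],
)

# between(1, 1) is inclusive on both ends, so only the row labelled 1 is True.
keep = df_input.index.to_series().between(1, 1)

# where() keeps values where the condition is True and writes `other` elsewhere.
result = df_input.where(keep, other=0)

# Hypothetical variant: isin() takes an arbitrary list of row labels.
keep_some = df_input.index.to_series().isin([1])
result_isin = df_input.where(keep_some, other=0)

print(keep)
print(result)
```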
One more option is to create a DataFrame filled with zeros and then update it with the df_input slice:
df_output = pd.DataFrame(0, index=df_input.index, columns = df_input.columns)
df_output.update(df_input.iloc[1:2,:])
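One thing that can trip people up is that update works in place and does not return the new frame. A minimal sketch of that behaviour follows; the names returned and as_int are illustrative. The final astype(int) is a precaution on my part, since depending on the pandas version the updated columns may come back as float.

```python
import pandas as pd

df_input = pd.DataFrame(
    [[1, 2, 3, 4, 5], [2, 1, 4, 7, 6], [5, 6, 3, 7, 0]],
    columns=["A", "B", "C", "D", "E"],
)

# Start from an all-zero frame with the same index and columns.
df_output = pd.DataFrame(0, index=df_input.index, columns=df_input.columns)

# update() aligns on index and columns and overwrites matching cells
# in place; the call itself returns None.
returned = df_output.update(df_input.iloc[1:2, :])
print(returned)

# Cast defensively in case the update changed the column dtypes.
as_int = df_output.astype(int)
print(as_int)
```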