I have the following DataFrame:
     A    B    C    D
0  0.0  0.0  0.0  0.0
1  1.0  1.0  1.0  1.0
2  NaN  2.0  2.0  2.0
3  NaN  3.0  3.0  3.0
4  NaN  4.0  4.0  NaN
5  NaN  NaN  5.0  NaN
6  NaN  NaN  6.0  NaN
I am working to generate visualizations with this data, and I need to fill the null values in a very specific way. I want to loop the existing values repeatedly for each column until the null values are all filled, so that the DataFrame looks like this:
     A    B    C    D
0  0.0  0.0  0.0  0.0
1  1.0  1.0  1.0  1.0
2  0.0  2.0  2.0  2.0
3  1.0  3.0  3.0  3.0
4  0.0  4.0  4.0  0.0
5  1.0  0.0  5.0  1.0
6  0.0  1.0  6.0  2.0
Is there any convenient way to do this in Pandas?
You can apply a custom function to each column that collects the non-null values and then extends them to the full length of the DataFrame. This can be done with np.resize, which repeats its input as many times as needed, as follows:
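import numpy as np

def f(x):
    # keep only the non-null values of the column
    vals = x[~x.isnull()].values
    # repeat them until the array has the same length as the column
    vals = np.resize(vals, len(x))
    return vals

df = df.apply(f, axis=0)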
Result:
     A    B    C    D
0  0.0  0.0  0.0  0.0
1  1.0  1.0  1.0  1.0
2  0.0  2.0  2.0  2.0
3  1.0  3.0  3.0  3.0
4  0.0  4.0  4.0  0.0
5  1.0  0.0  5.0  1.0
6  0.0  1.0  6.0  2.0
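For anyone who wants to run the np.resize approach end to end, here is a minimal self-contained sketch. The name cycle_fill and the guard for all-NaN columns are my own additions, not part of the answer above. The guard is there because np.resize zero-fills when given an empty array, which would silently turn an all-NaN column into zeros.

```python
import numpy as np
import pandas as pd

nan = np.nan
# The DataFrame from the question
df = pd.DataFrame({
    "A": [0.0, 1.0, nan, nan, nan, nan, nan],
    "B": [0.0, 1.0, 2.0, 3.0, 4.0, nan, nan],
    "C": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "D": [0.0, 1.0, 2.0, 3.0, nan, nan, nan],
})

def cycle_fill(col):
    """Repeat the non-null values of a column until it is full (hypothetical helper)."""
    vals = col.dropna().to_numpy()
    if vals.size == 0:
        # Assumption: leave an all-NaN column untouched instead of zero-filling it
        return col
    # np.resize repeats vals as often as needed to reach len(col)
    return pd.Series(np.resize(vals, len(col)), index=col.index)

filled = df.apply(cycle_fill, axis=0)
print(filled)
```

Like the original function, this keeps the non-null values in order and drops the positions of the nulls, so it matches the desired output only when the NaNs sit at the end of each column.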
One option is a for loop, assuming that any NaNs sit at the end of each column. Use np.place to fill the nulls:
[np.place(df[col].to_numpy(),
          df[col].isna(),
          df[col].dropna().array)
 for col in df
 if df[col].hasnans]
[None, None, None]
df
     A    B    C    D
0  0.0  0.0  0.0  0.0
1  1.0  1.0  1.0  1.0
2  0.0  2.0  2.0  2.0
3  1.0  3.0  3.0  3.0
4  0.0  4.0  4.0  0.0
5  1.0  0.0  5.0  1.0
6  0.0  1.0  6.0  2.0
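Note that np.place operates in place, so no assignment is needed. This relies on df[col].to_numpy() returning a writable view of the column's data, which pandas does not guarantee (for example, with Copy-on-Write enabled).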
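If you would rather not depend on to_numpy() handing back a view, a variant of the same np.place idea is to work on an explicit copy and assign it back. This is only a sketch under that assumption. The explicit loop and the check that skips all-NaN columns are my additions (np.place raises an error if asked to fill from an empty array).

```python
import numpy as np
import pandas as pd

nan = np.nan
# The DataFrame from the question
df = pd.DataFrame({
    "A": [0.0, 1.0, nan, nan, nan, nan, nan],
    "B": [0.0, 1.0, 2.0, 3.0, 4.0, nan, nan],
    "C": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "D": [0.0, 1.0, 2.0, 3.0, nan, nan, nan],
})

for col in df.columns:
    mask = df[col].isna().to_numpy()
    # Skip columns with nothing to fill, or nothing to fill from
    if mask.any() and not mask.all():
        arr = df[col].to_numpy(copy=True)  # private, writable copy
        # np.place writes arr[~mask] into the masked slots, repeating it if too short
        np.place(arr, mask, arr[~mask])
        df[col] = arr  # explicit assignment instead of relying on a view

print(df)
```

The extra copy costs a little memory per column, but the result no longer depends on whether pandas exposes the underlying buffer.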
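Just do
df = pd.DataFrame(....).fillna(0.0)
Note that this replaces every NaN with 0.0 rather than cycling through the existing values, so it does not reproduce the desired output shown in the question.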