How to solve the issue of filling all the values with None?
I am trying to fill the missing values in a data frame, but all of the values were replaced with None.
Here is the example I have tried:
# Basic libraries
import os
import pandas as pd
import numpy as np
# Visualization libraries
import matplotlib.pyplot as plt
import seaborn as sns
import folium
#import folium.plugins as plugins
from wordcloud import WordCloud
import plotly.express as px
data_dict = {'First': [100, 90, np.nan, 95],
             'Second': [30, 45, 56, np.nan],
             'Third': [np.nan, 40, 80, 98]}
# Creating a dataframe from a dict
df1 = pd.DataFrame(data_dict)
#first_try_with_column_name
df1.loc[:,'First'] = df1.loc[:,'First'].fillna(method='ffill', inplace=True)
#Second_try_Using_List_of_Columns
list_columns = ['First','Second','Third']
df1.loc[:,list_columns] = df1.loc[:,list_columns].fillna(value, inplace=True)
df1
As shown, I tried several approaches to understand the cause of this issue: first with a single column name, then with a list of column names. Unfortunately, the result is the same in both cases.
Is there any recommendation, please?
Change
df1.loc[:,'First'] = df1.loc[:,'First'].fillna(method='ffill', inplace=True)
to
df1.loc[:,'First'].fillna(method='ffill', inplace=True)
This is because you are using inplace=True, which means the changes are made to the original object rather than returned.
As for the None values, they come from the function itself: because the operation is in place, there is nothing to return, so it returns None. Assigning that None back to the column is what turns every value into None.
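To make the explanation above concrete, here is a minimal sketch (the names scratch and returned are made up for illustration). It assumes a pandas version in which an in-place call returns None, as in the question. It also uses ffill() rather than fillna(method='ffill'), since newer pandas releases deprecate the method argument.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"First": [100, 90, np.nan, 95]})

# In-place call on a standalone copy: the copy is modified,
# and the call itself hands back nothing useful to assign.
scratch = df["First"].copy()
returned = scratch.fillna(0, inplace=True)
print(type(returned))  # NoneType on the pandas versions discussed here

# Without inplace, a new filled Series is returned and can be assigned back.
df["First"] = df["First"].ffill()
print(df)
```

Assigning the returned Series back, as in the last step, is the form that avoids the None problem altogether.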
For each column:
for col in df1.columns:
    # Assign the filled column back instead of relying on inplace=True
    df1[col] = df1[col].fillna(10)
df1
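As a possible alternative to the loop, fillna can be called once on the whole frame, either with a single value or with a per-column mapping. The fill values below (10, 0 and the column mean) are arbitrary choices for illustration, not something the answer prescribes.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "First": [100, 90, np.nan, 95],
    "Second": [30, 45, 56, np.nan],
    "Third": [np.nan, 40, 80, 98],
})

# One constant for every missing cell; returns a new DataFrame.
filled_constant = df.fillna(10)

# A different fill value per column, given as a dict keyed by column name.
filled_per_column = df.fillna({
    "First": 0,
    "Second": df["Second"].mean(),
    "Third": 10,
})

print(filled_constant)
print(filled_per_column)
```

Neither call touches df itself, so the original missing values remain available if a different strategy is needed later.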
PS: For future readers: avoid inplace. See the Stack Overflow question "In pandas, is inplace = True considered harmful, or not?"
If you want to forward fill, you can just do:
df1 = df1.ffill()
This results in:
   First  Second  Third
0  100.0    30.0    NaN
1   90.0    45.0   40.0
2   90.0    56.0   80.0
3   95.0    56.0   98.0
There's still one NaN value (the first row of Third has nothing before it to copy from), so we can backfill as well:
df1 = df1.bfill()
Final result:
   First  Second  Third
0  100.0    30.0   40.0
1   90.0    45.0   40.0
2   90.0    56.0   80.0
3   95.0    56.0   98.0
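The two steps can also be chained into a single expression. A small sketch using the same data as the question (the name result is just for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "First": [100, 90, np.nan, 95],
    "Second": [30, 45, 56, np.nan],
    "Third": [np.nan, 40, 80, 98],
})

# Forward fill first, then backfill whatever is still missing
# (here, only the leading NaN in 'Third').
result = df.ffill().bfill()
print(result)
```

The order matters: ffill() first means a gap is filled from the row above whenever one exists, and bfill() only handles leading gaps.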
If you only want to forward fill NaNs in specific columns, then use the following. Please note I am NOT using inplace=True. That was the reason why your code wasn't working before.
columns_to_fillna = ['Second', 'Third']
df1.loc[:, columns_to_fillna] = df1.loc[:, columns_to_fillna].ffill()
If you really want to use inplace=True, which is not advised, then do:
columns_to_fillna = ['Second', 'Third']
df1.loc[:, columns_to_fillna].ffill(inplace=True)
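One caveat worth checking before relying on this form: selecting a list of columns with .loc generally produces a new object, so the in-place fill may be applied to that temporary object and leave the original frame untouched. The sketch below (assuming a reasonably recent pandas; the variable names are illustrative) prints whether the original frame changed, then applies the assignment form, which does not depend on that behaviour.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "First": [100, 90, np.nan, 95],
    "Second": [30, 45, 56, np.nan],
    "Third": [np.nan, 40, 80, 98],
})
cols = ["Second", "Third"]

# In-place fill on a selection: check whether df itself was modified.
# (Some pandas versions may emit a warning on this line.)
df.loc[:, cols].ffill(inplace=True)
print("NaNs left in 'Second' after in-place call:", df["Second"].isna().sum())

# Assignment form: works whether or not the selection was a copy.
df[cols] = df[cols].ffill()
print(df)
```

If the first print reports a remaining NaN, the in-place call did not reach the original frame on that pandas version, which is one more reason to prefer assignment.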
The reasons why inplace is not advised are discussed here:
https://stackoverflow.com/a/60020384/6366770