How to impute the missing values depending on previous values?

I have this data:

city        state    country Continent
Saint-Denis NaN      France  Europe
Saint-Denis NaN      NaN     Europe
Saint-Denis NaN      NaN     Europe
Kinshasa    NaN      NaN     Africa
Kinshasa    NaN      NaN     Africa

I want to write a function that looks at similar rows (rows with the same city) and imputes the missing country value from them.

I'm using the code below:

for i in range(0, len(df)):
    if df['city'][i] == 'Saint-Denis' and pd.isnull(df['country'].iloc[i]):
        df.country = 'France'
    else:
        pass

It replaces the NaN, but not only for the specific city: it replaces all NaN values.

You have a typo in the third line of your solution: you are missing the index, so df.country = 'France' assigns to the whole column. It should be df.country[i] = 'France'. Also, you can get the same result with pandas apply, which should be faster than the explicit loop:

df["country"] = df.apply(lambda x: "France" if (x.city=="Saint-Denis" and pd.isnull(x.country)) else x.country, axis=1)
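As a further sketch (not part of the original answer), the same fix can be written without a row-wise loop or apply. The first variant uses a boolean mask with .loc, which also avoids the chained assignment in df.country[i] = .... The second variant is one possible reading of "impute from similar cases": it fills each missing country with the first known country for the same city. The sample DataFrame is rebuilt from the question, and the names by_mask, by_group, and known are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "city": ["Saint-Denis", "Saint-Denis", "Saint-Denis", "Kinshasa", "Kinshasa"],
    "state": [np.nan] * 5,
    "country": ["France", np.nan, np.nan, np.nan, np.nan],
    "Continent": ["Europe", "Europe", "Europe", "Africa", "Africa"],
})

# Variant 1: boolean mask + .loc, touching only Saint-Denis rows with a missing country.
by_mask = df.copy()
mask = (by_mask["city"] == "Saint-Denis") & by_mask["country"].isna()
by_mask.loc[mask, "country"] = "France"

# Variant 2: fill each missing country from the first non-null country of the same city.
# Cities with no known country at all (Kinshasa here) stay missing.
by_group = df.copy()
known = by_group.groupby("city")["country"].transform("first")
by_group["country"] = by_group["country"].fillna(known)

print(by_mask)
print(by_group)
```

The second variant only works when at least one row for a city already has a country. For Kinshasa, the country would still have to come from somewhere else, such as a lookup table.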
