
How to get the indices of NaN values in a DataFrame and, after filling, set them to NaN again?

Time                           A            B           C             D              E           F           G         H            I              K                                                                                                 
2019-06-17 08:45:00     12089.89     12089.89    12087.71      12087.71       0.026626    2495.595    2495.595   2486.095     2488.095      0.000705
2019-06-17 08:46:00     12087.91          NaN    12087.71      12087.91       0.023684    2489.095    2490.095   2486.095     2486.095      0.000613
2019-06-17 08:47:00     12088.21     12088.21    12084.21      12085.21       0.028582    2487.095    2487.595   2485.095     2486.095      0.000516
2019-06-17 08:48:00     12085.09     12090.21    12084.91      12089.41       0.033238    2485.095    2485.595   2485.095     2485.095      0.000108
2019-06-17 08:49:00     12089.71     12090.21    12087.21      12088.21       0.033204    2484.095    2484.095   2484.095     2484.095      0.000010
                         ...          ...         ...           ...            ...         ...         ...        ...          ...           ...
2019-07-08 23:03:00     12504.11     12504.11    12504.11      12504.11       0.000734         NaN         NaN        NaN          NaN           NaN
2019-07-08 23:04:00     12504.11          NaN    12503.11      12503.11       0.002394         NaN         NaN        NaN          NaN           NaN
2019-07-08 23:08:00     12504.11     12504.11    12503.11      12503.11       0.002294         NaN         NaN        NaN          NaN           NaN
2019-07-08 23:09:00     12503.61     12503.61    12503.61      12503.61       0.000734         NaN         NaN        NaN          NaN           NaN
2019-07-08 23:10:00     12503.61     12503.61    12503.11      12503.11       0.002294         NaN         NaN        NaN          NaN           NaN

In a DataFrame like this, how do I get the locations of all rows that contain NaN (rows with at least one NaN, not only rows where the whole row is NaN)? After that the gaps are filled with .ffill(), but later I need to set these specific indices back to NaN.

#1. drop the rows that are entirely NaN (dropna works on rows by default)
df2 = df.dropna(how='all')

#2. select the indices of the rows that still contain NaN.
indices = ???

#3. filling (ffill returns a new frame, so assign the result)
df2 = df2.ffill()

#4. irrelevant manipulation and extraction stuff
#...

#5. set the NaNs back to where they were.
# df[indices] = ...

Try:

indices = df.loc[df.isnull().any(axis=1)].index

Sample DataFrame:

df:

    a   b   c   d
0   NaN 1   2   NaN
1   1.0 2   3   4.0
2   NaN 1   2   3.0
3   1.0 2   3   NaN
4   1.0 4   5   6.0

indices:

Int64Index([0, 2, 3], dtype='int64')

df.loc[indices]:

    a   b   c   d
0   NaN 1   2   NaN
2   NaN 1   2   3.0
3   1.0 2   3   NaN
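
The answer above covers step 2 of the question. For step 5 (putting the NaNs back), one possible approach is to keep the boolean mask from `isna()` and re-apply it with `DataFrame.mask` after filling. The following is a minimal sketch under stated assumptions: the sample values and the names `nan_mask`, `filled` and `restored` are made up for illustration, and it assumes the intermediate manipulation keeps the same index and columns.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for the question's frame (values are made up).
df = pd.DataFrame(
    {
        "A": [1.0, 2.0, np.nan, 4.0, 5.0],
        "B": [10.0, np.nan, np.nan, 40.0, np.nan],
    },
    index=pd.date_range("2019-06-17 08:45", periods=5, freq="min"),
)

# 1. Drop the rows that are entirely NaN.
df2 = df.dropna(how="all")

# 2. Remember where the remaining NaNs are, per cell and per row.
nan_mask = df2.isna()
indices = df2.index[nan_mask.any(axis=1)]

# 3. Forward-fill the gaps.
filled = df2.ffill()

# 4. (any manipulation that keeps the same index and columns goes here)

# 5. Put NaN back into exactly the cells that were NaN before filling.
restored = filled.mask(nan_mask)

print(indices)
print(restored)
```

If whole rows should be blanked instead of single cells, `filled.loc[indices] = np.nan` would do that, but it also overwrites the valid values in those rows.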
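
import pandas as pd
from numpy import nan

data = {'Name': ['Tom', 'nick', 'krish', 'jack'], 'Age': [nan, 21, nan, 18]}
df = pd.DataFrame(data)
print(df)
print("================")
# Boolean frame: True wherever a cell is NaN
is_NaN = df.isnull()
# Boolean Series: True for every row with at least one NaN
rows_have_NaN = is_NaN.any(axis=1)
# Keep only the rows that contain NaN
print(df[rows_have_NaN])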

Output:

   Name   Age
0    Tom   NaN
1   nick  21.0
2  krish   NaN
3   jack  18.0
================
    Name  Age
0    Tom  NaN
2  krish  NaN
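
If the location of each individual NaN cell is needed rather than only the affected rows, the same boolean frame can be turned into (row label, column name) pairs. This is a small sketch reusing the sample data above; `nan_cells` is a name chosen here for illustration, not something from the original answer.

```python
import numpy as np
import pandas as pd

data = {"Name": ["Tom", "nick", "krish", "jack"], "Age": [np.nan, 21, np.nan, 18]}
df = pd.DataFrame(data)

# Boolean frame: True wherever a cell is NaN
is_nan = df.isna()

# Positional row/column numbers of every NaN cell
rows, cols = np.where(is_nan.to_numpy())

# Translate the positions into (row label, column name) pairs
nan_cells = [(df.index[r], df.columns[c]) for r, c in zip(rows, cols)]
print(nan_cells)
```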
