How to get the indices of NaN values in a DataFrame and, after filling, set them to NaN again?
Time A B C D E F G H I K
2019-06-17 08:45:00 12089.89 12089.89 12087.71 12087.71 0.026626 2495.595 2495.595 2486.095 2488.095 0.000705
2019-06-17 08:46:00 12087.91 NaN 12087.71 12087.91 0.023684 2489.095 2490.095 2486.095 2486.095 0.000613
2019-06-17 08:47:00 12088.21 12088.21 12084.21 12085.21 0.028582 2487.095 2487.595 2485.095 2486.095 0.000516
2019-06-17 08:48:00 12085.09 12090.21 12084.91 12089.41 0.033238 2485.095 2485.595 2485.095 2485.095 0.000108
2019-06-17 08:49:00 12089.71 12090.21 12087.21 12088.21 0.033204 2484.095 2484.095 2484.095 2484.095 0.000010
... ... ... ... ... ... ... ... ... ...
2019-07-08 23:03:00 12504.11 12504.11 12504.11 12504.11 0.000734 NaN NaN NaN NaN NaN
2019-07-08 23:04:00 12504.11 NaN 12503.11 12503.11 0.002394 NaN NaN NaN NaN NaN
2019-07-08 23:08:00 12504.11 12504.11 12503.11 12503.11 0.002294 NaN NaN NaN NaN NaN
2019-07-08 23:09:00 12503.61 12503.61 12503.61 12503.61 0.000734 NaN NaN NaN NaN NaN
2019-07-08 23:10:00 12503.61 12503.61 12503.11 12503.11 0.002294 NaN NaN NaN NaN NaN
In a DataFrame like this, how can I get the locations of all rows that contain NaN (not rows where every value is NaN, but rows with at least one NaN)? After that, the NaNs are filled with .ffill(), but later I need to set these specific positions back to NaN.
#1. select the df without the rows that are entirely NaN
df2 = df.dropna(how='all')
#2. select the indices of the rows that still contain NaN.
indices = ???
#3. filling
df2 = df2.ffill()  # assign the result; ffill() returns a new DataFrame
#4. irrelevant manipulation and extraction stuff
#...
#5. set the NaNs back to where they were.
# df[indices] = ...
Try:
indices = df.loc[df.isnull().any(axis=1)].index
Sample DataFrame:
df:
a b c d
0 NaN 1 2 NaN
1 1.0 2 3 4.0
2 NaN 1 2 3.0
3 1.0 2 3 NaN
4 1.0 4 5 6.0
indices:
Int64Index([0, 2, 3], dtype='int64')
df.loc[indices]:
a b c d
0 NaN 1 2 NaN
2 NaN 1 2 3.0
3 1.0 2 3 NaN
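To cover the full round trip from the question (record, fill, restore), one option is to keep a cell-level boolean mask next to the row index. The following is a minimal sketch using the sample frame above, not a definitive solution; the names nan_mask, filled and restored are illustrative and do not appear in the original post.

```python
import numpy as np
import pandas as pd

# Sample frame from the answer above.
df = pd.DataFrame({
    "a": [np.nan, 1.0, np.nan, 1.0, 1.0],
    "b": [1, 2, 1, 2, 4],
    "c": [2, 3, 2, 3, 5],
    "d": [np.nan, 4.0, 3.0, np.nan, 6.0],
})

# Remember where the NaNs are before filling.
nan_mask = df.isna()                      # True for every NaN cell
indices = df.index[nan_mask.any(axis=1)]  # labels of rows with at least one NaN

# Forward-fill. Row 0 has no predecessor, so its NaNs stay NaN.
filled = df.ffill()

# ... other processing on `filled` would go here ...

# Put NaN back into exactly the cells that were NaN originally.
restored = filled.mask(nan_mask)
print(restored)
```

This assumes the intermediate processing keeps the same index and columns, so that the mask still lines up with the filled frame.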
import pandas as pd
from numpy import nan
data = {'Name': ['Tom', 'nick', 'krish', 'jack'], 'Age': [nan, 21, nan, 18]}
df = pd.DataFrame(data)
print(df)
print("================")
is_NaN = df.isnull()
rows_have_NaN = is_NaN.any(axis=1)
print(df[rows_have_NaN])
Output:
Name Age
0 Tom NaN
1 nick 21.0
2 krish NaN
3 jack 18.0
================
Name Age
0 Tom NaN
2 krish NaN
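If the position of each individual NaN cell is needed rather than whole rows, the same boolean frame can be turned into (row label, column label) pairs. This is only a sketch built on the example data above; nan_cells is a hypothetical name introduced for illustration.

```python
import numpy as np
import pandas as pd

data = {"Name": ["Tom", "nick", "krish", "jack"], "Age": [np.nan, 21, np.nan, 18]}
df = pd.DataFrame(data)

is_nan = df.isna()

# Integer positions (row, column) of every NaN cell.
row_pos, col_pos = np.where(is_nan.to_numpy())

# Translate positions into index and column labels.
nan_cells = [(df.index[r], df.columns[c]) for r, c in zip(row_pos, col_pos)]
print(nan_cells)
```

Here the NaN cells are in the Age column at rows 0 and 2, which matches the rows selected by rows_have_NaN.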