I have a pandas DataFrame df1 with the following content:
Serial N year current
B 10 14
B 10 16
B 11 10
B 11
B 11 15
C 12 11
C 9
C 12 13
C 12
I would like to make a DataFrame that is based on df1
but that has any row containing an empty value removed. For example:
Serial N year current
B 10 14
B 10 16
B 11 10
B 11 15
C 12 11
C 12 13
I tried something like this
df1=df[~np.isnan(df["year"]) or ~np.isnan(df["current"])]
But I received the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
What could be the problem?
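Python's or cannot be applied element-wise to a Series. Try the bitwise operator | instead, like this:
df1 = df[(~np.isnan(df["year"])) | (~np.isnan(df["current"]))]
Note that | keeps a row when either column has a value. To drop every row that has a missing value, as in your expected output, combine the two conditions with & instead.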
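For comparison, here is a small sketch of how | and & behave as row filters. The frame below is illustrative data, not the asker's exact DataFrame, and it uses Series.notna() in place of ~np.isnan(...), which is an equivalent spelling for numeric columns.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption, not the original frame):
# row 1 is missing "current", row 2 is missing "year".
df = pd.DataFrame({
    "year": [10, 11, np.nan, 12],
    "current": [14, np.nan, 9, 13],
})

both_present = df["year"].notna() & df["current"].notna()
either_present = df["year"].notna() | df["current"].notna()

strict = df[both_present]    # only rows with no missing value
loose = df[either_present]   # rows with at least one value present

print(strict)
print(loose)
```

With this data, the & mask removes rows 1 and 2, while the | mask keeps all four rows because each row still has at least one value.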
Using dropna(), as suggested by EdChum, is likely the cleanest and neatest solution here. You can read more about it, and about working with missing data generally, in the pandas documentation.
You can just call dropna to achieve this:
df1 = df.dropna()
As to why what you tried failed: the or operator doesn't know what to do when comparing array-like structures, because it is ambiguous whether one element, several, or all of them must meet the boolean criterion. You should use the bitwise operators &, | and ~ for and, or and not respectively. Additionally, for multiple conditions you need to wrap each condition in parentheses due to operator precedence.
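The point about or versus the bitwise operators can be reproduced with two short Series. This is a minimal sketch with made-up values; the names s and t are placeholders.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])
t = pd.Series([np.nan, np.nan, 2.0])

# "or" asks Python for a single True/False from the whole Series,
# which pandas refuses to provide.
try:
    _ = s.notna() or t.notna()
except ValueError as exc:
    message = str(exc)

# The bitwise operator works element by element. Each comparison is
# wrapped in parentheses because & binds more tightly than >.
mask = (s > 0) & (t > 0)

print(message)
print(mask)
```

Comparisons against NaN evaluate to False, so only the last position is True in mask.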
In [4]:
df.dropna()
Out[4]:
Serial N year current
0 B 10 14
1 B 10 16
2 B 11 10
4 B 11 15
5 C 12 11
7 C 12 13
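dropna also accepts the subset and how parameters when only some columns should be checked, and reset_index(drop=True) renumbers the rows if the gaps left in the index are unwanted. A short sketch on illustrative data (not the original frame):

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): row 1 lacks "current",
# row 2 lacks "year", row 3 lacks both.
df = pd.DataFrame({
    "Serial N": ["B", "B", "C", "C"],
    "year": [10, 11, np.nan, np.nan],
    "current": [14, np.nan, 9, np.nan],
})

# Default: drop a row if any column is missing.
any_missing = df.dropna()

# Only look at the "year" column.
year_only = df.dropna(subset=["year"])

# Drop a row only when both listed columns are missing.
all_missing = df.dropna(how="all", subset=["year", "current"])

# Renumber the surviving rows from 0.
renumbered = df.dropna(subset=["year"]).reset_index(drop=True)

print(any_missing)
print(year_only)
print(all_missing)
print(renumbered)
```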
If you really have empty cells (empty strings) instead of NaN values:
In [122]: df
Out[122]:
Serial_N year current
0 B 10.0 14.0
1 B 10.0 16.0
2 B 11.0 10.0
3 B 11.0
4 B 11.0 15.0
5 C 12.0 11.0
6 C 9.0
7 C 12.0 13.0
8 C 12.0
In [123]: df.replace('', np.nan).dropna()
Out[123]:
Serial_N year current
0 B 10.0 14.0
1 B 10.0 16.0
2 B 11.0 10.0
4 B 11.0 15.0
5 C 12.0 11.0
7 C 12.0 13.0
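A self-contained version of the same idea is sketched below, assuming the columns were read in as strings. The data is illustrative, and the regex variant for whitespace-only cells is an addition that goes beyond the answer above.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): row 1 has an empty string,
# row 2 has a cell that contains only a space.
df = pd.DataFrame({
    "Serial_N": ["B", "B", "C", "C"],
    "year": ["10", "", "12", "11"],
    "current": ["14", "15", " ", "16"],
})

# Exact empty strings become NaN, then dropna removes those rows.
cleaned = df.replace("", np.nan).dropna()

# A regex also catches cells that hold only whitespace.
cleaned_ws = df.replace(r"^\s*$", np.nan, regex=True).dropna()

print(cleaned)
print(cleaned_ws)
```

The plain replace only drops row 1, because the single space in row 2 is not an empty string. The regex version drops both.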