Replace value in one column if another column is missing
DATA
I have a dataframe called data that looks like the following:
Name ID
JAMES 252
STEPHEN 578
JOY nan
ROGELIO 473
FACS nan
CLIFFORD 793
data['Name'] is a column of strings, and data['ID'] has numeric values.
GOAL
I want to replace data['Name'] with the missing value NaN whenever data['ID'] is missing, i.e. nan.
The result would be:
Name ID
JAMES 252
STEPHEN 578
NaN nan
ROGELIO 473
NaN nan
CLIFFORD 793
I have searched online, but the similar answers are all about using fillna(), which is not what I want. Do you have any suggestions on how to do this?
You can use .loc to select all the rows where df['ID'] is null and set df['Name'] to np.nan there (df here is the dataframe the question calls data):
import numpy as np
# Rows where ID is missing get NaN in the Name column
df.loc[df['ID'].isnull(), 'Name'] = np.nan
How about this method?
import pandas as pd
import numpy as np
a = {'Name':['JAMES','STEPHEN','JOY','ROGELIO','FACS','CLIFFORD'],'ID':[252,578,np.nan,473,np.nan,793]}
df = pd.DataFrame(a)
df.loc[df['ID'].isnull() , 'Name'] = np.nan
print(df)
Output:
Name ID
0 JAMES 252.0
1 STEPHEN 578.0
2 NaN NaN
3 ROGELIO 473.0
4 NaN NaN
5 CLIFFORD 793.0
If you wish to drop the rows containing NaN values, add the following:
df = df.dropna(how='any')
print(df)
Output:
Name ID
0 JAMES 252.0
1 STEPHEN 578.0
3 ROGELIO 473.0
5 CLIFFORD 793.0
Edit: I originally did it the other way around; now it's correct.
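If dropping the rows is the only goal, setting Name to NaN first may not be needed. A minimal sketch, assuming the same sample data as above, that uses the subset parameter of dropna so that only the ID column decides which rows are removed (the variable name kept is just for illustration):

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({
    'Name': ['JAMES', 'STEPHEN', 'JOY', 'ROGELIO', 'FACS', 'CLIFFORD'],
    'ID': [252, 578, np.nan, 473, np.nan, 793],
})

# Drop rows whose ID is missing; other columns are not inspected
kept = df.dropna(subset=['ID'])
print(kept)
```

Unlike dropna(how='any'), this would keep a row that has a valid ID but a missing value in some other column.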
pandas.DataFrame.mask is perfect for this:
df.mask(df['ID'].isnull())
Output:
Name ID
0 JAMES 252.0
1 STEPHEN 578.0
2 NaN NaN
3 ROGELIO 473.0
4 NaN NaN
5 CLIFFORD 793.0
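Two things are worth keeping in mind with mask: it returns a new object rather than changing df in place, and when called on the whole DataFrame it blanks every column in the matching rows, not just Name. A minimal sketch of a column-level variant using Series.mask, assuming the same sample data plus a hypothetical City column (not in the original question) to show that other columns are left alone:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['JAMES', 'STEPHEN', 'JOY', 'ROGELIO', 'FACS', 'CLIFFORD'],
    'ID': [252, 578, np.nan, 473, np.nan, 793],
    # Hypothetical extra column, added only for illustration
    'City': ['A', 'B', 'C', 'D', 'E', 'F'],
})

result = df.copy()
# Series.mask replaces values with NaN where the condition is True;
# assigning back touches only the Name column
result['Name'] = df['Name'].mask(df['ID'].isnull())
print(result)
```

The result has to be assigned back (or stored in a new variable, as here) for the change to be kept.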