Handling null values in pd.merge

Question

I need to merge two DataFrames that have a lot of missing values (np.nan, None and (null)).

import numpy as np
import pandas as pd

t1 = pd.DataFrame(np.array([[1,2,3],[4,5,99]]),columns=['a','b','c'])
t2 = pd.DataFrame(np.array([[1,None,3,'hello'],[4,5,6,'moon']]),columns=['a','b','c','d'])
t = pd.merge(t1,t2,how='outer', on=["a","c"])

That is, the data frames are:

t1 =
    a   b   c
0   1   2   3
1   4   5   99

t2 =
    a   b     c   d
0   1   None  3   hello
1   4   5     6   moon

I need a result DataFrame that gives me one row per observation, without losing any data.

Instead, I get a new row that keeps the 'None' as a value.
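To make the behaviour concrete, here is a minimal sketch of what the outer merge on ["a", "c"] produces. It assumes the two frames are built from nested lists rather than np.array, so that the None in column b is stored as NaN; everything else follows the example above.

```python
import pandas as pd

# The question's frames, built here from nested lists (an assumption for this
# sketch) so that None in column b is stored as NaN.
t1 = pd.DataFrame([[1, 2, 3], [4, 5, 99]], columns=["a", "b", "c"])
t2 = pd.DataFrame(
    [[1, None, 3, "hello"], [4, 5, 6, "moon"]],
    columns=["a", "b", "c", "d"],
)

merged = pd.merge(t1, t2, how="outer", on=["a", "c"])

# b is not a join key, so both copies survive as b_x (from t1) and b_y (from t2).
print(list(merged.columns))
# One row per distinct (a, c) pair found in either frame.
print(len(merged))
# Number of missing values per column after the merge.
print(merged.isna().sum())
```

The pair (4, 99) exists only in t1 and the pair (4, 6) exists only in t2, so the outer merge cannot match them and keeps both as separate rows with gaps in the columns that came from the other frame.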

In the example above, I would like:

t= pd.DataFrame(np.array([[1,2,3,'hello'],[4,5,99,'moon'],[4,5,6,'moon']]),columns=['a','b','c','d'])

That is, I would like:

t =
    a   b   c   d
0   1   2   3   hello
1   4   5   99  moon
2   4   5   6   moon

Answer 1

This only covers your specific case, but you can try:

t = pd.concat([pd.merge(t1, t2[['a', 'd']].dropna(), how='left', on='a'), t2.dropna()], ignore_index=True)

The left merge keeps every row of t1 and pulls in only column d from t2, matched on a. t2.dropna() discards the t2 row that contains None, and pd.concat appends the remaining t2 row to the merge result.
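The one-liner above can be unpacked into separate steps. The sketch below is one way to do that, not the only one. It uses pd.concat for the appending step, since DataFrame.append no longer exists in pandas 2.0 and later. The frames are built from nested lists (an assumption made here so that None becomes NaN), and the intermediate names with_d and complete_t2 are hypothetical.

```python
import pandas as pd

# Frames from the question, built from nested lists (an assumption made here
# so that None in column b becomes NaN and the other columns stay numeric).
t1 = pd.DataFrame([[1, 2, 3], [4, 5, 99]], columns=["a", "b", "c"])
t2 = pd.DataFrame(
    [[1, None, 3, "hello"], [4, 5, 6, "moon"]],
    columns=["a", "b", "c", "d"],
)

# Step 1: keep every row of t1 and attach only column d from t2, matched on a.
with_d = pd.merge(t1, t2[["a", "d"]].dropna(), how="left", on="a")

# Step 2: keep only the rows of t2 that have no missing values at all.
complete_t2 = t2.dropna()

# Step 3: stack the two pieces and renumber the index from 0.
result = pd.concat([with_d, complete_t2], ignore_index=True)
print(result)
```

Note that this relies on each value of a appearing only once in t2. If a were repeated there, the left merge in step 1 would duplicate rows of t1. Any t2 row with a missing value is also dropped entirely in step 2, so this fits the example but may lose data on larger frames.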
