Replace missing values in Pandas with the previous value if not NaN

Question

I need your help with the following code. I have df1, with Exchange Rate and Date columns, that I'm trying to merge with df2. df1 has missing values for the exchange rates (on weekends and holidays). For the weekend exchange rate values I want to use the last available value (for example, if the Exchange Rate for 2019-05-01 is NaN, I want it to use the 2019-04-01 Exchange Rate value). I've tried two options without success:

Eliminate the NaN values from df1 and somehow tell the merge to take the last available value when it doesn't find the date (because we eliminated it).
Fill the df1 NaN values with the last available value.

Here are both dataframes (if you copy and paste them you get an error that the name Timestamp is not recognized; I couldn't get plain date values to paste here since I always got the dates as Timestamp objects). I hope you can help me solve it both ways, since I'm sure it will be useful to know.

df1={'Fecha': {0: Timestamp('2019-01-01 00:00:00'),
  1: Timestamp('2019-01-02 00:00:00'),
  2: Timestamp('2019-01-03 00:00:00'),
  3: Timestamp('2019-01-04 00:00:00'),
  4: Timestamp('2019-01-05 00:00:00'),
  5: Timestamp('2019-01-06 00:00:00'),
  6: Timestamp('2019-01-07 00:00:00'),
  7: Timestamp('2019-01-08 00:00:00'),
  8: Timestamp('2019-01-09 00:00:00'),
  9: Timestamp('2019-01-10 00:00:00')},
 'ER': {0: nan,
  1: 19.1098,
  2: 19.2978,
  3: 19.2169,
  4: nan,
  5: nan,
  6: 19.076,
  7: 19.1627,
  8: nan,
  9: 19.7792}}



df2={'Fecha': {0: Timestamp('2019-01-01 00:00:00'),
  1: Timestamp('2019-01-02 00:00:00'),
  2: Timestamp('2019-01-03 00:00:00'),
  3: Timestamp('2019-01-04 00:00:00'),
  4: Timestamp('2019-01-05 00:00:00'),
  5: Timestamp('2019-01-06 00:00:00'),
  6: Timestamp('2019-01-07 00:00:00'),
  7: Timestamp('2019-01-08 00:00:00'),
  8: Timestamp('2019-01-09 00:00:00'),
  9: Timestamp('2019-01-10 00:00:00')},
 'letters': {0: "a",
  1: "b",
  2: "c",
  3: "d",
  4: "e",
  5: "f",
  6: "g",
  7: "h",
  8: "i",
  9: "j"}}

Thanks a lot!

Answer 1

I don't think you need a lambda (as you mentioned in the comments). What you're trying to achieve can be done with the .ffill method:

>>> df1["ER"].ffill()
0        NaN
1    19.1098
2    19.2978
3    19.2169
4    19.2169
5    19.2169
6    19.0760
7    19.1627
8    19.1627
9    19.7792
Name: ER, dtype: float64

To merge the two dataframes, use pd.merge:

>>> df1["ER"] = df1["ER"].ffill()
>>> pd.merge(df1, df2, on="Fecha")
       Fecha       ER letters
0 2019-01-01      NaN       a
1 2019-01-02  19.1098       b
2 2019-01-03  19.2978       c
3 2019-01-04  19.2169       d
4 2019-01-05  19.2169       e
5 2019-01-06  19.2169       f
6 2019-01-07  19.0760       g
7 2019-01-08  19.1627       h
8 2019-01-09  19.1627       i
9 2019-01-10  19.7792       j
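
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the two dictionaries from the question are rebuilt as DataFrames (here with pd.date_range, which is my shortcut and not part of the original data dump), and it assigns the filled column back instead of filling in place. Note that row 0 stays NaN, because there is no earlier value to carry forward.

```python
import numpy as np
import pandas as pd

# Rebuild the question's data; date_range is a shortcut for the ten daily dates.
dates = pd.date_range("2019-01-01", periods=10, freq="D")
df1 = pd.DataFrame({
    "Fecha": dates,
    "ER": [np.nan, 19.1098, 19.2978, 19.2169, np.nan,
           np.nan, 19.076, 19.1627, np.nan, 19.7792],
})
df2 = pd.DataFrame({"Fecha": dates, "letters": list("abcdefghij")})

# Carry the last known rate forward over weekends and holidays.
df1["ER"] = df1["ER"].ffill()

# Both frames share every date, so a plain merge on the key is enough.
merged = pd.merge(df1, df2, on="Fecha")
print(merged)
```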

Answer 2

Just for general knowledge: your example data raises an error as written, because 'Timestamp' and 'nan' are not recognized names. To make this dataset usable, you just have to add pandas or pd before Timestamp:

pd.Timestamp('2019-01-06 00:00:00')

And to indicate null values, you could use:

# First option - pandas system
import pandas as pd
{0: pd.NA}

# Second option - numpy system
import numpy as np
{0: np.nan}

# Third option - pure Python
{0: None}
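
Putting those two fixes together, a shortened version of the question's df1 dictionary can be loaded like this. It is only a sketch: the three rows are a cut-down sample of the original data, and I am assuming np.nan as the null marker.

```python
import numpy as np
import pandas as pd

# Cut-down sample of the question's df1, with the names qualified.
data = {
    "Fecha": {
        0: pd.Timestamp("2019-01-01 00:00:00"),
        1: pd.Timestamp("2019-01-02 00:00:00"),
        2: pd.Timestamp("2019-01-05 00:00:00"),
    },
    "ER": {0: np.nan, 1: 19.1098, 2: np.nan},
}

# The outer keys become columns and the inner keys become the index.
df1 = pd.DataFrame(data)
print(df1)
print(df1["ER"].isna().tolist())
```

Another way to paste the original dump unchanged is to import the bare names first, with `from pandas import Timestamp` and `from numpy import nan`.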

Answer 3

I found a way to achieve this using the pd.merge_asof() function. If it doesn't find the key value to merge on, it gives you the previous one. Sorting is crucial, though.

It works just like the Excel LOOKUP function (not VLOOKUP, but LOOKUP, without the V or the H).
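
A minimal sketch of that approach, assuming the question's data is rebuilt as DataFrames (pd.date_range is my shortcut) and that the NaN rows are dropped from df1 first, as in the first option from the question:

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2019-01-01", periods=10, freq="D")
df1 = pd.DataFrame({
    "Fecha": dates,
    "ER": [np.nan, 19.1098, 19.2978, 19.2169, np.nan,
           np.nan, 19.076, 19.1627, np.nan, 19.7792],
})
df2 = pd.DataFrame({"Fecha": dates, "letters": list("abcdefghij")})

# Keep only the dates that actually have a rate; merge_asof needs sorted keys.
rates = df1.dropna(subset=["ER"]).sort_values("Fecha")

# For each date in df2, pick the latest rate at or before that date.
merged = pd.merge_asof(
    df2.sort_values("Fecha"), rates, on="Fecha", direction="backward"
)
print(merged)
```

direction="backward" is the default and is spelled out here only for clarity. The first date still ends up with NaN, since no earlier rate exists to fall back on.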

Thanks everyone!
