Pandas fill column values to have similar values of other column(s)

I have a Date column holding time-series data for every minute. I want all my other columns to have data for every minute as well, so the Date2 and Date3 columns should contain the same values as the Date column. I also want the columns Value1 (linked to Date2) and Value2 (linked to Date3) filled so that every row has a value. Each filled row should carry the most recent value available at that time.

For example, in the Date2 row for 2019-01-30 10:05, the corresponding Value1 in the same row should be 3, because that was the last updated value, from the 10:04 timestamp.

Finally, all rows whose Date is older than the earliest dates in the Date2 and Date3 columns should be removed. Here that is the row 2019-01-30 10:03.

    Date              Date2              Value1  Date3             Value2
   2019-01-30 10:09   2019-01-30 10:08    1      2019-01-30 10:07   5
   2019-01-30 10:08   2019-01-30 10:07    2      2019-01-30 10:04   9   
   2019-01-30 10:07   2019-01-30 10:06    4 
   2019-01-30 10:06   2019-01-30 10:04    3
   2019-01-30 10:05   
   2019-01-30 10:04
   2019-01-30 10:03

The result should be:

    Date              Date2              Value1  Date3             Value2
   2019-01-30 10:09   2019-01-30 10:09    1      2019-01-30 10:09   5
   2019-01-30 10:08   2019-01-30 10:08    1      2019-01-30 10:08   5
   2019-01-30 10:07   2019-01-30 10:07    2      2019-01-30 10:07   5
   2019-01-30 10:06   2019-01-30 10:06    4      2019-01-30 10:06   9
   2019-01-30 10:05   2019-01-30 10:05    3      2019-01-30 10:05   9
   2019-01-30 10:04   2019-01-30 10:04    3      2019-01-30 10:04   9

It seems you want the same dates in all of the date columns, right? If so, you can simply copy Date to Date2 and Date3. When you read the columns with Pandas, missing values are read as NaN, which you can replace using DataFrame.fillna.
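As a rough sketch of one way to get the result table from the question, each (date, value) pair can be treated as its own time series, reindexed onto the minute-by-minute Date column with a forward fill, and then trimmed of rows that have no earlier observation. The helper name align_to and the choice to drop a row when either value column is still empty are assumptions made for illustration; they are not part of the original post.

```python
import pandas as pd

# Sample data copied from the question; None marks the empty cells.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2019-01-30 10:09", "2019-01-30 10:08", "2019-01-30 10:07",
        "2019-01-30 10:06", "2019-01-30 10:05", "2019-01-30 10:04",
        "2019-01-30 10:03",
    ]),
    "Date2": pd.to_datetime([
        "2019-01-30 10:08", "2019-01-30 10:07", "2019-01-30 10:06",
        "2019-01-30 10:04", None, None, None,
    ]),
    "Value1": [1, 2, 4, 3, None, None, None],
    "Date3": pd.to_datetime([
        "2019-01-30 10:07", "2019-01-30 10:04", None, None, None, None, None,
    ]),
    "Value2": [5, 9, None, None, None, None, None],
})


def align_to(target, dates, values):
    """Hypothetical helper: carry the last known value forward onto target."""
    mask = dates.notna() & values.notna()
    series = pd.Series(
        values[mask].to_numpy(), index=pd.DatetimeIndex(dates[mask])
    ).sort_index()
    # ffill picks, for each target timestamp, the latest observation at or before it.
    return series.reindex(target, method="ffill")


# Work in ascending time order so "forward" means "later in time".
target = pd.DatetimeIndex(df["Date"]).sort_values()

out = pd.DataFrame({"Date": target})
for date_col, value_col in [("Date2", "Value1"), ("Date3", "Value2")]:
    out[date_col] = target
    out[value_col] = align_to(target, df[date_col], df[value_col]).to_numpy()

# Assumption: a row is dropped if either value column has no earlier observation.
out = out.dropna(subset=["Value1", "Value2"])
out = out.astype({"Value1": int, "Value2": int})
out = out.sort_values("Date", ascending=False).reset_index(drop=True)

print(out)
```

With the sample data this drops the 10:03 row and leaves six rows, newest first.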

If you have already read the columns and want them filled, a naive method is to treat the columns as NumPy arrays:

  • Date1 = Date
  • latest_value = value1[-1]
  • updated_values = list(value1) + list(latest_value * np.ones(len(Date1) - len(value1)))
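The naive steps above might look like this as a runnable snippet, using the Value1 numbers from the question. The lowercase variable names and the use of np.full in place of np.ones are choices made here for illustration. Note that this pads by position only: it does not look at the timestamps, so it will not reproduce the result table in the question exactly.

```python
import numpy as np

# Date column from the question (7 rows) and the 4 known Value1 entries.
date = np.array([
    "2019-01-30T10:09", "2019-01-30T10:08", "2019-01-30T10:07",
    "2019-01-30T10:06", "2019-01-30T10:05", "2019-01-30T10:04",
    "2019-01-30T10:03",
], dtype="datetime64[m]")
value1 = np.array([1, 2, 4, 3])

# Copy the full date column, then pad the values with the last known entry.
date1 = date.copy()
latest_value = value1[-1]
padding = np.full(len(date1) - len(value1), latest_value)
updated_values = np.concatenate([value1, padding])

print(updated_values)
```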
