
MySQL - Join 2 tables with “repeating values” to fill in gaps in time series

So I have 2 tables I need to join. Table1 includes a column called JDAY, which holds the values 1.5, 2.5, 3.5 ... 365.5. It looks like:

JDAY
1.5
2.5
3.5
4.5
5.5
etc.

I would like to join it with Table2, which looks like:

JDAY  WSC
1     1
5     .9
8     .7
366   .5

The final result should have a JDAY column with all the values from Table1, and the WSC value from the Table2 row whose JDAY is the closest value less than or equal to the Table1 JDAY. For example, JDAY=5.5 in Table1 corresponds to WSC=.9, because 5.5 is between 5 and 8. It would look like this:

JDAY   WSC
1.5    1
2.5    1
3.5    1
4.5    1
5.5    .9
6.5    .9
7.5    .9
8.5    .7
9.5    .7
etc.

One way to get the result is to use a correlated subquery:

UPDATED

SELECT t1.JDAY
     , (SELECT t2.WSC FROM Table2 t2 
         WHERE t2.JDAY < t1.JDAY
         ORDER BY t2.JDAY DESC LIMIT 1
       ) AS WSC
  FROM Table1 t1
 ORDER BY t1.JDAY 

You noted you wanted the largest value of WSC less than JDAY. I'm taking that to mean that you want the maximum value of WSC from the rows in Table2 that have a JDAY value LESS THAN the JDAY value from the Table1 row. That seems to fit with the result set you are showing. (If I misread what you intended, please feel free to correct me.)

UPDATE

I have a better understanding now: you want the value of WSC from the row in Table2 that has the largest JDAY value that is earlier than the JDAY value in Table1. That is, you want the WSC from the row with the latest earlier date. The queries in my answer are modified to satisfy this requirement.
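To check the correlated subquery against the sample data from the question, here is a minimal sketch that runs it through Python's built-in sqlite3 module. Using SQLite as a stand-in for MySQL is an assumption made only to keep the example self-contained; the SELECT is the one shown above, and the rows are the first nine Table1 values plus the four Table2 rows from the question.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (JDAY REAL)")
conn.execute("CREATE TABLE Table2 (JDAY REAL, WSC REAL)")
conn.executemany("INSERT INTO Table1 (JDAY) VALUES (?)",
                 [(d + 0.5,) for d in range(1, 10)])  # 1.5 .. 9.5
conn.executemany("INSERT INTO Table2 (JDAY, WSC) VALUES (?, ?)",
                 [(1, 1.0), (5, 0.9), (8, 0.7), (366, 0.5)])

# For each Table1 row, take WSC from the Table2 row with the latest earlier JDAY.
rows = conn.execute("""
    SELECT t1.JDAY
         , (SELECT t2.WSC FROM Table2 t2
             WHERE t2.JDAY < t1.JDAY
             ORDER BY t2.JDAY DESC LIMIT 1
           ) AS WSC
      FROM Table1 t1
     ORDER BY t1.JDAY
""").fetchall()

for jday, wsc in rows:
    print(jday, wsc)
```

With this data the output should match the result set in the question: WSC stays at 1 through JDAY 4.5, drops to .9 from 5.5, and to .7 from 8.5.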


Another alternative to get basically the same result set is to use a theta-join (that is, a join on an inequality predicate):

UPDATED

SELECT t.JDAY
     , s.WSC
  FROM ( SELECT t1.JDAY
              , MAX(t2.JDAY) AS max_t2_jday
           FROM Table1 t1
           LEFT
           JOIN Table2 t2
             ON t2.JDAY < t1.JDAY
          GROUP BY t1.JDAY
       ) t
  LEFT
  JOIN Table2 s
    ON s.JDAY = t.max_t2_jday
 ORDER BY t.JDAY

That will work as long as there aren't any "duplicate" values of JDAY in Table2. If there are duplicate values, you'll need a way to eliminate any duplicates that get returned. Adding a GROUP BY t.JDAY to the outer query would be sufficient, but not deterministic. (That is, you wouldn't be guaranteed which row you were getting the WSC value from.)
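The duplicate case can be sketched as follows. This example assumes a hypothetical second Table2 row for JDAY 5 (WSC .8, not in the question's data) and uses Python's built-in sqlite3 as a stand-in for MySQL. It shows the join returning two rows for JDAY 5.5, and then one possible deterministic tie-break: grouping by t.JDAY and taking MIN(s.WSC). Choosing MIN is an arbitrary assumption; MAX or some other rule may suit the data better.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (JDAY REAL)")
conn.execute("CREATE TABLE Table2 (JDAY REAL, WSC REAL)")
conn.executemany("INSERT INTO Table1 (JDAY) VALUES (?)", [(4.5,), (5.5,)])
# Hypothetical data: JDAY 5 appears twice in Table2.
conn.executemany("INSERT INTO Table2 (JDAY, WSC) VALUES (?, ?)",
                 [(1, 1.0), (5, 0.9), (5, 0.8), (8, 0.7)])

# Inline view shared by both queries: latest earlier Table2 JDAY per Table1 JDAY.
INLINE_VIEW = """
    ( SELECT t1.JDAY, MAX(t2.JDAY) AS max_t2_jday
        FROM Table1 t1
        LEFT JOIN Table2 t2 ON t2.JDAY < t1.JDAY
       GROUP BY t1.JDAY ) t
"""

# Plain join back to Table2: one output row per matching Table2 row.
plain = conn.execute(
    "SELECT t.JDAY, s.WSC FROM" + INLINE_VIEW +
    "LEFT JOIN Table2 s ON s.JDAY = t.max_t2_jday "
    "ORDER BY t.JDAY, s.WSC"
).fetchall()

# Deterministic variant: collapse duplicates with an explicit aggregate.
picked = conn.execute(
    "SELECT t.JDAY, MIN(s.WSC) AS WSC FROM" + INLINE_VIEW +
    "LEFT JOIN Table2 s ON s.JDAY = t.max_t2_jday "
    "GROUP BY t.JDAY ORDER BY t.JDAY"
).fetchall()

print(plain)   # JDAY 5.5 appears twice
print(picked)  # JDAY 5.5 appears once
```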

Here, we are asking for every row from Table1, AND any "matching" rows from Table2, where "matching" means the JDAY from Table2 is LESS THAN the JDAY from Table1. For each row in Table1, we rummage through all the "matching" rows from Table2 (if any), pick out the largest matching JDAY value from Table2, and return it from the subquery (inline view) aliased as t. We can then join that result set back to Table2 to get the matching row, along with the WSC value on that row.
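To make the two steps visible, the sketch below runs the inline view on its own and then the full query, again using Python's built-in sqlite3 as a stand-in for MySQL (an assumption for illustration). The Table1 row with JDAY 0.5 is hypothetical and is included only to show what the LEFT JOIN does when no Table2 row is earlier.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (JDAY REAL)")
conn.execute("CREATE TABLE Table2 (JDAY REAL, WSC REAL)")
# 0.5 is a hypothetical value with no earlier row in Table2.
conn.executemany("INSERT INTO Table1 (JDAY) VALUES (?)",
                 [(0.5,), (1.5,), (5.5,), (8.5,)])
conn.executemany("INSERT INTO Table2 (JDAY, WSC) VALUES (?, ?)",
                 [(1, 1.0), (5, 0.9), (8, 0.7), (366, 0.5)])

# Step 1: the inline view alone, giving the latest earlier Table2 JDAY.
step1 = conn.execute("""
    SELECT t1.JDAY, MAX(t2.JDAY) AS max_t2_jday
      FROM Table1 t1
      LEFT JOIN Table2 t2 ON t2.JDAY < t1.JDAY
     GROUP BY t1.JDAY
     ORDER BY t1.JDAY
""").fetchall()

# Step 2: join that result back to Table2 to pick up WSC.
step2 = conn.execute("""
    SELECT t.JDAY, s.WSC
      FROM ( SELECT t1.JDAY, MAX(t2.JDAY) AS max_t2_jday
               FROM Table1 t1
               LEFT JOIN Table2 t2 ON t2.JDAY < t1.JDAY
              GROUP BY t1.JDAY ) t
      LEFT JOIN Table2 s ON s.JDAY = t.max_t2_jday
     ORDER BY t.JDAY
""").fetchall()

print(step1)
print(step2)
```

The row for 0.5 comes back with a NULL (None in Python) in both steps, because the LEFT JOIN keeps the Table1 row even though nothing in Table2 matches.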

One difference between these queries comes into play when there are "duplicate" values of JDAY in Table1: they will return slightly different result sets. The first query will return all the rows from Table1, including duplicate values of JDAY. The second query will remove duplicate values of JDAY from Table1, and return a distinct set of values for JDAY.
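That difference can be seen with a small test, sketched here with Python's built-in sqlite3 standing in for MySQL (an assumption for illustration) and a hypothetical Table1 in which JDAY 2.5 appears twice.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (JDAY REAL)")
conn.execute("CREATE TABLE Table2 (JDAY REAL, WSC REAL)")
# Hypothetical data: 2.5 is duplicated in Table1.
conn.executemany("INSERT INTO Table1 (JDAY) VALUES (?)",
                 [(1.5,), (2.5,), (2.5,)])
conn.executemany("INSERT INTO Table2 (JDAY, WSC) VALUES (?, ?)", [(1, 1.0)])

# Correlated subquery: evaluated once per Table1 row, so duplicates survive.
first = conn.execute("""
    SELECT t1.JDAY
         , (SELECT t2.WSC FROM Table2 t2
             WHERE t2.JDAY < t1.JDAY
             ORDER BY t2.JDAY DESC LIMIT 1
           ) AS WSC
      FROM Table1 t1
     ORDER BY t1.JDAY
""").fetchall()

# Theta-join: GROUP BY t1.JDAY collapses the duplicate Table1 rows.
second = conn.execute("""
    SELECT t.JDAY, s.WSC
      FROM ( SELECT t1.JDAY, MAX(t2.JDAY) AS max_t2_jday
               FROM Table1 t1
               LEFT JOIN Table2 t2 ON t2.JDAY < t1.JDAY
              GROUP BY t1.JDAY ) t
      LEFT JOIN Table2 s ON s.JDAY = t.max_t2_jday
     ORDER BY t.JDAY
""").fetchall()

print(len(first), first)    # three rows, 2.5 twice
print(len(second), second)  # two rows, 2.5 once
```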


UPDATE:

I've updated the answer above so that it no longer gets the MAX(WSC) value, but instead gets the WSC value from the row in Table2 with the closest JDAY that is less than the Table1 JDAY.

Did you try this?

select t1.*,t2.* from Table1 as t1
join Table2 as t2
on round(t2.jday)=(t1.jday+1)
group by t1.jday
order by t1.jday;
