I am trying to combine data from 2 tables.
Both tables contain data from the same sensor (let's say a sensor that measures CO2, with 1 entry every 10 minutes).
The first table contains validated data; let's call it station1_validated. The second table contains raw data; let's call this one station1_nrt.
While the raw-data table contains live data, the validated table contains only data points that are at least 1 month old. (Validating the data and checking it manually afterwards takes time, so this happens only once a month.)
What I am trying to do now is combine the data of those 2 tables to display live data on a website. However, when validated data is available, that data point should take priority over the raw data point.
The relevant columns for this are:
I wrote this basic SQL:
SELECT
    *
FROM
    (SELECT
        timed, CO2, '2' tab
    FROM
        station1_nrt
    WHERE
        TIMED >= 1386932400000
        AND TIMED <= 1386939600000
        AND TIMED NOT IN (SELECT
                timed
            FROM
                station1_nrt
            WHERE
                CO2 IS NOT NULL
                AND TIMED >= 1386932400000
                AND TIMED <= 1386939600000)
    UNION
    SELECT
        timed, CO2, '1' tab
    FROM
        station1_validated
    WHERE
        CO2 IS NOT NULL
        AND TIMED >= 1386932400000
        AND TIMED <= 1386939600000) a
ORDER BY timed
This does not work correctly, as it selects only those data points where both tables have an entry. I would also like to do this with a JOIN, as it should be much faster, but I don't know how to do a JOIN with a DISTINCT (or something similar) that prioritizes one table. Could someone help me out with this (or explain it)?
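A note on the query in the question: its NOT IN subquery reads from station1_nrt rather than station1_validated, so every raw row that has a CO2 value excludes itself and only the validated rows survive. The sketch below shows the same UNION shape with the subquery pointed at the validated table. It is only an illustration under assumptions: it runs on SQLite through Python's sqlite3 module as a stand-in for MySQL, the two-column schema and the sample rows are made up, and merged_with_not_in is a hypothetical helper name.

```python
import sqlite3

def merged_with_not_in(conn, start, end):
    """Raw rows without a usable validated value, plus the validated rows."""
    sql = """
        SELECT timed, CO2, '2' AS tab
        FROM station1_nrt
        WHERE timed BETWEEN :start AND :end
          AND timed NOT IN (SELECT timed
                            FROM station1_validated  -- validated table, not station1_nrt
                            WHERE CO2 IS NOT NULL
                              AND timed BETWEEN :start AND :end)
        UNION ALL
        SELECT timed, CO2, '1' AS tab
        FROM station1_validated
        WHERE CO2 IS NOT NULL
          AND timed BETWEEN :start AND :end
        ORDER BY timed
    """
    return conn.execute(sql, {"start": start, "end": end}).fetchall()

# Assumed schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE station1_nrt (timed INTEGER PRIMARY KEY, CO2 REAL);
    CREATE TABLE station1_validated (timed INTEGER PRIMARY KEY, CO2 REAL);
    INSERT INTO station1_nrt VALUES
        (1386932400000, 400.0), (1386933000000, 401.0), (1386933600000, 402.0);
    INSERT INTO station1_validated VALUES
        (1386932400000, 399.5), (1386933000000, NULL);
""")

rows = merged_with_not_in(conn, 1386932400000, 1386939600000)
for row in rows:
    print(row)
```

UNION ALL is used here instead of UNION because the two halves cannot overlap once the subquery is corrected, so the duplicate-removal pass is not needed.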
You haven't mentioned whether there are records in station1_validated which don't exist in station1_nrt, so I use FULL JOIN. If all rows from station1_validated exist in station1_nrt, then you can use LEFT JOIN instead.
Something like this:
SELECT IFNULL(n.timed,v.timed) as timed,
CASE WHEN v.timed IS NOT NULL THEN v.CO2 ELSE n.CO2 END as CO2,
CASE WHEN v.timed IS NOT NULL THEN '1' ELSE '2' END as tab
FROM station1_nrt as n
FULL JOIN station1_validated as v ON n.timed=v.timed AND v.CO2 IS NOT NULL
WHERE
( n.TIMED between 1386932400000 AND 1386939600000
or
v.TIMED between 1386932400000 AND 1386939600000
)
AND
(n.CO2 IS NOT NULL OR v.CO2 IS NOT NULL)
You can join and then use IFs in the fields to choose the validated values if they exist. Something like:
SELECT
    IFNULL(s1val.timed, s1.timed) AS timed,
    -- prefer the validated value, fall back to the raw one
    IFNULL(s1val.CO2, s1.CO2) AS CO2,
    -- '1' = validated value used, '2' = raw value used
    IF(s1val.CO2 IS NULL, '2', '1') AS tab
FROM
    station1_nrt s1
    LEFT JOIN station1_validated s1val ON (s1.TIMED = s1val.TIMED)
WHERE
    -- Any necessary where clauses
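To make the LEFT JOIN idea concrete, here is a small runnable sketch. It assumes that every validated timestamp also exists in the raw table, that both tables have only the columns timed and CO2, and it uses SQLite via Python's sqlite3 module in place of MySQL; COALESCE behaves like IFNULL for two arguments in both systems. The sample rows and the helper name prefer_validated are hypothetical.

```python
import sqlite3

# One row per raw timestamp; the validated CO2 wins whenever it is present.
LEFT_JOIN_SQL = """
    SELECT n.timed,
           COALESCE(v.CO2, n.CO2) AS CO2,
           CASE WHEN v.CO2 IS NOT NULL THEN '1' ELSE '2' END AS tab
    FROM station1_nrt AS n
    LEFT JOIN station1_validated AS v ON v.timed = n.timed
    WHERE n.timed BETWEEN ? AND ?
    ORDER BY n.timed
"""

def prefer_validated(conn, start, end):
    """Return (timed, CO2, tab) rows, where tab is '1' for validated and '2' for raw."""
    return conn.execute(LEFT_JOIN_SQL, (start, end)).fetchall()

# Assumed schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE station1_nrt (timed INTEGER PRIMARY KEY, CO2 REAL);
    CREATE TABLE station1_validated (timed INTEGER PRIMARY KEY, CO2 REAL);
    INSERT INTO station1_nrt VALUES
        (1386932400000, 410.0), (1386933000000, 411.0), (1386933600000, NULL);
    INSERT INTO station1_validated VALUES
        (1386932400000, 409.2);
""")

result = prefer_validated(conn, 1386932400000, 1386939600000)
for row in result:
    print(row)
```

With an index (or primary key) on timed in both tables, the join can look up each validated row directly instead of evaluating a NOT IN subquery.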
MySQL has an IF that would probably work for you. You would have to select specific columns, though, but you could build the query programmatically.
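SELECT
    -- TIMED is in milliseconds, so divide by 1000 before FROM_UNIXTIME
    IF(FROM_UNIXTIME(nrt.TIMED / 1000) < DATE_SUB(NOW(), INTERVAL 1 MONTH),
        val.value,  -- older than one month: use the validated value
        nrt.value   -- newer: only raw data exists so far
    ) AS value
    -- Similar for other values
FROM
    station1_nrt AS nrt
    -- LEFT JOIN keeps recent raw rows that have no validated counterpart yet
    LEFT JOIN station1_validated AS val USING(id)
ORDER BY nrt.TIMED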
Note that the USING(id) is a placeholder. Presumably there is some indexed column you can join the two tables on.
@Jim, @valex, @ExplosionPills I managed to write a SQL select that emulates a FULL OUTER JOIN (as there is no FULL JOIN in MySQL) and returns the value of the validated data if it exists. If no validated data is available, it returns the raw value.
So this is the SQL I am using now:
SET @StartTime = 1356998400000;
SET @EndTime = 1386546000000;
SELECT
    timed,
    IFNULL(mergedData.validatedValue, mergedData.rawValue) AS value
FROM
    ((SELECT
        from_unixtime(timed / 1000) AS timed,
        rawData.NOX AS rawValue,
        validatedData.NOX AS validatedValue
    FROM
        nabelnrt_bas AS rawData
        LEFT JOIN nabelvalidated_bas AS validatedData USING(timed)
    WHERE
        (rawData.timed > @StartTime
            AND rawData.timed < @EndTime)
        OR (validatedData.timed > @StartTime
            AND validatedData.timed < @EndTime)
    ) UNION (
    SELECT
        from_unixtime(timed / 1000) AS timed,
        rawData.NOX AS rawValue,
        validatedData.NOX AS validatedValue
    FROM
        nabelnrt_bas AS rawData
        RIGHT JOIN nabelvalidated_bas AS validatedData USING(timed)
    WHERE
        (rawData.timed > @StartTime
            AND rawData.timed < @EndTime)
        OR (validatedData.timed > @StartTime
            AND validatedData.timed < @EndTime)
    )) AS mergedData
-- sort in the outer query; ordering inside a derived table is not guaranteed to be kept
ORDER BY timed DESC
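For anyone who wants to try the full-outer-join emulation without a MySQL server, here is a small sketch. It is a variant rather than the exact query above: instead of LEFT JOIN UNION RIGHT JOIN it uses LEFT JOIN plus UNION ALL with an anti-join for validated rows that have no raw counterpart, which avoids the duplicate-removal pass of UNION. The assumptions are SQLite via Python's sqlite3 module as a stand-in for MySQL, a two-column schema (timed, NOX), made-up sample rows with small integer timestamps, and a hypothetical helper name merged_values.

```python
import sqlite3

FULL_OUTER_SQL = """
    SELECT timed, COALESCE(validatedValue, rawValue) AS value
    FROM (
        -- every raw row, with its validated counterpart if there is one
        SELECT r.timed AS timed, r.NOX AS rawValue, v.NOX AS validatedValue
        FROM nabelnrt_bas AS r
        LEFT JOIN nabelvalidated_bas AS v ON v.timed = r.timed
        UNION ALL
        -- validated rows that have no raw counterpart at all
        SELECT v.timed, NULL, v.NOX
        FROM nabelvalidated_bas AS v
        LEFT JOIN nabelnrt_bas AS r ON r.timed = v.timed
        WHERE r.timed IS NULL
    ) AS mergedData
    WHERE timed > :start AND timed < :end
    ORDER BY timed DESC
"""

def merged_values(conn, start, end):
    """Return (timed, value) rows, newest first, preferring validated values."""
    return conn.execute(FULL_OUTER_SQL, {"start": start, "end": end}).fetchall()

# Assumed schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nabelnrt_bas (timed INTEGER PRIMARY KEY, NOX REAL);
    CREATE TABLE nabelvalidated_bas (timed INTEGER PRIMARY KEY, NOX REAL);
    INSERT INTO nabelnrt_bas VALUES (1000, 10.0), (2000, 11.0), (3000, 12.0);
    INSERT INTO nabelvalidated_bas VALUES (500, 9.0), (1000, 10.5), (2000, NULL);
""")

merged = merged_values(conn, 0, 4000)
for row in merged:
    print(row)
```

The row at timed = 500 exists only in the validated table and still appears in the result, which is the case a plain LEFT JOIN from the raw table would miss.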