I have a Postgres table like this, with device ID, timestamp, and the status of the device at that time:
dev_id | timestamp | status
----------------------------------------
1 | 2020-08-06 23:00:00 | 1
2 | 2020-08-06 23:00:00 | 0
3 | 2020-08-06 23:00:00 | 1
2 | 2020-08-06 23:05:00 | 1
3 | 2020-08-06 23:05:00 | 0
1 | 2020-08-06 23:10:00 | 0
I want to see how many devices were functioning and how many were not, each taken at its own latest timestamp. In Postgres, I can use DISTINCT ON and write the query like this:
SELECT status, COUNT(status)
FROM
(
    SELECT DISTINCT ON (dev_id) dev_id,
        timestamp,
        status
    FROM
        sample_metrics_data
    ORDER BY
        dev_id,
        timestamp DESC
) sub
GROUP BY status;
This will result in:
status | count
---------------
0 | 2
1 | 1
(2 devices, #1 & #3, have a status of 0, while 1 device, #2, has a status of 1.) How can I create something like this in CubeJS? Is DISTINCT ON supported, and if not, what is the way around it?
Alternatively, the query can be written using an inner join:
SELECT status,
       Count(status)
FROM sample_metrics_data
JOIN (SELECT dev_id id,
             Max(timestamp) ts
      FROM sample_metrics_data
      GROUP BY dev_id) max_ts
  ON timestamp = max_ts.ts
 AND dev_id = max_ts.id
GROUP BY status;
I would need to do an inner join, but it seems only LEFT JOIN is available.
If you need to build a graph of how many devices were online over time, a typical solution is to sum the changes in each device's status with a rolling window.
For example, I made a table like the one in your question
and created this cube:
cube(`SampleMetricsData`, {
  sql: "SELECT *, device_status - COALESCE(LAG(device_status) OVER (PARTITION BY id ORDER BY timemark ASC), 0) as rolling_status FROM ab_api_test.sample_metrics ORDER BY `sample_metrics`.`timemark` DESC",
  measures: {
    rollingStatusTotal: {
      sql: `rolling_status`,
      type: `sum`,
      rollingWindow: {
        trailing: `unbounded`,
      },
    },
  },
  dimensions: {
    id: {
      sql: `id`,
      type: `number`,
      primaryKey: true
    },
    timemark: {
      sql: `timemark`,
      type: `time`
    },
  }
});
With this cube, you can chart the number of online devices using this query:
{"measures":["SampleMetricsData.rollingStatusTotal"],"timeDimensions":[{"dimension":"SampleMetricsData.timemark","granularity":"hour","dateRange":"This month"}],"order":{},"dimensions":[],"filters":[]}
You may also want to look at this tutorial; it covers something similar to your task. There is one more related question here.
You can also write a query like this to create a cube from your data, but this is not a best practice:
select * from (
    SELECT DISTINCT ON (dev_id) dev_id,
        timestamp,
        status
    FROM
        sample_metrics_data
    ORDER BY
        dev_id,
        timestamp DESC
) as sample_metrics
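If you do go this route despite the caveat, a cube built on that query might look roughly like the sketch below. The cube name `LatestDeviceStatus` and the member names `count`, `devId`, and `status` are made up for illustration; only the SQL comes from the question, and the schema has not been run against a real deployment.

```javascript
// Sketch only: cube and member names are hypothetical.
cube(`LatestDeviceStatus`, {
  // One row per device, holding its most recent reading (Postgres DISTINCT ON).
  sql: `
    SELECT DISTINCT ON (dev_id) dev_id, timestamp, status
    FROM sample_metrics_data
    ORDER BY dev_id, timestamp DESC
  `,
  measures: {
    // Number of devices; grouping by the status dimension splits it by status.
    count: {
      type: `count`,
    },
  },
  dimensions: {
    devId: {
      sql: `dev_id`,
      type: `number`,
      primaryKey: true,
    },
    status: {
      sql: `status`,
      type: `number`,
    },
  },
});
```

Querying the `count` measure together with the `status` dimension, for example {"measures":["LatestDeviceStatus.count"],"dimensions":["LatestDeviceStatus.status"]}, should then give one row per status value, much like the GROUP BY status in the original SQL.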