简体   繁体   中英

postgresql query to get counts between 12:00 and 12:00

I have the following query that works fine, but it is giving me counts for a single, whole day (00:00 to 23:59 UTC). For example, it's giving me counts for all of January 1 2017 (00:00 to 23:59 UTC).

My dataset lends itself to be queried from 12:00 UTC to 12:00 UTC. For example, I'm looking for all counts from Jan 1 2017 12:00 UTC to Jan 2 2017 12:00 UTC.

Here is my query:

SELECT count(DISTINCT ltg_data.lat), cwa, to_char(time, 'MM/DD/YYYY') 
FROM counties 
JOIN ltg_data on ST_contains(counties.the_geom, ltg_data.ltg_geom) 
WHERE cwa = 'MFR' 
  AND time BETWEEN '1987-06-01' 
  AND '1992-08-1' 
GROUP BY cwa, to_char(time, 'MM/DD/YYYY');

FYI...I'm changing the format of the time so I can use the results more readily in javascript.

And a description of the dataset. It's thousands of point data that occurs within various polygons every second. I'm determining if the points are occurring withing the polygon "cwa = MFR" and then counting them.

Thanks for any help!

I see two approaches here.

first, join generate_series(start_date::timestamp,end_date,'12 hours'::interval) to get count in those generate_series . this would be more correct I believe. But it has a major minus - you have to lateral join it against existing data set to use min(time) and max(time) ...

second, a monkey hack itself, but much less coding and less querying. Use different time zone to make 12:00 a start of the day, eg (you did not give the sample, so I generate content of counties with generate_series with 2 hours interval as sample data):

t=# with counties as (select generate_series('2017-09-01'::timestamptz,'2017-09-04'::timestamptz,'2 hours'::interval)
g)
select count(1),to_char(g,'MM/DD/YYYY') from counties
group by to_char(g,'MM/DD/YYYY')
order by 2;
 count |  to_char   
-------+------------
    12 | 09/01/2017
    12 | 09/02/2017
    12 | 09/03/2017
     1 | 09/04/2017
(4 rows)

so for UTC time zone there are 12 two hours interval rows for days above, due to inclusive nature of generate_series in my sample, 1 row for last days. in general: 37 rows.

Now a monkey hack:

t=# with counties as (select generate_series('2017-09-01'::timestamptz,'2017-09-04'::timestamptz,'2 hours'::interval)
g)
select count(1),to_char(g at time zone 'utc+12','MM/DD/YYYY') from counties
group by to_char(g at time zone 'utc+12','MM/DD/YYYY')
order by 2;
 count |  to_char   
-------+------------
     6 | 08/31/2017
    12 | 09/01/2017
    12 | 09/02/2017
     7 | 09/03/2017
(4 rows)

I select same dates for different time zone, switching it exactly 12 hours, getting first day starting at 31 Aug middday, not 1 Sep midnight, and the count changes, still totalling 37 rows, but grouping your requested way...

update

for your query I'd try smth like:

SELECT count(DISTINCT ltg_data.lat), cwa, to_char(time at time zone 'utc+12', 'MM/DD/YYYY') 
FROM counties 
JOIN ltg_data on ST_contains(counties.the_geom, ltg_data.ltg_geom) 
WHERE cwa = 'MFR' 
  AND time BETWEEN '1987-06-01' 
  AND '1992-08-1' 
GROUP BY cwa, to_char(time at time zone 'utc+12', 'MM/DD/YYYY');

also if you want to apply +12 hours logic to where clause - add at time zone 'utc+12' to "time" comparison as well

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM