
Getting sum of duplicated date entries from last month, week, day

I have a table with 3 columns: id , updated_at , click_sum .

Many rows share exactly the same updated_at value, so I can't simply retrieve the data, order it by updated_at and display the sums in a chart: several sums for the same date break the chart.

What I try to achieve is to get the following output:

 updated_at | click_sum
------------+-----------
   date1    |    100
   date2    |     3
   date3    |    235
   date4    |    231

Optionally, only those dates which are from the last month, week or day, and not simply the dates since NOW() - 1 month.

The query I have built so far is very large and doesn't work well. It groups by date (no duplicate dates appear) and SUM()s the clicks correctly, but restricting the dates to a period (last month, week, day) doesn't seem to work properly.

Query ($interval stands for MONTH, DAY, SECOND or WEEK):

SELECT d.updated_at, SUM(d.click_sum) AS click_sum
FROM aggregated_clicks d
JOIN 
(
     SELECT c.id, MAX(StartOfChains.updated_at) AS ChainStartTime
     FROM aggregated_clicks c
     JOIN 
     (
         SELECT DISTINCT a.updated_at
         FROM aggregated_clicks a
         LEFT JOIN aggregated_clicks b ON (b.updated_at >= a.updated_at - INTERVAL 1 DAY AND b.updated_at < a.updated_at)
         WHERE b.updated_at IS NULL
      ) StartOfChains  ON c.updated_at >= StartOfChains.updated_at
     GROUP BY c.id
) GroupingQuery
ON d.id = GroupingQuery.id
WHERE GroupingQuery.ChainStartTime >= DATE_SUB(NOW(), INTERVAL 1 $interval)
GROUP BY GroupingQuery.ChainStartTime
ORDER BY GroupingQuery.ChainStartTime ASC

maybe I'm assuming too much about the nature of your question (and the table it refers to), but I think this can be done much more simply than the query you've shown.

figuring the latest completed month isn't very hard.

it starts with knowing the first day of the current month -- use this:

date_sub(curdate(), interval (extract(day from curdate())-1) day)

and to know the first day of that previous month, use this:

date_sub(date_sub(curdate(), interval (extract(day from curdate())-1) day), interval 1 month)

so if you want to get the sums for just the days in between -- i.e. the latest completed month, use this:

select updated_at, sum(click_sum) from aggregated_clicks
  where updated_at >= date_sub(date_sub(curdate(), interval (extract(day from curdate())-1) day), interval 1 month)
    and updated_at < date_sub(curdate(), interval (extract(day from curdate())-1) day)
  group by updated_at;
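
If you'd like to sanity-check the month arithmetic outside the database, here is a minimal Python sketch of the same idea. The helper names first_of_month and previous_month_bounds are made up for this illustration and are not part of MySQL or of the query above; the end bound is exclusive, matching the `<` in the query.

```python
from datetime import date, timedelta


def first_of_month(today):
    # Same idea as date_sub(curdate(), interval (day_of_month - 1) day).
    return today - timedelta(days=today.day - 1)


def previous_month_bounds(today):
    """Return (start, end) of the latest completed month.

    start is inclusive, end is exclusive (the first day of the current month).
    """
    end = first_of_month(today)
    # Step one day back into the previous month, then snap to its first day.
    start = first_of_month(end - timedelta(days=1))
    return start, end


if __name__ == "__main__":
    print(previous_month_bounds(date(2024, 3, 31)))
    print(previous_month_bounds(date(2024, 1, 15)))
```

Stepping back one day from the first of the month and snapping again avoids any "subtract one month from the 31st" edge case, and it handles the January to December rollover without special casing.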

figuring the latest completed week is just as easy. this example will assume a Sunday-Saturday week.

because dayofweek() follows the ODBC numbering (1 = Sunday, 2 = Monday, ..., 7 = Saturday), it's easy to find the end (Saturday) of the previous week:

date_sub(curdate(), interval dayofweek(curdate()) day)

and the beginning (Sunday) of that week is six days before that:

date_sub(curdate(), interval (dayofweek(curdate())+6) day)

so if you want to get the sums for just the days in between -- i.e. the latest completed week, use this:

select updated_at, sum(click_sum) from aggregated_clicks
  where updated_at >= date_sub(curdate(), interval (dayofweek(curdate())+6) day)
    and updated_at <= date_sub(curdate(), interval dayofweek(curdate()) day)
  group by updated_at;
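
To convince yourself that the two dayofweek() offsets really land on a Sunday and a Saturday, here is a small Python sketch that mirrors them. The functions mysql_dayofweek and previous_week_bounds are hypothetical helpers written for this check; only the 1 = Sunday ... 7 = Saturday numbering is taken from MySQL. Both bounds are inclusive, like the `>=` and `<=` in the query.

```python
from datetime import date, timedelta


def mysql_dayofweek(d):
    # Mimics MySQL DAYOFWEEK(): 1 = Sunday, 2 = Monday, ..., 7 = Saturday.
    # Python's isoweekday() is 1 = Monday ... 7 = Sunday, hence the shift.
    return d.isoweekday() % 7 + 1


def previous_week_bounds(today):
    """Return (sunday, saturday) of the latest completed Sunday-Saturday week."""
    dow = mysql_dayofweek(today)
    saturday = today - timedelta(days=dow)      # interval dayofweek(curdate()) day
    sunday = today - timedelta(days=dow + 6)    # interval (dayofweek(curdate())+6) day
    return sunday, saturday


if __name__ == "__main__":
    # 2024-03-13 was a Wednesday.
    print(previous_week_bounds(date(2024, 3, 13)))
```

Note that when today is itself a Saturday, the current week is not treated as completed yet, so the result is the week that ended seven days earlier.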

and of course figuring based on the latest completed day is super easy.

to get the date of the previous day, use this:

date_sub(curdate(), interval 1 day)

so if you want the sums just for yesterday, use this:

select updated_at, sum(click_sum) from aggregated_clicks
  where updated_at = date_sub(curdate(), interval 1 day)
  group by updated_at;

NOTE: I've tested these queries using MySQL 5.1, YMMV.

----------

UPDATE: since the date column is a datetime, simply change all references to updated_at in my queries to date(updated_at) like so:

month case:

select date(updated_at), sum(click_sum) from aggregated_clicks
  where date(updated_at) >= date_sub(date_sub(curdate(), interval (extract(day from curdate())-1) day), interval 1 month)
    and date(updated_at) < date_sub(curdate(), interval (extract(day from curdate())-1) day)
  group by date(updated_at);

week case:

select date(updated_at), sum(click_sum) from aggregated_clicks
  where date(updated_at) >= date_sub(curdate(), interval (dayofweek(curdate())+6) day)
    and date(updated_at) <= date_sub(curdate(), interval dayofweek(curdate()) day)
  group by date(updated_at);

yesterday case:

select date(updated_at), sum(click_sum) from aggregated_clicks
  where date(updated_at) = date_sub(curdate(), interval 1 day)
  group by date(updated_at);
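
A possible variation, sketched here rather than tested against MySQL: keep date(updated_at) in the select and group by, but filter on the bare column with a half-open range (start <= updated_at < end). Filtering on the bare column typically lets the database use an index on updated_at, if one exists. The sketch below runs against an in-memory SQLite table as a stand-in for MySQL; the sample rows and the daily_sums helper are invented for the illustration, and the range bounds are computed in application code and passed as parameters.

```python
import sqlite3
from datetime import date


def daily_sums(conn, start, end):
    """Sum click_sum per calendar day for start <= updated_at < end."""
    query = """
        SELECT date(updated_at) AS click_date, SUM(click_sum) AS click_sum
        FROM aggregated_clicks
        WHERE updated_at >= ? AND updated_at < ?
        GROUP BY date(updated_at)
        ORDER BY click_date
    """
    return conn.execute(query, (start.isoformat(), end.isoformat())).fetchall()


# In-memory stand-in for the real table (made-up sample data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE aggregated_clicks ("
    "id INTEGER PRIMARY KEY, updated_at TEXT, click_sum INTEGER)"
)
conn.executemany(
    "INSERT INTO aggregated_clicks (updated_at, click_sum) VALUES (?, ?)",
    [
        ("2024-03-11 09:00:00", 60),
        ("2024-03-11 09:00:00", 40),  # duplicate timestamp, summed with the row above
        ("2024-03-11 23:59:59", 5),
        ("2024-03-12 00:00:00", 3),
        ("2024-03-13 08:30:00", 7),   # falls outside the range used below
    ],
)

if __name__ == "__main__":
    for row in daily_sums(conn, date(2024, 3, 11), date(2024, 3, 13)):
        print(row)
```

The exclusive upper bound means a row stamped 23:59:59 on the last day is still included, while midnight of the following day is not, so the same helper works for the month, week and yesterday cases once the two bounds are known.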
