Given a table of:
invitations (id, created_at, type, email)
Given the following query:
SELECT DISTINCT email_address
FROM invitations
WHERE type = 'email'
AND date(created_at) in (curdate() - interval 3 day,
curdate() - interval 8 day,
curdate() - interval 13 day,
curdate() - interval 21 day,
curdate() - interval 34 day,
curdate() - interval 50 day);
With the query above, I get several hundred records. What I would like is to get the records grouped by email, meaning each unique email should appear at most once.
I therefore added:
GROUP BY email_address
This returns only 2 records when I run the query. Am I using GROUP BY incorrectly?
There's no point in writing SELECT DISTINCT email ... GROUP BY email.
Use either DISTINCT or GROUP BY; each on its own achieves the same result.
If you got two email addresses out as a result, then the set of records selected contained only two unique email addresses.
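To make the point concrete, here is a minimal sketch that runs all three forms of the query against a tiny in-memory table. It assumes SQLite (via Python's standard sqlite3 module) as a stand-in for MySQL, and the rows are invented sample data, not the asker's.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invitations (id INTEGER PRIMARY KEY, type TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO invitations (type, email) VALUES (?, ?)",
    [
        ("email", "a@example.com"),
        ("email", "a@example.com"),  # duplicate address on purpose
        ("email", "b@example.com"),
        ("sms", "c@example.com"),    # filtered out by the WHERE clause
    ],
)

where = "FROM invitations WHERE type = 'email'"
queries = {
    "distinct": f"SELECT DISTINCT email {where} ORDER BY email",
    "group_by": f"SELECT email {where} GROUP BY email ORDER BY email",
    "both": f"SELECT DISTINCT email {where} GROUP BY email ORDER BY email",
}
results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}

# All three forms return the same two unique addresses.
print(results["distinct"] == results["group_by"] == results["both"])  # prints True
```

So if adding GROUP BY changes the row count, the difference is more likely to come from the data or from another part of the query than from GROUP BY itself.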
GROUP BY is used to aggregate data where some combination of columns is the same. Any other selected columns outside this combination must be wrapped in an aggregation function such as AVG, MAX or SUM.
An example:
SELECT city, ethnicity, MAX(age), AVG(salary)
FROM people
GROUP BY city, ethnicity
It will present a unique set of city/ethnicity pairs, together with the oldest age and the average salary for each group.
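As a rough illustration of what that query returns, the sketch below runs it on a few invented rows. SQLite (through Python's sqlite3 module) is assumed here in place of MySQL, and the table contents and place names are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the people rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (city TEXT, ethnicity TEXT, age INTEGER, salary REAL)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        ("Leeds", "A", 30, 100.0),
        ("Leeds", "A", 40, 200.0),
        ("Leeds", "B", 25, 150.0),
        ("York", "A", 50, 300.0),
    ],
)

# One output row per distinct (city, ethnicity) pair; the other columns
# are collapsed by the aggregate functions.
rows = conn.execute(
    "SELECT city, ethnicity, MAX(age), AVG(salary) "
    "FROM people GROUP BY city, ethnicity ORDER BY city, ethnicity"
).fetchall()

for row in rows:
    print(row)
# ('Leeds', 'A', 40, 150.0)
# ('Leeds', 'B', 25, 150.0)
# ('York', 'A', 50, 300.0)
```

Four input rows collapse to three, because the two Leeds/A rows form a single group.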
DISTINCT is less capable than GROUP BY, as it doesn't let you run aggregation functions on other columns; it just returns a unique set of records across all selected columns. I consider it a convenient shorthand. These two queries are equivalent:
SELECT DISTINCT city, ethnicity
FROM people;
-- is equivalent to:
SELECT city, ethnicity
FROM people
GROUP BY city, ethnicity; -- more to type
I'm assuming here that you want the records group-concatenated as such, or similar, based on your inclusion of the created_at field:
SELECT email, GROUP_CONCAT( DISTINCT created_at ORDER BY created_at DESC ) AS list_of_dates FROM invitations GROUP BY email;
to produce something along the lines of:
email | list_of_dates
==========================================================
jenny@hotmail.com | 01/26/2019, 01/23/2019
jonny@gmail.com | 01/23/2019, 01/18/2019, 01/13/2019
greg@yahoo.com | 01/13/2019
Or, alternatively:
SELECT created_at, GROUP_CONCAT( DISTINCT email ORDER BY created_at DESC SEPARATOR '; ' ) AS list_of_emails FROM invitations GROUP BY created_at;
created_at | list_of_emails
====================================
01/26/2019 | jenny@hotmail.com
01/23/2019 | jenny@hotmail.com; jonny@gmail.com
01/18/2019 | jonny@gmail.com
01/13/2019 | jonny@gmail.com; greg@yahoo.com
With GROUP_CONCAT(), the GROUP BY clause is optional. When present, it splits the result into one concatenated row per group. If no GROUP BY clause is specified, the whole result set is treated as one group and everything is concatenated into a single row.
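The difference can be seen in a small sketch. It assumes SQLite's group_concat() as a stand-in for MySQL's GROUP_CONCAT() (the MySQL-specific ORDER BY and SEPARATOR clauses are left out), and the rows are sample data modelled on the output above.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows mirror the sample output above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invitations (email TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO invitations VALUES (?, ?)",
    [
        ("jenny@hotmail.com", "2019-01-26"),
        ("jenny@hotmail.com", "2019-01-23"),
        ("jonny@gmail.com", "2019-01-23"),
        ("greg@yahoo.com", "2019-01-13"),
    ],
)

# No GROUP BY: the whole table is one group, so exactly one row comes back.
single = conn.execute("SELECT group_concat(created_at) FROM invitations").fetchall()

# GROUP BY email: one row, with its own concatenated list, per email.
per_email = conn.execute(
    "SELECT email, group_concat(created_at) FROM invitations GROUP BY email"
).fetchall()

print(len(single), len(per_email))  # prints: 1 3
```

The order of values inside each concatenated string is not guaranteed without an explicit ORDER BY, which is why the MySQL queries above specify one.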
The concatenated result is truncated at group_concat_max_len, which defaults to 1024. Depending on how many records there are and how long the concatenation might become, you can raise that limit by setting the session variable:
SET SESSION group_concat_max_len=<max length>
Either way, use GROUP BY on whichever field you want to summarize by, and put the others inside GROUP_CONCAT. Hopefully this helps.