
Grouping to avoid returning rows with the same set of column values between rows where one of those values changes

I've got a table containing post edit messages, with foreign keys relating an edit message to its respective post and author:

post_id   author_id  edit_message                            date
1         1          "first author's first edit to post 1"   2018-03-19 12:00:00
1         1          "first author's second edit to post 1"  2018-03-19 12:05:00
2         1          "first author's first edit to post 2"   2018-03-19 12:10:00
1         1          "first author's third edit to post 1"   2018-03-19 12:15:00
1         2          "second author's first edit to post 1"  2018-03-19 12:20:00
1         1          "first author's fourth edit to post 1"  2018-03-19 12:25:00

Sample data:

CREATE TABLE IF NOT EXISTS `post_edits` (
  `post_id` int(6) unsigned NOT NULL,
  `author_id` int(3) unsigned NOT NULL,
  `edit_message` varchar(200) NOT NULL,
  `post_date` DATE NOT NULL
) DEFAULT CHARSET=utf8;
INSERT INTO `post_edits` (`post_id`, `author_id`, `edit_message`, `post_date`) VALUES
  ("1", "1", "first author's first edit to post 1", "2018-03-19 12:00:00"),
  ("1", "1", "first author's second edit to post 1", "2018-03-19 12:05:00"),
  ("2", "1", "first author's first edit to post 2", "2018-03-19 12:10:00"),
  ("1", "1", "first author's third edit to post 1", "2018-03-19 12:15:00"),
  ("1", "2", "second author's first edit to post 1", "2018-03-19 12:20:00"),
  ("1", "1", "first author's fourth edit to post 1", "2018-03-19 12:25:00");

And an SQLFiddle of the same.

I'd like to get a list of edit messages ordered by date, grouped so that I get only the latest edit message made for a particular post by a particular author, along with a count of how many other edit messages there are for the same post and author since another author edited that post or the same author edited another post. The returned rows would look like:

post_id   author_id  edit_message                            date                 edits_between
1         1          "first author's fourth edit to post 1"  2018-03-19 12:25:00  0
1         2          "second author's first edit to post 1"  2018-03-19 12:20:00  0
1         1          "first author's third edit to post 1"   2018-03-19 12:15:00  0
2         1          "first author's first edit to post 2"   2018-03-19 12:10:00  0
1         1          "first author's second edit to post 1"  2018-03-19 12:05:00  1

Notice that the first row in the database is not returned, because the second row is a newer edit by the same author to the same post. The edits_between column counts how many rows with the same post_id and author_id were not returned due to this criterion. The idea behind this is that I can display a list of recent edit messages like:

Latest edits:
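  1. 2018-03-19 12:25:00 to post id 1: "first author's fourth edit to post 1"
  2. 2018-03-19 12:20:00 to post id 1: "second author's first edit to post 1"
  3. 2018-03-19 12:15:00 to post id 1: "first author's third edit to post 1"
  4. 2018-03-19 12:10:00 to post id 2: "first author's first edit to post 2"
  5. 2018-03-19 12:05:00 to post id 1: "first author's second edit to post 1" (+1 previous)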

The (+1 previous) addendum shows that one older message has been skipped.

This is what I've got so far, but it groups the selected rows by post and author without any regard to the order in which edits were made by different authors or at different times.

SELECT post_id, MAX(post_date), author_id, COUNT(1) AS edits_between
FROM post_edits
GROUP BY post_id, author_id
ORDER BY post_date DESC
LIMIT 10

This looks like:

Latest edits:
  1. 2018-03-19 to post id 1: "second author's first edit to post 1" (+1 edit)
  2. 2018-03-19 to post id 1: "first author's first edit to post 1" (+4 edits)
  3. 2018-03-19 to post id 2: "first author's first edit to post 2" (+1 edit)

I guess the solution involves some sort of GROUP BY has_the_author_or_post_changed clause, but I don't know how to implement this in SQL.

If you look at the SQLFiddle query I've set up for you, GROUP_CONCAT in combination with GROUP BY does more or less what you need.

GROUP_CONCAT without the SEPARATOR option produces comma-separated values.
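A tiny illustration of the separator behaviour, using a made-up table (group_concat_demo and its columns are assumptions for this sketch only, not part of the fiddle):

```sql
-- Hypothetical demo table, only here to show the separator behaviour.
CREATE TABLE group_concat_demo (
  grp INT NOT NULL,
  val VARCHAR(10) NOT NULL
);

INSERT INTO group_concat_demo (grp, val) VALUES
  (1, 'a'),
  (1, 'b'),
  (2, 'c');

SELECT grp,
       -- No SEPARATOR given: values are joined with a comma, e.g. 'a,b'
       GROUP_CONCAT(val ORDER BY val) AS default_separator,
       -- Explicit SEPARATOR: values are joined with ' // ', e.g. 'a // b'
       GROUP_CONCAT(val ORDER BY val SEPARATOR ' // ') AS custom_separator
FROM group_concat_demo
GROUP BY grp;
```

The ORDER BY inside GROUP_CONCAT is optional; it is included here so the order of the concatenated values is deterministic.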

To get the result you want, use GROUP_CONCAT to list multiple values in one row for each grouped column, as seen in the Posts and EditMessages columns. The extra step is setting up a subquery that combines multiple queries into one, so that GROUP_CONCAT compiles the information you'd like to see together, as you can see from my example.

The method shown here is not the complete "working" example YOU will need but instead a "minimal, working" prototype query with the method required in order to best aggregate the values you are joining in the column specified.
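One possible way to go from the concatenated list to only the newest message per group is to order inside GROUP_CONCAT and cut the string at the first separator. This is only a sketch under assumptions: the edits_demo table and its column names are invented for the example, the timestamp column is a DATETIME so the time of day is kept, and no message contains the separator string.

```sql
-- Hypothetical table for illustration; names are not taken from the fiddle.
CREATE TABLE edits_demo (
  post_id      INT UNSIGNED NOT NULL,
  author_id    INT UNSIGNED NOT NULL,
  edit_message VARCHAR(200) NOT NULL,
  edited_at    DATETIME NOT NULL
);

INSERT INTO edits_demo (post_id, author_id, edit_message, edited_at) VALUES
  (1, 1, 'first edit',   '2018-03-19 12:00:00'),
  (1, 1, 'second edit',  '2018-03-19 12:05:00'),
  (1, 2, 'other author', '2018-03-19 12:20:00');

SELECT post_id,
       author_id,
       MAX(edited_at) AS latest_edit,
       -- Newest message first, then keep everything before the first ' // '
       SUBSTRING_INDEX(
         GROUP_CONCAT(edit_message ORDER BY edited_at DESC SEPARATOR ' // '),
         ' // ', 1
       ) AS latest_message,
       -- Edits in the group other than the newest one
       COUNT(*) - 1 AS older_edits
FROM edits_demo
GROUP BY post_id, author_id
ORDER BY latest_edit DESC;
```

Note that this still forms one group per post and author over the whole table, so it does not split a group when another author or another post comes in between. GROUP_CONCAT output is also truncated at group_concat_max_len (1024 bytes by default), which is harmless here only because just the first element is used.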

As an illustration of this method, you may use:

SELECT GROUP_CONCAT(user_ids SEPARATOR " // ") AS Users

to see what is happening with the fields. I purposely left this out of the sample query below for the sake of clarity and simplicity.

SQL Fiddle with SELECT GROUP_CONCAT(user_ids SEPARATOR " // ") AS Users

SQL Fiddle without the Users GROUP_CONCAT

MySQL 5.6 Schema Setup:

CREATE TABLE IF NOT EXISTS post_edits (
  `posts_id` int(6) unsigned NOT NULL,
  `user_ids` int(3) unsigned NOT NULL,
  `edit_message` varchar(200) NOT NULL,
  `posts_texts` varchar(200) NOT NULL,
  `posts_date` DATE NOT NULL) DEFAULT CHARSET=utf8;
INSERT INTO post_edits (
  `posts_id`, 
  `user_ids`, 
  `edit_message`, 
  `posts_texts`, 
  `posts_date`) 
  VALUES
  ("1", "1", " First author's first edit to Post 1 ", " Author 1 Edit 1  ", "2018-03-19 12:00:00"),
  ("1", "1", " First author's second edit to Post 1 ", " Author 1 Edit 2  ", "2018-03-19 12:05:00"),
  ("2", "1", " First author's first edit to Post 2 ", " Author 1 Edit 1  ", "2018-03-19 12:10:00"),
  ("1", "1", " First author's third edit to Post 1 ", " Author 1 Edit 3  ", "2018-03-19 12:15:00"),
  ("1", "2", " Second author's first edit to Post 1 ", " Author 2 Edit 1   ", "2018-03-19 12:20:00"),
  ("1", "1", " First author's fourth edit to Post 1 ", " Author 1 Edit 4  ", "2018-03-19 12:25:00"),
  ("2", "1", " First author's second edit to Post 2 ", " Author 1 Edit 2   ", "2018-03-19 12:45:00"),
  ("2", "2", " Second author's first edit to Post 2 ", " Author 2 Edit 1  ", "2018-03-19 12:55:00"),
  ("2", "2", " Second author's second edit to Post 2 ", " Author 2 Edit 1  ", "2018-03-19 13:05:00"),
  ("1", "2", " Second author's second edit to Post 1 ", " Author 2 Edit 2  ", "2018-03-19 13:20:00");

Query 1:

SELECT posts_date,
  posts_id,
  user_ids,
  GROUP_CONCAT(posts_texts SEPARATOR ' //') AS Posts,
  GROUP_CONCAT(edit_message SEPARATOR ' //') AS EditMessages,
  GROUP_CONCAT(user_ids SEPARATOR ' // ') AS Users,
  COUNT(1) AS edits_between
FROM post_edits
GROUP BY posts_id, user_ids
ORDER BY posts_date DESC, user_ids;

Results:

| posts_date | posts_id | user_ids |                                                                             Posts |                                                                                                                                                    EditMessages |            Users | edits_between |
|------------|----------|----------|-----------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------|---------------|
| 2018-03-19 |        1 |        1 |  Author 1 Edit 1   // Author 1 Edit 2   // Author 1 Edit 3   // Author 1 Edit 4   |  First author's first edit to Post 1  // First author's second edit to Post 1  // First author's third edit to Post 1  // First author's fourth edit to Post 1  | 1 // 1 // 1 // 1 |             4 |
| 2018-03-19 |        2 |        1 |                                           Author 1 Edit 1   // Author 1 Edit 2    |                                                                                   First author's first edit to Post 2  // First author's second edit to Post 2  |           1 // 1 |             2 |
| 2018-03-19 |        1 |        2 |                                           Author 2 Edit 1    // Author 2 Edit 2   |                                                                                 Second author's first edit to Post 1  // Second author's second edit to Post 1  |           2 // 2 |             2 |
| 2018-03-19 |        2 |        2 |                                            Author 2 Edit 1   // Author 2 Edit 1   |                                                                                 Second author's first edit to Post 2  // Second author's second edit to Post 2  |           2 // 2 |             2 |
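
For the run-based grouping the question describes (a new group every time the post or the author changes in time order), here is a rough sketch of the usual "gaps and islands" approach. It is not what the fiddle above does, and it rests on several assumptions: MySQL 8.0 or later (for window functions and CTEs), a DATETIME column instead of DATE so that edits made on the same day stay ordered, and unique timestamps. The table name post_edits_demo is made up for the example.

```sql
-- Hypothetical copy of the question's sample data, with DATETIME instead of DATE.
CREATE TABLE post_edits_demo (
  post_id      INT UNSIGNED NOT NULL,
  author_id    INT UNSIGNED NOT NULL,
  edit_message VARCHAR(200) NOT NULL,
  post_date    DATETIME NOT NULL
);

INSERT INTO post_edits_demo (post_id, author_id, edit_message, post_date) VALUES
  (1, 1, 'first author''s first edit to post 1',  '2018-03-19 12:00:00'),
  (1, 1, 'first author''s second edit to post 1', '2018-03-19 12:05:00'),
  (2, 1, 'first author''s first edit to post 2',  '2018-03-19 12:10:00'),
  (1, 1, 'first author''s third edit to post 1',  '2018-03-19 12:15:00'),
  (1, 2, 'second author''s first edit to post 1', '2018-03-19 12:20:00'),
  (1, 1, 'first author''s fourth edit to post 1', '2018-03-19 12:25:00');

WITH flagged AS (
  SELECT e.*,
         -- 0 when the previous edit (in time order) has the same post and
         -- author, 1 otherwise; the very first row compares against NULL
         -- and therefore also gets 1.
         CASE
           WHEN LAG(post_id)   OVER (ORDER BY post_date) = post_id
            AND LAG(author_id) OVER (ORDER BY post_date) = author_id
           THEN 0 ELSE 1
         END AS starts_run
  FROM post_edits_demo AS e
),
runs AS (
  SELECT f.*,
         -- Running total of run starts gives each consecutive run an id.
         SUM(starts_run) OVER (ORDER BY post_date) AS run_id
  FROM flagged AS f
),
ranked AS (
  SELECT r.*,
         -- rn = 1 marks the newest edit inside each run.
         ROW_NUMBER() OVER (PARTITION BY run_id ORDER BY post_date DESC) AS rn,
         -- Number of older edits in the same run that will be hidden.
         COUNT(*) OVER (PARTITION BY run_id) - 1 AS edits_between
  FROM runs AS r
)
SELECT post_id, author_id, edit_message, post_date, edits_between
FROM ranked
WHERE rn = 1
ORDER BY post_date DESC
LIMIT 10;
```

On the six sample rows this should return the five rows listed as the desired output in the question, with edits_between = 1 only for the 12:05:00 row. On MySQL 5.6, which has no window functions, the same idea would need a different technique.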


 