
How to perform optimal FULL OUTER JOIN on more than two MySQL tables?

First of all, I already know that MySQL does not support SQL's FULL OUTER JOIN. However, I need the data to be joined that way. I hope I am right in saying "I need FULL OUTER JOIN"; correct me if I am wrong about that.

What is my case about? One table (users) describes the users registered to play games. The table results1 describes the results of users playing game1, the table results2 describes the results of users playing game2, and so on.
Now I want to write a query that returns the list of users and their points from all games within a given period of time. The points must be summed. The query must group the results by user_id and by date (monthly).

The big problem is that none of the tables has a full set of months (so I cannot just do LEFT or RIGHT joins). I thought about making some kind of temporary calendar table (just the years and months in the period) and then joining the tables with points (results1, results2, results3, and so on) to that calendar table. But this kind of solution seems quite complicated as well. Any other ideas?
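For reference, the calendar idea could look roughly like the sketch below. The table name calendar_months and its column month_start are hypothetical (they are not part of my schema), and the sketch only covers results1 and the year 2014. Storing the first day of each month as a DATE lets the join use a plain date range instead of comparing formatted strings.

```sql
-- Hypothetical helper table: one row per month of the reporting period.
CREATE TABLE IF NOT EXISTS calendar_months (
    month_start DATE NOT NULL PRIMARY KEY  -- first day of the month
);

INSERT IGNORE INTO calendar_months (month_start) VALUES
    ('2014-01-01'), ('2014-02-01'), ('2014-03-01'), ('2014-04-01'),
    ('2014-05-01'), ('2014-06-01'), ('2014-07-01'), ('2014-08-01'),
    ('2014-09-01'), ('2014-10-01'), ('2014-11-01'), ('2014-12-01');

-- Every month of the period is returned, with 0 points
-- for months in which results1 has no rows.
SELECT DATE_FORMAT(c.month_start, '%Y-%m') AS `date`,
       COALESCE(SUM(r1.points), 0)         AS points
FROM calendar_months AS c
LEFT JOIN results1 AS r1
       ON r1.`date` >= c.month_start
      AND r1.`date` <  c.month_start + INTERVAL 1 MONTH
GROUP BY c.month_start
ORDER BY c.month_start;
```

This gives monthly totals for one game only; every further results table would need its own join or subquery, which is why this route seems complicated to me.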

My case (MySQL dump):

-- --------------------------------------------------------
-- Host:                         192.168.0.60
-- Server version:               5.5.40-cll-lve - MySQL Community Server (GPL) by Atomicorp
-- Server OS:                    Linux
-- HeidiSQL Version:             9.1.0.4867
-- --------------------------------------------------------

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET NAMES utf8mb4 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;

-- Dumping database structure for example
CREATE DATABASE IF NOT EXISTS `example` /*!40100 DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci */;
USE `example`;


-- Dumping structure for table example.results1
CREATE TABLE IF NOT EXISTS `results1` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `user_id` int(11) DEFAULT NULL,
  `points` int(11) DEFAULT NULL,
  `date` date DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

-- Dumping data for table example.results1: ~8 rows (approximately)
/*!40000 ALTER TABLE `results1` DISABLE KEYS */;
INSERT INTO `results1` (`id`, `user_id`, `points`, `date`) VALUES
    (1, 1, 5, '2014-01-17'),
    (2, 1, 5, '2014-01-18'),
    (3, 2, 10, '2014-02-17'),
    (4, 9, 8, '2014-03-17'),
    (5, 1, 15, '2014-07-17'),
    (6, 3, 9, '2014-10-17'),
    (7, 1, 20, '2015-02-17'),
    (8, 5, 10, '2014-06-17');
/*!40000 ALTER TABLE `results1` ENABLE KEYS */;


-- Dumping structure for table example.results2
CREATE TABLE IF NOT EXISTS `results2` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `user_id` int(11) DEFAULT NULL,
  `points` int(11) DEFAULT NULL,
  `date` date DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci ROW_FORMAT=COMPACT;

-- Dumping data for table example.results2: ~8 rows (approximately)
/*!40000 ALTER TABLE `results2` DISABLE KEYS */;
INSERT INTO `results2` (`id`, `user_id`, `points`, `date`) VALUES
    (1, 1, 50, '2014-01-01'),
    (2, 2, 35, '2014-01-02'),
    (3, 3, 14, '2014-01-03'),
    (4, 4, 18, '2014-06-01'),
    (5, 5, 16, '2014-06-01'),
    (6, 5, 16, '2014-06-02'),
    (7, 6, 4, '2014-10-29'),
    (8, 1, 20, '2014-01-16');
/*!40000 ALTER TABLE `results2` ENABLE KEYS */;


-- Dumping structure for table example.results3
CREATE TABLE IF NOT EXISTS `results3` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `user_id` int(11) DEFAULT NULL,
  `points` int(11) DEFAULT NULL,
  `date` date DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci ROW_FORMAT=COMPACT;

-- Dumping data for table example.results3: ~3 rows (approximately)
/*!40000 ALTER TABLE `results3` DISABLE KEYS */;
INSERT INTO `results3` (`id`, `user_id`, `points`, `date`) VALUES
    (1, 9, 6, '2014-12-17'),
    (2, 1, 10, '2014-01-01'),
    (3, 1, 2, '2014-10-17'),
    (4, 1, 8, '2014-01-03');
/*!40000 ALTER TABLE `results3` ENABLE KEYS */;


-- Dumping structure for table example.results4
CREATE TABLE IF NOT EXISTS `results4` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `user_id` int(11) DEFAULT NULL,
  `points` int(11) DEFAULT NULL,
  `date` date DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci ROW_FORMAT=COMPACT;

-- Dumping data for table example.results4: ~2 rows (approximately)
/*!40000 ALTER TABLE `results4` DISABLE KEYS */;
INSERT INTO `results4` (`id`, `user_id`, `points`, `date`) VALUES
    (1, 4, 41, '2015-03-17'),
    (2, 1, 2, '2014-12-17');
/*!40000 ALTER TABLE `results4` ENABLE KEYS */;


-- Dumping structure for table example.users
CREATE TABLE IF NOT EXISTS `users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(1000) COLLATE utf8_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

-- Dumping data for table example.users: ~10 rows (approximately)
/*!40000 ALTER TABLE `users` DISABLE KEYS */;
INSERT INTO `users` (`id`, `name`) VALUES
    (1, 'Sophie'),
    (2, 'Joshua'),
    (3, 'Isabelle'),
    (4, 'Jack'),
    (5, 'Emily'),
    (6, 'Harry'),
    (7, 'Olivia'),
    (8, 'Oliver'),
    (9, 'Lily'),
    (10, 'Charlie');
/*!40000 ALTER TABLE `users` ENABLE KEYS */;
/*!40101 SET SQL_MODE=IFNULL(@OLD_SQL_MODE, '') */;
/*!40014 SET FOREIGN_KEY_CHECKS=IF(@OLD_FOREIGN_KEY_CHECKS IS NULL, 1, @OLD_FOREIGN_KEY_CHECKS) */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;

The query I have for joining two tables:

    SELECT

        r1_r2.user_id, r1_r2.points, r1_r2.`date`

    FROM (

        SELECT
                IFNULL(rr1.user_id, rr2.user_id) as user_id, (IFNULL(rr1.points,0) + IFNULL(rr2.points,0)) as points, IFNULL(rr1.`date`, rr2.`date`) as `date`
            FROM (
                SELECT
                        r1.user_id, SUM(r1.points) as points, DATE_FORMAT(r1.`date`, '%Y-%m') as `date`
                    FROM results1 as r1
                    GROUP BY r1.user_id, DATE_FORMAT(r1.`date`, '%Y-%m')
            ) as rr1
            LEFT JOIN (
                SELECT
                        r2.user_id, SUM(r2.points) as points, DATE_FORMAT(r2.`date`, '%Y-%m') as `date`
                    FROM results2 as r2
                    GROUP BY r2.user_id, DATE_FORMAT(r2.`date`, '%Y-%m')
            ) as rr2 ON (rr1.user_id = rr2.user_id AND rr1.`date` = rr2.`date`)


        UNION


        SELECT
                IFNULL(rl1.user_id, rl2.user_id) as user_id, (IFNULL(rl1.points,0) + IFNULL(rl2.points,0)) as points, IFNULL(rl1.`date`, rl2.`date`) as `date`
            FROM (
                SELECT
                        r1.user_id, SUM(r1.points) as points, DATE_FORMAT(r1.`date`, '%Y-%m') as `date`
                    FROM results1 as r1
                    GROUP BY r1.user_id, DATE_FORMAT(r1.`date`, '%Y-%m')
            ) as rl1
            RIGHT JOIN (
                SELECT
                        r2.user_id, SUM(r2.points) as points, DATE_FORMAT(r2.`date`, '%Y-%m') as `date`
                    FROM results2 as r2
                    GROUP BY r2.user_id, DATE_FORMAT(r2.`date`, '%Y-%m')
            ) as rl2 ON (rl1.user_id = rl2.user_id AND rl1.`date` = rl2.`date`)

    ) as r1_r2

HAVING
    r1_r2.`date` BETWEEN '2014-01' AND '2014-12'
ORDER BY 
    r1_r2.user_id ASC, r1_r2.`date` ASC

It works for two tables (results1 and results2), but the problem is that I need more than two tables to be joined the same way.

I have a kind of solution for that (nesting the joins again and again), but it becomes very complicated as well: long, and hard to read and understand. Also, additional tables are likely to come up in the very near future. What will the query look like after 3 or 5 more tables are added? If I keep nesting joins the same way, the whole query will become harder and harder to read, understand and modify.

Here is the query for three tables joined (results1, results2, results3):

SELECT

        r1_r2_r3.user_id, r1_r2_r3.points, r1_r2_r3.`date`

    FROM (

        SELECT

                IFNULL(r1_r2_l.user_id, r3_l.user_id) as user_id, (IFNULL(r1_r2_l.points,0) + IFNULL(r3_l.points,0)) as points, IFNULL(r1_r2_l.`date`, r3_l.`date`) as `date`

            FROM (

                # BEGIN. RESULT FROM BEFORE

                SELECT

                    r1_r2.user_id, r1_r2.points, r1_r2.`date`

                FROM (

                    SELECT
                            IFNULL(rr1.user_id, rr2.user_id) as user_id, (IFNULL(rr1.points,0) + IFNULL(rr2.points,0)) as points, IFNULL(rr1.`date`, rr2.`date`) as `date`
                        FROM (
                            SELECT
                                    r1.user_id, SUM(r1.points) as points, DATE_FORMAT(r1.`date`, '%Y-%m') as `date`
                                FROM results1 as r1
                                GROUP BY r1.user_id, DATE_FORMAT(r1.`date`, '%Y-%m')
                        ) as rr1
                        LEFT JOIN (
                            SELECT
                                    r2.user_id, SUM(r2.points) as points, DATE_FORMAT(r2.`date`, '%Y-%m') as `date`
                                FROM results2 as r2
                                GROUP BY r2.user_id, DATE_FORMAT(r2.`date`, '%Y-%m')
                        ) as rr2 ON (rr1.user_id = rr2.user_id AND rr1.`date` = rr2.`date`)


                    UNION


                    SELECT
                            IFNULL(rl1.user_id, rl2.user_id) as user_id, (IFNULL(rl1.points,0) + IFNULL(rl2.points,0)) as points, IFNULL(rl1.`date`, rl2.`date`) as `date`
                        FROM (
                            SELECT
                                    r1.user_id, SUM(r1.points) as points, DATE_FORMAT(r1.`date`, '%Y-%m') as `date`
                                FROM results1 as r1
                                GROUP BY r1.user_id, DATE_FORMAT(r1.`date`, '%Y-%m')
                        ) as rl1
                        RIGHT JOIN (
                            SELECT
                                    r2.user_id, SUM(r2.points) as points, DATE_FORMAT(r2.`date`, '%Y-%m') as `date`
                                FROM results2 as r2
                                GROUP BY r2.user_id, DATE_FORMAT(r2.`date`, '%Y-%m')
                        ) as rl2 ON (rl1.user_id = rl2.user_id AND rl1.`date` = rl2.`date`)
                ) as r1_r2
                # END. RESULT FROM BEFORE
            ) as r1_r2_l
            LEFT JOIN (
                SELECT
                        r3.user_id, SUM(r3.points) as points, DATE_FORMAT(r3.`date`, '%Y-%m') as `date`
                    FROM results3 as r3
                    GROUP BY r3.user_id, DATE_FORMAT(r3.`date`, '%Y-%m')
            ) as r3_l ON (r1_r2_l.user_id = r3_l.user_id AND r1_r2_l.`date` = r3_l.`date`)


    UNION


        SELECT

                IFNULL(r1_r2_r.user_id, r3_r.user_id) as user_id, (IFNULL(r1_r2_r.points,0) + IFNULL(r3_r.points,0)) as points, IFNULL(r1_r2_r.`date`, r3_r.`date`) as `date`

            FROM (

                # BEGIN. RESULT FROM BEFORE

                SELECT

                    r1_r2.user_id, r1_r2.points, r1_r2.`date`

                FROM (

                    SELECT
                            IFNULL(rr1.user_id, rr2.user_id) as user_id, (IFNULL(rr1.points,0) + IFNULL(rr2.points,0)) as points, IFNULL(rr1.`date`, rr2.`date`) as `date`
                        FROM (
                            SELECT
                                    r1.user_id, SUM(r1.points) as points, DATE_FORMAT(r1.`date`, '%Y-%m') as `date`
                                FROM results1 as r1
                                GROUP BY r1.user_id, DATE_FORMAT(r1.`date`, '%Y-%m')
                        ) as rr1
                        LEFT JOIN (
                            SELECT
                                    r2.user_id, SUM(r2.points) as points, DATE_FORMAT(r2.`date`, '%Y-%m') as `date`
                                FROM results2 as r2
                                GROUP BY r2.user_id, DATE_FORMAT(r2.`date`, '%Y-%m')
                        ) as rr2 ON (rr1.user_id = rr2.user_id AND rr1.`date` = rr2.`date`)


                    UNION


                    SELECT
                            IFNULL(rl1.user_id, rl2.user_id) as user_id, (IFNULL(rl1.points,0) + IFNULL(rl2.points,0)) as points, IFNULL(rl1.`date`, rl2.`date`) as `date`
                        FROM (
                            SELECT
                                    r1.user_id, SUM(r1.points) as points, DATE_FORMAT(r1.`date`, '%Y-%m') as `date`
                                FROM results1 as r1
                                GROUP BY r1.user_id, DATE_FORMAT(r1.`date`, '%Y-%m')
                        ) as rl1
                        RIGHT JOIN (
                            SELECT
                                    r2.user_id, SUM(r2.points) as points, DATE_FORMAT(r2.`date`, '%Y-%m') as `date`
                                FROM results2 as r2
                                GROUP BY r2.user_id, DATE_FORMAT(r2.`date`, '%Y-%m')
                        ) as rl2 ON (rl1.user_id = rl2.user_id AND rl1.`date` = rl2.`date`)
                ) as r1_r2
                # END. RESULT FROM BEFORE
            ) as r1_r2_r
            RIGHT JOIN (
                SELECT
                        r3.user_id, SUM(r3.points) as points, DATE_FORMAT(r3.`date`, '%Y-%m') as `date`
                    FROM results3 as r3
                    GROUP BY r3.user_id, DATE_FORMAT(r3.`date`, '%Y-%m')
            ) as r3_r ON (r1_r2_r.user_id = r3_r.user_id AND r1_r2_r.`date` = r3_r.`date`)


    ) as r1_r2_r3

HAVING
    r1_r2_r3.`date` BETWEEN '2014-01' AND '2014-12'
ORDER BY 
    r1_r2_r3.user_id ASC, r1_r2_r3.`date` ASC

I think you can see what I mean: the query becomes very hard to understand if we keep joining more tables the same way.

By the way, this is just a simplified version of the real situation. In reality, results1, results2, results3 and results4 are themselves the results of joining other tables and calculating values in between. So the final query I have to work on is much more complicated than the examples above.

My question is: can I make the query for joining more than two tables shorter and easier to understand?

You can get this result with union all and aggregation, without any FULL OUTER JOIN. I think the following does what you want:

-- u.id is selected instead of r.user_id so that users without any results keep their id
select u.id as user_id, year(r.date), month(r.date), sum(r.points1) as points1,
       sum(r.points2) as points2, sum(r.points3) as points3
from users u left join
     ((select r1.user_id, r1.date, r1.points as points1, NULL as points2, NULL as points3
       from results1 r1
      ) union all
      (select r2.user_id, r2.date, NULL as points1, r2.points as points2, NULL as points3
       from results2 r2
      ) union all
      (select r3.user_id, r3.date, NULL as points1, NULL as points2, r3.points as points3
       from results3 r3
      )
     ) r
     on u.id = r.user_id
group by u.id, year(r.date), month(r.date);
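If you only need one summed total per user and month, as in the queries from the question, the same idea can be sketched more compactly. This is a variation under a few assumptions, not a tested drop-in: it uses an inner join so that only users with at least one result in the period appear (matching the output of the emulated full outer join), it assumes the reporting period is the calendar year 2014, and the alias `month` is my own choice.

```sql
SELECT u.id                           AS user_id,
       u.name,
       DATE_FORMAT(r.`date`, '%Y-%m') AS `month`,
       SUM(r.points)                  AS points
FROM users AS u
JOIN (
    -- Stack all result tables; a new game only needs one more UNION ALL branch.
    SELECT user_id, `date`, points FROM results1
    UNION ALL
    SELECT user_id, `date`, points FROM results2
    UNION ALL
    SELECT user_id, `date`, points FROM results3
    UNION ALL
    SELECT user_id, `date`, points FROM results4
) AS r ON r.user_id = u.id
-- Assumed period: all of 2014, written as a half-open date range.
WHERE r.`date` >= '2014-01-01'
  AND r.`date` <  '2015-01-01'
GROUP BY u.id, u.name, DATE_FORMAT(r.`date`, '%Y-%m')
ORDER BY u.id, `month`;
```

With the sample data from the question, user 1 (Sophie) should come out with 98 points for 2014-01: 10 from results1, 70 from results2 and 18 from results3. Filtering on the raw date column in WHERE, rather than on the formatted month in HAVING, restricts the rows before they are aggregated.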
