
How to join two tables without duplicates

I have an income table that looks like this:

date              income      
---------------------------
09/05/13          56000    
09/05/13          66600
09/05/13          50000

And an expense table that looks like this:

date              expense 
----------------------------
09/05/13          68800

I want to write a query whose output looks like this:

date              income             expense 
---------------------------------------------
09/05/13          56000              68800
09/05/13          66600
09/05/13          50000

with each value from income.income and each value from expense.expense appearing only once. (If I do a simple join, then each one will appear three times, since income.date and expense.date have duplicate values.)

If you try this without any unique ids, the whole approach is flawed: nothing tells the three income rows apart, so a join has no way of knowing which of them the expense belongs to. Add a unique id column to each table and join on that.

If your tables are structured like the ones below, you can write the query with a simple equi-join on id.

Income_tbl:

date              income  id    
---------------------------
09/05/13          56000   1 
09/05/13          66600   2
09/05/13          50000   3

Expense_tbl:

date              expense  id
----------------------------
09/05/13          68800    1
09/05/13                   2
09/05/13                   3
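
A minimal sketch of that equi-join, for illustration only. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server, which is an assumption rather than part of the question. The column name date_col and the function name join_on_id are likewise assumed; the rows are the ones shown in the two tables above.

```python
import sqlite3

def join_on_id():
    """Join the two tables on their shared id column (SQLite stand-in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Income_tbl  (date_col TEXT, income  INTEGER, id INTEGER PRIMARY KEY);
        CREATE TABLE Expense_tbl (date_col TEXT, expense INTEGER, id INTEGER PRIMARY KEY);
        INSERT INTO Income_tbl  VALUES ('09/05/13', 56000, 1),
                                       ('09/05/13', 66600, 2),
                                       ('09/05/13', 50000, 3);
        -- ids 2 and 3 exist in Expense_tbl but carry no expense value
        INSERT INTO Expense_tbl VALUES ('09/05/13', 68800, 1),
                                       ('09/05/13', NULL,  2),
                                       ('09/05/13', NULL,  3);
    """)
    rows = conn.execute("""
        SELECT i.date_col, i.income, e.expense
        FROM Income_tbl AS i
        JOIN Expense_tbl AS e ON e.id = i.id   -- one match per id, so no duplicates
        ORDER BY i.id
    """).fetchall()
    conn.close()
    return rows

for row in join_on_id():
    print(row)
```

Because each id appears once in each table, every income row pairs with exactly one expense row, and the expense value shows up only on the row with id 1.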

Alternatively, try @Brian Hoover's query, which should also work:

SELECT income.date_col, income.income, expense.expense
FROM (
        -- number the income rows 1, 2, 3, ...
        SELECT i.date_col, i.income, @curRow := @curRow + 1 AS rn
        FROM Income_tbl i
        JOIN (SELECT @curRow := 0) r
     ) AS income
JOIN (
        -- number the expense rows the same way
        SELECT e.date_col, e.expense, @curExpenseRow := @curExpenseRow + 1 AS rn
        FROM Expense_tbl e
        JOIN (SELECT @curExpenseRow := 0) r
     ) AS expense
ON income.rn = expense.rn;  -- alias is rn because ROW_NUMBER is a reserved word in MySQL 8.0+

The easiest way to do this is to calculate the sums (grouped by date) and join the results. You'll need three queries to represent the three data sets you have: a dates set, an income set and an expenses set.

First: The dates set

select distinct `date`
from (select `date` from income union select `date` from expense) as u  -- MySQL requires an alias on every derived table

Second: The income set:

select `date`, sum(i.income) as income
from income as i
group by `date`

Third: The expenses set:

select `date`, sum(e.expense) as expense
from expense as e
group by `date`

Finally: Put it all together:

select 
    d.date, i.income, e.expense
from
    (
        select distinct `date` 
        from (select `date` from income union select `date` from expense) as u
    ) as d
    left join (
        select `date`, sum(i.income) as income
        from income as i
        group by `date`
    ) as i on d.`date` = i.`date`
    left join (
        select `date`, sum(e.expense) as expense
        from expense as e
        group by `date`
    ) as e on d.`date` = e.`date`
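
A small runnable sketch of the combined query, offered as an illustration rather than a tested MySQL script. It runs the same shape of SQL through Python's built-in sqlite3 module (an assumed stand-in for MySQL), on the sample rows from the question; the function name totals_per_date is hypothetical.

```python
import sqlite3

def totals_per_date():
    """Sum income and expense per date, then join the two totals on the date."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE income  (date TEXT, income  INTEGER);
        CREATE TABLE expense (date TEXT, expense INTEGER);
        INSERT INTO income  VALUES ('09/05/13', 56000),
                                   ('09/05/13', 66600),
                                   ('09/05/13', 50000);
        INSERT INTO expense VALUES ('09/05/13', 68800);
    """)
    rows = conn.execute("""
        SELECT d.date, i.income, e.expense
        FROM (SELECT date FROM income UNION SELECT date FROM expense) AS d
        LEFT JOIN (SELECT date, SUM(income) AS income
                   FROM income GROUP BY date) AS i ON i.date = d.date
        LEFT JOIN (SELECT date, SUM(expense) AS expense
                   FROM expense GROUP BY date) AS e ON e.date = d.date
        ORDER BY d.date
    """).fetchall()
    conn.close()
    return rows

print(totals_per_date())
```

Note what this returns on the sample data: a single row per date holding the totals (172600 income, 68800 expense), not the three individual income rows shown in the question.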

This will give you the output you asked for, but it's REALLY brittle.

SELECT income.date, income.income, expense.expense
FROM (
        -- number the income rows 1, 2, 3, ...
        SELECT i.date, i.income, @curRow := @curRow + 1 AS rn
        FROM income i
        JOIN (SELECT @curRow := 0) r
     ) AS income
JOIN (
        -- number the expense rows the same way
        SELECT e.date, e.expense, @curExpenseRow := @curExpenseRow + 1 AS rn
        FROM expense e
        JOIN (SELECT @curExpenseRow := 0) r
     ) AS expense
ON income.rn = expense.rn  -- alias is rn because ROW_NUMBER is a reserved word in MySQL 8.0+

SQL Fiddle

It assumes that income and expense have the same number of rows, in the same order, because it merges the two results by row number.
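
For comparison, here is a hedged sketch of the same row-number idea written with the ROW_NUMBER() window function instead of user variables (window functions are available in MySQL 8.0 and later). This swaps in a different technique from the query above. It runs on Python's built-in sqlite3 module as an assumed stand-in for MySQL, and it orders by SQLite's implicit rowid only because the sample tables have no ordering column; in a real table you would order by a column of your own. It also uses LEFT JOIN so that income rows without a matching expense row are kept. The function name row_number_join is hypothetical.

```python
import sqlite3

def row_number_join():
    """Pair income and expense rows by position using ROW_NUMBER()."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE income  (date TEXT, income  INTEGER);
        CREATE TABLE expense (date TEXT, expense INTEGER);
        INSERT INTO income  VALUES ('09/05/13', 56000),
                                   ('09/05/13', 66600),
                                   ('09/05/13', 50000);
        INSERT INTO expense VALUES ('09/05/13', 68800);
    """)
    rows = conn.execute("""
        SELECT i.date, i.income, e.expense
        FROM (SELECT date, income,
                     ROW_NUMBER() OVER (ORDER BY rowid) AS rn  -- rowid: SQLite-only ordering
              FROM income) AS i
        LEFT JOIN (SELECT date, expense,
                          ROW_NUMBER() OVER (ORDER BY rowid) AS rn
                   FROM expense) AS e
               ON e.rn = i.rn      -- LEFT JOIN keeps income rows with no expense partner
        ORDER BY i.rn
    """).fetchall()
    conn.close()
    return rows

for row in row_number_join():
    print(row)
```

The pairing is still positional, so it is just as arbitrary as before; and because income is on the left side of the join, any expense rows beyond the number of income rows would be dropped.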

What you really need is some way to join the income and expenses, such as a ledger entry ID on each row. That would give you a definitive join, rather than the hacked join in the query I posted.

If you want individual entries, you need something like this:

select date, income, expense
from
  (
    select
      if(@iId_last = incomeId, null, a.income) as income,
      if(@eId_last = expenseId, null, a.expense) as expense,
      a.date,
      @iId_last := a.incomeId as incomeId,
      @eId_last := a.expenseId as expenseId
    from 
      (select @d := '0000-00-00', @iId_last := -1, @eId_last := -1) as init,
      (
        select 
          d.*, 
          coalesce(i.incomeId,0) as incomeId, income, 
          coalesce(e.expenseId,0) as expenseId, expense
        from
          (
            select distinct date from (select incomeDate as date from income union select expenseDate as date from expense) as d
          ) as d
          left join income as i on d.date = i.incomeDate
          left join expense as e on d.date = e.expenseDate
        order by d.date, incomeId, expenseId
      ) as a
    order by date, incomeId, expenseId
  ) as r;

Notice that this solution requires both the income and expense tables to have an Id column. Check the SQL Fiddle solution (I've tested it with some combinations: many income rows and one expense row, one income row and many expense rows, no income rows and some expense rows, one income row and no expense rows).

If you're not concerned with the particular order in which income values are matched to expense values, you can get your desired output with a query like this:

SELECT date, 
       MAX(CASE WHEN type = 1 THEN amount END) income,
       MAX(CASE WHEN type = 2 THEN amount END) expense
  FROM
(
  SELECT 1 type, date, income amount, @n := IF(@g = date, @n + 1, 1) rnum, @g := date g
    FROM income CROSS JOIN (SELECT @n := 0, @g := NULL) i1
   UNION ALL
  SELECT 2, date, expense amount, @m := IF(@f = date, @m + 1, 1) rnum, @f := date g
    FROM expense CROSS JOIN (SELECT @m := 0, @f := NULL) i2
) q
 GROUP BY date, rnum

Output:

|               DATE | INCOME | EXPENSE |
|--------------------|--------|---------|
| September, 05 2013 |  56000 |   68800 |
| September, 05 2013 |  66600 |  (null) |
| September, 05 2013 |  50000 |  (null) |

Here is a SQLFiddle demo.
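
The same per-date numbering can be sketched with ROW_NUMBER() OVER (PARTITION BY date ...), available in MySQL 8.0 and later, in place of the @n/@g user variables. This is an illustrative variant, not the query above. It runs on Python's built-in sqlite3 module as an assumed stand-in for MySQL and orders by SQLite's implicit rowid because the sample tables have no ordering column. The two '09/06/13' expense rows are made-up data added to show a date with expenses but no income, and the function name pivot_by_row_number is hypothetical.

```python
import sqlite3

def pivot_by_row_number():
    """Number rows per date in each table, then line the two tables up side by side."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE income  (date TEXT, income  INTEGER);
        CREATE TABLE expense (date TEXT, expense INTEGER);
        INSERT INTO income  VALUES ('09/05/13', 56000),
                                   ('09/05/13', 66600),
                                   ('09/05/13', 50000);
        -- the '09/06/13' rows are hypothetical, not from the question
        INSERT INTO expense VALUES ('09/05/13', 68800),
                                   ('09/06/13', 1200),
                                   ('09/06/13', 300);
    """)
    rows = conn.execute("""
        SELECT date,
               MAX(CASE WHEN type = 1 THEN amount END) AS income,
               MAX(CASE WHEN type = 2 THEN amount END) AS expense
        FROM (
            SELECT 1 AS type, date, income AS amount,
                   ROW_NUMBER() OVER (PARTITION BY date ORDER BY rowid) AS rnum
            FROM income
            UNION ALL
            SELECT 2, date, expense,
                   ROW_NUMBER() OVER (PARTITION BY date ORDER BY rowid)
            FROM expense
        ) AS q
        GROUP BY date, rnum      -- one output row per (date, position) pair
        ORDER BY date, rnum
    """).fetchall()
    conn.close()
    return rows

for row in pivot_by_row_number():
    print(row)
```

Because the two tables are stacked with UNION ALL rather than joined, rows survive on whichever side has more entries for a given date, which a one-sided LEFT JOIN would not give you.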
