简体   繁体   中英

Mysql replace column value with other column value

I have 2 tables:

table: transaction:
====================
id  billed_date   amount
 1  2016-09-30      5
 2  2016-10-04      15
 3  2016-10-06      10

table: report_date
====================
transaction_id    report_date
      1            2016-10-01

I want:

  • Create a report which sum all transactions's amount in October 2016
  • Base on report date, not billed date
  • When report date is not set, it's base on billed_date
  • In above example, I want result is 30 (not 25)

Then I write:

The First:

SELECT 
   sum(t.amount), 
   CASE WHEN d.report_date IS NOT NULL THEN d.report_date ELSE t.billed_date END AS new_date
FROM 
   transaction t LEFT JOIN report_date d ON t.id = d.transaction_id
WHERE new_date BETWEEN '2016-10-01' AND '2016-10-30'

The Second:

SELECT sum(amount) FROM 
 (SELECT t.amount,
    CASE WHEN d.report_date IS NOT NULL THEN d.report_date ELSE t.billed_date END AS date
    FROM transaction t LEFT JOIN report_date d ON t.id = d.transaction_id
 ) t
WHERE t.date BETWEEN '2016-10-01' AND '2016-10-30'

Result:

The First:

  • Unknown column 'new_date' in 'where clause'
  • If I replace 'new_date' by 'date': result = 25 (exclude id=1)

The Second:

  • result = 30 => Correct, but in my case, when transaction table have about 30k records, the process is too slow.

Anybody can help me?

I finally find out the solution with the help from my brother:

SELECT sum(amount)
FROM transaction t LEFT JOIN report_date d ON id = transaction_id 
WHERE (report_date BETWEEN '2016-10-01' AND '2016-10-30') OR (report_date IS NULL AND billed_date BETWEEN '2016-10-01' AND '2016-10-30')

Thank you for caring me!

First of all - the part

CASE WHEN d.report_date IS NOT NULL THEN d.report_date ELSE t.billed_date END

can be written shorter as

COALESCE(d.report_date, t.billed_date)

or as

IFNULL(d.report_date, t.billed_date)

In your first query you are using a column alias in the WHERE clause, wich is not allowed. You can fix it by moving the expression behind the alias to the WHERE clause:

SELECT sum(t.amount)
FROM transaction t LEFT JOIN report_date d ON t.id = d.transaction_id
WHERE COALESCE(d.report_date, t.billed_date) BETWEEN '2016-10-01' AND '2016-10-30'

This is almost the same as your own solution.

Your second query is slow because MySQL has to store the subquery result (30K rows) into a temporary table. Trying to optimize it, you will end up with the same solution above.

However if you have indexes on transaction.billed_date and report_date.report_date this query still can not use them. In order to use the indexes, you can split the query into two parts:

Entries with a report (will use report_date.report_date index):

SELECT sum(amount)
FROM transaction t JOIN report_date d ON id = transaction_id 
WHERE d.report_date BETWEEN '2016-10-01' AND '2016-10-30'

Entries without a report (will use transaction.billed_date index):

SELECT sum(amount)
FROM transaction t LEFT JOIN report_date d ON id = transaction_id 
WHERE d.report_date IS NULL AND t.billed_dateBETWEEN '2016-10-01' AND '2016-10-30'

Both queries can use an index. You just need to sum the results, wich can also be done combining the two queries:

SELECT (
    SELECT sum(amount)
    FROM transaction t JOIN report_date d ON id = transaction_id 
    WHERE d.report_date BETWEEN '2016-10-01' AND '2016-10-30'
) + (
    SELECT sum(amount)
    FROM transaction t LEFT JOIN report_date d ON id = transaction_id 
    WHERE d.report_date IS NULL AND t.billed_dateBETWEEN '2016-10-01' AND '2016-10-30'
) AS sum_amount

Is fill table: report_date with absent values from table: transaction: the case?

SELECT id FROM report_date WHERE report_date BETWEEN '2016-10-01' AND '2016-10-30';

INSERT INTO report_date SELECT id, billed_date FROM transaction WHERE billed_date BETWEEN '2016-10-01' AND '2016-10-30' AND id NOT IN (ids_from previous_query);

SELECT sum(t.amount) FROM transaction LEFT JOIN report_date d ON (t.id = d.transaction_id) WHERE d.report_date BETWEEN '2016-10-01' AND '2016-10-30';

Your Second Query is correct,no need to re-write query. But I have one thing to tell you, which will help you a lot when dealing with thousand/millions of records. We have focus on some other things too. Because when your table contains large amount of data(in thousands and millions) of records then it takes time to execute query. It may causes locking also, might be query lock or database gone away kind of issue. To avoid this issue,you just create INDEX of one column. Create INDEX on that column which act/use on where clauses. Like in your case you can create INDEX on billed_date column from transaction table. Because your result is based on transaction table. For more details how to create index in mysql/phpmyadmin you can take reference from this http://www.yourwebskills.com/dbphpmyadmintable.php link.

I had been faced same issue at some point of time then I created INDEX on column. Now I am dealing with millions of records using mysql.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM