
MySQL GROUP BY with MIN - incorrect column data

I have looked here: Selecting all corresponding fields using MAX and GROUP BY and similar pages on SO but I cannot seem to get all my fields to line up properly.

I feel like I'm at the cusp of figuring this out but maybe I'm heading down the wrong path and need to look at this differently.

What I want is the unit with the lowest rent per property name and bedroom count, restricted to properties that have the merge flag set to 1.

My SQL Fiddle: http://sqlfiddle.com/#!2/881c41/2

[Image: query results for all rental units with merge = 1]

The image above was obtained with this query:

SELECT ru.id, run.name, ru.rent, ru.bedrooms
FROM rental_units AS ru
JOIN rental_unit_names AS run
on run.id = ru.name_id
WHERE run.merge = 1
ORDER BY run.name ASC, ru.bedrooms ASC, ru.rent ASC

[Image: query results for rental units with merge = 1, grouped by property name and bedrooms, with the minimum rent]

The image above is the result of this query:

SELECT ru.id, run.name, ru.rent, MIN(ru.rent) AS min_rent, ru.bedrooms
FROM rental_units AS ru
JOIN rental_unit_names AS run
on run.id = ru.name_id
WHERE run.merge = 1
GROUP BY ru.name_id, ru.bedrooms
ORDER BY run.name ASC, ru.bedrooms ASC, ru.rent ASC, ru.id ASC

For the most part everything looks fine until you look at row 4: the rent values do not line up, and the id should be 6, not 5.

The image below is my desired result.

[Image: desired result]

:: EDIT 1 ::

Do I need to create a linking table with 2 columns that has the rental unit id in one column and the rental unit name id in the other column? Or at least do this as a derived table somehow?

In general, unless you're trying to perform some sort of MySQL "magic" you should always group by every non-aggregate, non-constant column in your SELECT list.

In your case, the best approach is to get a list of (name, # bedrooms, minimum rent), and then find all the rows that match these values - in other words, all rows whose (name, # bedrooms, rent) match the list with the minimum rent:

SELECT ru.id, run.name, ru.rent, ru.bedrooms
FROM rental_units ru
JOIN rental_unit_names run ON run.id = ru.name_id
WHERE run.merge = 1
  AND (run.name, ru.bedrooms, ru.rent) IN (
    SELECT inrun.name, inru.bedrooms, MIN(inru.rent)
    FROM rental_units inru
    JOIN rental_unit_names inrun ON inrun.id = inru.name_id
    WHERE inrun.merge = 1
    GROUP BY inrun.name, inru.bedrooms)
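
To try the query above without a MySQL server, here is a minimal sketch that runs it through Python's built-in sqlite3 module. SQLite is only a stand-in here (it accepts the same row-value IN syntax in reasonably recent versions), and the sample rows and property names are made up for illustration; they are not the data from the SQL Fiddle.

```python
import sqlite3

# Hypothetical sample data, NOT the rows from the SQL Fiddle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rental_unit_names (id INTEGER PRIMARY KEY, name TEXT, merge INTEGER);
CREATE TABLE rental_units (id INTEGER PRIMARY KEY, name_id INTEGER,
                           rent INTEGER, bedrooms INTEGER);
INSERT INTO rental_unit_names VALUES (1, 'Oak Court', 1), (2, 'Pine Place', 0);
INSERT INTO rental_units VALUES
  (1, 1,  900, 1),
  (2, 1,  850, 1),
  (3, 1,  850, 1),   -- ties with id 2 for the lowest 1-bedroom rent
  (4, 1, 1200, 2),
  (5, 2,  500, 1);   -- its name has merge = 0, so it is filtered out
""")

# Same query as in the answer, plus an ORDER BY so the output is stable.
rows = conn.execute("""
SELECT ru.id, run.name, ru.rent, ru.bedrooms
FROM rental_units ru
JOIN rental_unit_names run ON run.id = ru.name_id
WHERE run.merge = 1
  AND (run.name, ru.bedrooms, ru.rent) IN (
    SELECT inrun.name, inru.bedrooms, MIN(inru.rent)
    FROM rental_units inru
    JOIN rental_unit_names inrun ON inrun.id = inru.name_id
    WHERE inrun.merge = 1
    GROUP BY inrun.name, inru.bedrooms)
ORDER BY run.name, ru.bedrooms, ru.id
""").fetchall()

for row in rows:
    print(row)
```

With this made-up data, both tied 1-bedroom units (ids 2 and 3) come back, since each matches the minimum rent for its group.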

The query above returns every lowest-rent unit per name/bedrooms. The sample data has ties for the lowest rent in a couple of places. To include only one of the "tied" rows (the one with the lowest rental_units.id), try this instead. The only changes are the MIN(ru.id) on the first line and the addition of an overall GROUP BY on the last line:

SELECT MIN(ru.id) AS ru_id, run.name, ru.rent, ru.bedrooms
FROM rental_units ru
JOIN rental_unit_names run ON run.id = ru.name_id
WHERE run.merge = 1
  AND (run.name, ru.bedrooms, ru.rent) IN (
    SELECT inrun.name, inru.bedrooms, MIN(inru.rent)
    FROM rental_units inru
    JOIN rental_unit_names inrun ON inrun.id = inru.name_id
    WHERE inrun.merge = 1
    GROUP BY inrun.name, inru.bedrooms)
GROUP BY run.name, ru.rent, ru.bedrooms
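
As a rough check of the tie-breaking variant, the sketch below runs it through Python's sqlite3 module. SQLite is used only as a convenient stand-in for MySQL, and the rows and names are hypothetical, chosen so that two units tie for the lowest 1-bedroom rent.

```python
import sqlite3

# Hypothetical sample data, NOT the rows from the SQL Fiddle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rental_unit_names (id INTEGER PRIMARY KEY, name TEXT, merge INTEGER);
CREATE TABLE rental_units (id INTEGER PRIMARY KEY, name_id INTEGER,
                           rent INTEGER, bedrooms INTEGER);
INSERT INTO rental_unit_names VALUES (1, 'Oak Court', 1), (2, 'Pine Place', 0);
INSERT INTO rental_units VALUES
  (1, 1,  900, 1),
  (2, 1,  850, 1),
  (3, 1,  850, 1),   -- ties with id 2 for the lowest 1-bedroom rent
  (4, 1, 1200, 2),
  (5, 2,  500, 1);   -- its name has merge = 0, so it is filtered out
""")

# MIN(ru.id) plus the outer GROUP BY collapses tied rows to the lowest id.
# Every selected column is either aggregated or listed in GROUP BY.
rows = conn.execute("""
SELECT MIN(ru.id) AS ru_id, run.name, ru.rent, ru.bedrooms
FROM rental_units ru
JOIN rental_unit_names run ON run.id = ru.name_id
WHERE run.merge = 1
  AND (run.name, ru.bedrooms, ru.rent) IN (
    SELECT inrun.name, inru.bedrooms, MIN(inru.rent)
    FROM rental_units inru
    JOIN rental_unit_names inrun ON inrun.id = inru.name_id
    WHERE inrun.merge = 1
    GROUP BY inrun.name, inru.bedrooms)
GROUP BY run.name, ru.rent, ru.bedrooms
ORDER BY run.name, ru.bedrooms
""").fetchall()

for row in rows:
    print(row)
```

With this data the tied pair collapses to id 2, leaving one row per name/bedrooms combination.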

That is because the columns not included in the group by come from indeterminate rows. MySQL documentation is very clear on this point:

MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause.

Because I just answered this question on another post, I suggest that you look there.

EDIT:

Here is how you apply the substring_index() / group_concat() method to your query:

SELECT substring_index(group_concat(ru.id order by rent), ',', 1) as id,
       run.name, MIN(ru.rent) AS min_rent, ru.bedrooms
FROM rental_units ru JOIN
     rental_unit_names run
     on run.id = ru.name_id
WHERE run.merge = 1
GROUP BY ru.name_id, ru.bedrooms
ORDER BY run.name ASC, ru.bedrooms ASC, ru.rent ASC, ru.id ASC

SELECT min(ru.id) as id, run.name, ru.rent, ru.rent AS min_rent, ru.bedrooms
FROM rental_units AS ru
JOIN rental_unit_names AS run
on run.id = ru.name_id
WHERE run.merge = 1
and ru.rent = 
(select min(ru1.rent) from rental_units AS ru1
JOIN rental_unit_names AS run1
on run1.id = ru1.name_id
where run.name = run1.name
and ru.bedrooms = ru1.bedrooms
and run1.merge = 1)
group by run.name, ru.rent,min_rent, ru.bedrooms
ORDER BY run.name ASC, ru.bedrooms ASC, ru.rent ASC, ru.id ASC;

Works PERFECT..!!

Your query gives wrong results for the reason explained in the MySQL documentation on GROUP BY extensions.

You can try putting the GROUP BY part in a subquery, then joining back to the same table to get the other hidden columns you need (such as id), and finally joining to the names table to get the property name. Ties are resolved by picking the lowest id in the self-join.

SELECT ro.id, run.name, ro.rent, ro.bedrooms
FROM 
( SELECT name_id, bedrooms, MIN(rent) AS cheapest_rent
  FROM rental_units 
  GROUP BY name_id, bedrooms ) AS ru
JOIN rental_units ro
ON ro.id = ( SELECT ri.id FROM rental_units ri
              WHERE ri.name_id = ru.name_id
              AND ri.bedrooms = ru.bedrooms
              AND ri.rent = ru.cheapest_rent
              ORDER BY ri.name_id, ri.bedrooms, ri.rent, ri.id
              LIMIT 1 )
JOIN rental_unit_names run ON ro.name_id = run.id
WHERE run.merge = 1
ORDER BY run.name ASC, ro.bedrooms ASC, ro.rent ASC

Sqlfiddle here.

Note the slight change in the schema: I added an index on (name_id, bedrooms, rent) to help with both the grouping and the self-join (check the execution plan on Sqlfiddle). Because of how the MySQL optimizer works, the awkward ORDER BY inside the join condition is required for the index to be used for the join. This is fast even on quite a big table. You might also consider adding an index on merge if it is selective enough.
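
For a quick look at the result shape of the derived-table query together with the extra index, here is a minimal sketch using Python's sqlite3 module. It only checks which rows come back. It says nothing about how the MySQL optimizer uses the index, since SQLite plans queries differently. The index name idx_units_name_bed_rent and the sample rows are assumptions made for this example.

```python
import sqlite3

# Hypothetical sample data, NOT the rows from the SQL Fiddle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rental_unit_names (id INTEGER PRIMARY KEY, name TEXT, merge INTEGER);
CREATE TABLE rental_units (id INTEGER PRIMARY KEY, name_id INTEGER,
                           rent INTEGER, bedrooms INTEGER);
-- Composite index from the answer; the index name is made up.
CREATE INDEX idx_units_name_bed_rent ON rental_units (name_id, bedrooms, rent);
INSERT INTO rental_unit_names VALUES (1, 'Oak Court', 1), (2, 'Pine Place', 0);
INSERT INTO rental_units VALUES
  (1, 1,  900, 1),
  (2, 1,  850, 1),
  (3, 1,  850, 1),   -- ties with id 2 for the lowest 1-bedroom rent
  (4, 1, 1200, 2),
  (5, 2,  500, 1);   -- its name has merge = 0, so it is filtered out
""")

# Group first, then pick one matching row (lowest id) per group.
rows = conn.execute("""
SELECT ro.id, run.name, ro.rent, ro.bedrooms
FROM ( SELECT name_id, bedrooms, MIN(rent) AS cheapest_rent
       FROM rental_units
       GROUP BY name_id, bedrooms ) AS ru
JOIN rental_units ro
  ON ro.id = ( SELECT ri.id FROM rental_units ri
               WHERE ri.name_id = ru.name_id
                 AND ri.bedrooms = ru.bedrooms
                 AND ri.rent = ru.cheapest_rent
               ORDER BY ri.name_id, ri.bedrooms, ri.rent, ri.id
               LIMIT 1 )
JOIN rental_unit_names run ON ro.name_id = run.id
WHERE run.merge = 1
ORDER BY run.name ASC, ro.bedrooms ASC, ro.rent ASC
""").fetchall()

for row in rows:
    print(row)
```

With this data the tied 1-bedroom pair resolves to id 2, and the unit whose name has merge = 0 is dropped by the final WHERE clause.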
