Returning corresponding columns of a MAX function in MySQL

I have a table that contains the scores bowled by players in a bowling center. Each row has some data about which player bowled the game, which league it was played in, the date, the score of a single game, the lane number, etc.

What I'm trying to do is find who bowled the best series (three games) on every single lane, along with the league and the date: basically the whole row.

What I have so far is:

SELECT PlayerID, LaneNumber, MAX(Series)
  FROM (SELECT Season, LeagueName, LaneNumber, WeekNumber, PlayerID, Date, SUM(Score) AS Series 
          FROM Scores
      GROUP BY Season, LeagueName, WeekNumber, PlayerID)
GROUP BY LaneNumber

This works in the sense that I get the best series for every single lane, which is what I want, but the other field, PlayerID, isn't correct.

In my table, the best score on lane number 24 (gotten from the SUM(Score) and GROUP BY Season, LeagueName, WeekNumber, PlayerID) is 848 and was played by the player that has PlayerID 36.

What I get instead is lane 24 with 848 (which is correct), but the PlayerID returned is 3166. The same thing happens on every single lane: the PlayerIDs are just plain wrong, and any other columns I add to the outer SELECT are wrong too.

You are violating the semantics of GROUP BY.

When using GROUP BY, it's only meaningful to SELECT columns that you have grouped by (e.g. LaneNumber) and aggregate functions of the other columns (e.g. MAX(Series)). It is not meaningful to select anything else (in your case, PlayerID) because you don't specify which player ID you want among those that share the same LaneNumber.

Sadly, MySQL will by default let you do this without reporting an error (at least before version 5.7.5, where ONLY_FULL_GROUP_BY became part of the default SQL mode), and it will return any value it chooses for the offending column. In your case, this means you are getting back a player ID "randomly" chosen among those that are included in the specified grouping.

You are also doing this in the inner query, where you select LaneNumber and Date without grouping by them.
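One way to make MySQL flag this kind of query instead of silently picking a value is the ONLY_FULL_GROUP_BY SQL mode. The following is only a sketch: it assumes a session where that mode is not already active, and it uses a simplified query on the Scores table from the question.

-- Assumption: this replaces the session's SQL mode with ONLY_FULL_GROUP_BY alone;
-- any other modes that were set stay off until the session ends or is reset.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- PlayerID is neither grouped nor aggregated, so with the mode enabled
-- MySQL rejects this statement rather than returning an arbitrary PlayerID.
SELECT PlayerID, LaneNumber, MAX(Score)
  FROM Scores
 GROUP BY LaneNumber;

Getting an error here is the desired outcome: it points at exactly the columns that need to be either grouped or aggregated.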

Solution

The query needs to be rewritten, but first you need to carefully specify exactly which results you want to get. Do you want the best player and relevant data for each series (and any lane)? For each series and lane separately? The answer to this question will dictate what you need to GROUP BY, and by extension what the query will look like.

Look here: http://dev.mysql.com/doc/refman/5.0/en/example-maximum-column-group-row.html

It might get messy trying to do it all in one query, but basically, you want to generate your series data as you did:

SELECT Season, LeagueName, LaneNumber, WeekNumber, PlayerID, Date, SUM(Score) AS Series 
      FROM Scores
  GROUP BY Season, LeagueName, WeekNumber, PlayerID

Then, instead of taking the max series values from this table directly, you will want to add a clause WHERE Series= followed by a subquery: another SELECT that returns MAX(Series) for the rows where LaneNumber is the same in both tables. I would have coded it for you, but I am not confident enough in my MySQL abilities!

As noted by @Jon, you need to remove the columns that do not apply to a specific person. Then, since @Ord had the closest sample, it would be best to pre-query the results into a separate table (not a temporary one, as MySQL will choke when the second query tries to self-join a temporary table).

So, to me (having been a league bowler some years ago), with your data spanning ALL leagues: there would never be two different leagues on the same lane at the same time. Over a full evening, however, you could have different leagues starting at different times (6-8:30 and 8:45-11, for example), so grouping by the league and date would work. However, you DO need the player as part of the GROUP BY to get each player's SUM() value.

To help clarify the answers, let's assume I have the following data. It represents only a single lane, one week, one season, but two leagues and 3 players per league (for the sole purpose of showing results and limiting content here).

League   Player   Score
L1       1        223
L1       1        218
L1       1        204
L1       2        187
L1       2        201
L1       2        189
L1       3        148
L1       3        152
L1       3        158

L2       4        189
L2       4        195
L2       4        192
L2       5        182
L2       5        199
L2       5        209
L2       6        228
L2       6        234
L2       6        218
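
To try the queries below against this data, it could be loaded roughly as follows. This is a minimal sketch: the column names come from the question, but the column types are assumptions (the table definition is never shown), and season, lane and week are all set to 1 as described above.

-- Hypothetical definition: names are taken from the question, types are guesses.
CREATE TABLE Scores (
    Season     INT,
    LeagueName VARCHAR(20),
    LaneNumber INT,
    WeekNumber INT,
    PlayerID   INT,
    Score      INT
);

-- One row per game: three games per player, three players per league.
INSERT INTO Scores (Season, LeagueName, LaneNumber, WeekNumber, PlayerID, Score) VALUES
    (1, 'L1', 1, 1, 1, 223), (1, 'L1', 1, 1, 1, 218), (1, 'L1', 1, 1, 1, 204),
    (1, 'L1', 1, 1, 2, 187), (1, 'L1', 1, 1, 2, 201), (1, 'L1', 1, 1, 2, 189),
    (1, 'L1', 1, 1, 3, 148), (1, 'L1', 1, 1, 3, 152), (1, 'L1', 1, 1, 3, 158),
    (1, 'L2', 1, 1, 4, 189), (1, 'L2', 1, 1, 4, 195), (1, 'L2', 1, 1, 4, 192),
    (1, 'L2', 1, 1, 5, 182), (1, 'L2', 1, 1, 5, 199), (1, 'L2', 1, 1, 5, 209),
    (1, 'L2', 1, 1, 6, 228), (1, 'L2', 1, 1, 6, 234), (1, 'L2', 1, 1, 6, 218);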


CREATE TABLE SeriesScores2
SELECT 
      Season, 
      LeagueName, 
      LaneNumber, 
      WeekNumber, 
      PlayerID, 
      SUM(Score) AS Series 
   FROM 
      Scores
   GROUP BY 
      Season, 
      LeagueName, 
      LaneNumber, 
      WeekNumber, 
      PlayerID;

The first query (above) creates the series for all players, all weeks, all leagues, etc. Assume I've now added the common season, lane and week too:

Season   League   Lane   Week   Player   Series
1        L1       1      1      1        645
1        L1       1      1      2        577
1        L1       1      1      3        458
1        L2       1      1      4        576
1        L2       1      1      5        590
1        L2       1      1      6        680

This gives us the precursor to determining the MAX(); otherwise we'd have to duplicate the query inside itself and at the outer level, making it more complicated than this pre-aggregation.

Now, with the above permanent table in place (it can be dropped AFTER getting results), query it first (the PreQuery) for the maximum series PER LEAGUE PER LANE. For example, it's common that a men's league will have higher series scores than a women's league, and similarly across age groups. So you get the men's league lane 1 highest series, the women's league lane 1 highest series, etc. The highest score is typically identified by the single best week out of the entire season, not the highest series per lane every week.

Now, the PreQuery (alias "ss") covers just the season, league, lane and maximum series. Once THAT is known, join back to the series scores to pull in WHO bowled that highest series on said lane and in what week it occurred.

select
      ss.season, 
      ss.leaguename, 
      ss.lanenumber, 
      ss.highestSeries, 
      ss2.PlayerID, 
      ss2.WeekNumber
   from
      ( select season, leaguename, lanenumber, max( series ) highestSeries
           from SeriesScores2
           group by season, leaguename, lanenumber ) ss
      join SeriesScores2 ss2
         on ss.Season = ss2.Season
        and ss.LeagueName = ss2.LeagueName
        and ss.LaneNumber = ss2.LaneNumber
        and ss.HighestSeries = ss2.Series

Now, let's break the above query down. If we take the inner "ss" prequery:

  ( select season, leaguename, lanenumber, max( series ) highestSeries
       from SeriesScores2
       group by season, leaguename, lanenumber ) ss

We get the highest series per league (e.g. men's league vs. women's league in the same week, on the same night and the same lane), as shown below. This is found just by MAX(), so we don't have the WHO or the week, only the highest series bowled regardless of week or person. THIS becomes the basis of the JOIN back to the pre-aggregated table "SeriesScores2", and because we now have the highest series score, we can be sure to find the correct person.

Season  League   Lane   HighestSeries
1       L1       1      645
1       L2       1      680

To refresh, here is the pre-aggregation again:
Season   League   Lane   Week   Player   Series
1        L1       1      1      1        645    <-- Join finds THIS entry League 1
1        L1       1      1      2        577
1        L1       1      1      3        458
1        L2       1      1      4        576
1        L2       1      1      5        590
1        L2       1      1      6        680    <-- Join finds THIS entry League 2
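
If creating and later dropping a permanent table is not desirable, the same join can be sketched in a single statement with a common table expression. This assumes MySQL 8.0 or later (earlier versions have no CTE support) and the Scores columns from the question; it is a variation on the approach above, not part of the original answer.

-- Assumes MySQL 8.0+; SeriesScores here is a CTE, not a stored table.
WITH SeriesScores AS (
    SELECT Season, LeagueName, LaneNumber, WeekNumber, PlayerID,
           SUM(Score) AS Series
      FROM Scores
     GROUP BY Season, LeagueName, LaneNumber, WeekNumber, PlayerID
)
SELECT ss.Season, ss.LeagueName, ss.LaneNumber, ss.HighestSeries,
       ss2.PlayerID, ss2.WeekNumber
  FROM (SELECT Season, LeagueName, LaneNumber, MAX(Series) AS HighestSeries
          FROM SeriesScores
         GROUP BY Season, LeagueName, LaneNumber) AS ss
  JOIN SeriesScores AS ss2
    ON ss.Season = ss2.Season
   AND ss.LeagueName = ss2.LeagueName
   AND ss.LaneNumber = ss2.LaneNumber
   AND ss.HighestSeries = ss2.Series;

The CTE is written once and referenced twice, which avoids both the duplicated subquery and the extra table.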

So, my original queries did work; I tested them before posting. I don't know what hiccup you had with yours, unless a column name was incorrect or something similar. As for the "Date" column, I didn't particularly care about it because you had the week number available, which corresponds to the week of bowling and would have a 1:1 relationship to a date anyhow. The date column could have been added to the pre-aggregation SeriesScores2 and pulled along when getting the person's ID and week (unless a league bowls on multiple nights in the same week; THEN you would need the explicit date).

Hope this clarifies your questions / comments.

Okay, attempting to actually write the MySQL code I was thinking of (I couldn't resist...):

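-- A regular table, not TEMPORARY: MySQL cannot refer to a temporary table
-- twice in one query, and the next query reads SeriesScores as both s1 and s2.
CREATE TABLE SeriesScores
SELECT Season, LeagueName, LaneNumber, WeekNumber, PlayerID, SUM(Score) AS Series 
    FROM Scores
    -- LaneNumber is grouped too, so the selected value is well defined
    GROUP BY Season, LeagueName, LaneNumber, WeekNumber, PlayerID;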

This bit just gets the score for each series, much as in your own MySQL code. The main difference is that I am not selecting Date: since we are not grouping by it, its value would be arbitrary. Then:

SELECT PlayerID, LaneNumber, Series
    FROM SeriesScores s1
    WHERE  Series=(SELECT MAX(s2.Series)
          FROM SeriesScores s2
          WHERE s1.LaneNumber = s2.LaneNumber);

This bit just selects what you need from SeriesScores, and only considers rows where the series score is the max for that lane.

Does that work for you?
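
For what it's worth, on MySQL 8.0 or later the same per-lane result can also be sketched with a window function, which needs no helper table at all. This swaps in a different technique from the correlated subquery above, and it assumes the Scores columns named in the question; rnk is just an illustrative alias.

-- Assumes MySQL 8.0+ (window functions).
SELECT Season, LeagueName, LaneNumber, WeekNumber, PlayerID, Series
  FROM (SELECT s.*,
               -- rank each lane's series from highest to lowest
               RANK() OVER (PARTITION BY LaneNumber ORDER BY Series DESC) AS rnk
          FROM (SELECT Season, LeagueName, LaneNumber, WeekNumber, PlayerID,
                       SUM(Score) AS Series
                  FROM Scores
                 GROUP BY Season, LeagueName, LaneNumber, WeekNumber, PlayerID) AS s
       ) AS ranked
 WHERE rnk = 1;

RANK() keeps ties, so two players who share the best series on a lane both appear, which matches what the WHERE Series = (SELECT MAX(...)) version returns.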
