Selecting rank from millions of records

I'm asking on SO because I could not find a similar situation, question, or post.

Assume I have millions of records with these columns:
user_id - assume values from 1 to 1,000,000
name - assume up to 20 alphabetic characters
score - 0 to 100
date - the date the score was recorded (timestamp)

user_id |   name   | score |        date       |
------------------------------------------------
23131   |   name1  |   15  | 2017-01-04 02:01:25
26824   |   name2  |   63  | 2017-01-04 02:41:33
19684   |   name3  |   28  | 2017-01-04 02:56:15
74937   |   name4  |   01  | 2017-01-04 04:07:55
27486   |   name5  |   75  | 2017-01-04 13:07:45
86476   |   name6  |   56  | 2017-01-04 14:21:47
36479   |   name7  |   19  | 2017-01-04 17:11:15
86752   |   name8  |   38  | 2017-01-04 18:22:23
11267   |   name9  |  100  | 2017-01-04 20:34:42
88763   |   name10 |   89  | 2017-01-04 22:45:43
  1. I want to know my own rank, given that I know my user_id.
  2. I also want to get the 10 user records above and the 10 below my rank. Say my rank is 100: I also want to select the users ranked 90 to 99 (above my rank) and 101 to 110 (below my rank).
    If different users have the same score, rank them by recorded date: the earlier record gets the higher rank.

Is it possible?
Assume all records are unique and no indexes are set.

I know how to sort by ranking:

SELECT * FROM record order by score

But this selects all the records. What is a practical way of selecting specific rows without selecting every record?

Here is what I would like to achieve:

user_id |   name   | score |        date          |     rank     |
------------------------------------------------------------------
12341   |   namep  |   90  | 2017-01-01 04:02:36  |      90      |
45341   |   nameo  |   88  | 2017-01-02 00:05:45  |      91      |
24341   |   namex  |   88  | 2017-01-03 00:11:15  |      92      |
26867   |   namec  |   83  | 2017-01-03 01:41:23  |      93      |
19156   |   nameb  |   81  | 2017-01-03 02:36:45  |      94      |
74973   |   namem  |   79  | 2017-01-03 04:07:55  |      95      |
23134   |   namek  |   78  | 2017-01-04 02:01:25  |      96      |
21424   |   namet  |   77  | 2017-01-04 02:41:33  |      97      |
19534   |   nameg  |   77  | 2017-01-04 02:56:15  |      98      |
74912   |   namez  |   75  | 2017-01-04 04:07:55  |      99      |

my_uid  |  my_name |   75  | 2017-01-04 13:07:45  |     100      |

86766   |   namen  |   75  | 2017-01-04 14:21:47  |     101      |
67976   |   namey  |   74  | 2017-01-04 16:22:23  |     102      |
34676   |   nameu  |   74  | 2017-01-04 17:33:32  |     103      |
86236   |   namei  |   73  | 2017-01-04 18:11:09  |     104      |
98636   |   nameo  |   73  | 2017-01-04 19:21:47  |     105      |
14326   |   namep  |   73  | 2017-01-04 20:33:22  |     106      |
45333   |   namet  |   72  | 2017-01-04 20:44:12  |     107      |
33323   |   namer  |   72  | 2017-01-04 21:34:26  |     108      |
11322   |   namee  |   71  | 2017-01-04 22:51:54  |     109      |
86633   |   namew  |   70  | 2017-01-04 22:55:33  |     110      |

OK, so here is what I have so far. Sorry that I did not mention earlier that I cannot use UNION or UNION ALL in my project.

Anyway, here is my query, which I run with the multi_query() function:

$sql  = "SELECT score, date FROM table_name WHERE user_id=your_user_id;"; // assume you already know your user_id
$sql .= "SELECT name, score, date FROM table_name WHERE score >= your_score ORDER BY score, date LIMIT 10;"; // 10 rows with a score greater than or equal to yours; on equal scores the earlier date ranks higher
$sql .= "SELECT name, score, date FROM table_name WHERE score <= your_score ORDER BY score DESC, date ASC LIMIT 10"; // 10 rows with a score less than or equal to yours, ordered by score and date

And I get something like this:

my_uid  |  my_name |   75  | 2017-01-04 13:07:45  |     100      |

12341   |   namep  |   90  | 2017-01-01 04:02:36  |      90      |
45341   |   nameo  |   88  | 2017-01-02 00:05:45  |      91      |
24341   |   namex  |   88  | 2017-01-03 00:11:15  |      92      |
26867   |   namec  |   83  | 2017-01-03 01:41:23  |      93      |
19156   |   nameb  |   81  | 2017-01-03 02:36:45  |      94      |
74973   |   namem  |   79  | 2017-01-03 04:07:55  |      95      |
23134   |   namek  |   78  | 2017-01-04 02:01:25  |      96      |
21424   |   namet  |   77  | 2017-01-04 02:41:33  |      97      |
19534   |   nameg  |   77  | 2017-01-04 02:56:15  |      98      |
74912   |   namez  |   75  | 2017-01-04 04:07:55  |      99      |

74912   |   namez  |   75  | 2017-01-04 04:07:55  |      99      |
my_uid  |  my_name |   75  | 2017-01-04 13:07:45  |     100      |
86766   |   namen  |   75  | 2017-01-04 14:21:47  |     101      |
67976   |   namey  |   74  | 2017-01-04 16:22:23  |     102      |
34676   |   nameu  |   74  | 2017-01-04 17:33:32  |     103      |
86236   |   namei  |   73  | 2017-01-04 18:11:09  |     104      |
98636   |   nameo  |   73  | 2017-01-04 19:21:47  |     105      |
14326   |   namep  |   73  | 2017-01-04 20:33:22  |     106      |
45333   |   namet  |   72  | 2017-01-04 20:44:12  |     107      |
33323   |   namer  |   72  | 2017-01-04 21:34:26  |     108      |

My problem is that a multi-query is still the same as running 3 separate queries, since I have 3 queries. How can I combine them into one, without using UNION or UNION ALL?
And in the 3rd query, how can I set the starting point from my own row?

In MySQL, a view is a bit like a function in other languages: a named query you can reuse.

-- Note: as written, this view exposes no rank column, so the second query assumes one exists.
CREATE VIEW rankall AS SELECT * FROM record ORDER BY score;

-- ID is a placeholder for the user_id you are looking up.
SELECT * FROM rankall
    WHERE rank > (SELECT rank FROM rankall WHERE user_id = ID) - 11
    AND rank < (SELECT rank FROM rankall WHERE user_id = ID) + 11;

That's the basic idea, but there is no guarantee that the code above will work :P
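The view above never defines a rank column, so here is a minimal sketch of one way to supply it, assuming window functions are available (MySQL 8.0+, or SQLite 3.25+ as used here). This swaps in ROW_NUMBER(), which the answer itself does not use. Python's sqlite3 module is only a stand-in harness so the sketch runs as-is; the record table and its columns come from the question, while make_db(), neighbours(), the window parameter, and the column name rnk are hypothetical (rnk is used because RANK is a reserved word in MySQL 8.0).

```python
import sqlite3

# Sample rows copied from the question: (user_id, name, score, date).
ROWS = [
    (23131, "name1", 15, "2017-01-04 02:01:25"),
    (26824, "name2", 63, "2017-01-04 02:41:33"),
    (19684, "name3", 28, "2017-01-04 02:56:15"),
    (74937, "name4", 1, "2017-01-04 04:07:55"),
    (27486, "name5", 75, "2017-01-04 13:07:45"),
    (86476, "name6", 56, "2017-01-04 14:21:47"),
    (36479, "name7", 19, "2017-01-04 17:11:15"),
    (86752, "name8", 38, "2017-01-04 18:22:23"),
    (11267, "name9", 100, "2017-01-04 20:34:42"),
    (88763, "name10", 89, "2017-01-04 22:45:43"),
]


def make_db():
    """Build an in-memory database with the record table and a ranked view."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE record ("
        "user_id INTEGER PRIMARY KEY, name TEXT, score INTEGER, date TEXT)"
    )
    conn.executemany("INSERT INTO record VALUES (?, ?, ?, ?)", ROWS)
    # Rank 1 is the highest score; on equal scores the earlier date wins.
    conn.execute(
        """
        CREATE VIEW rankall AS
        SELECT user_id, name, score, date,
               ROW_NUMBER() OVER (ORDER BY score DESC, date ASC) AS rnk
        FROM record
        """
    )
    return conn


def neighbours(conn, user_id, window=10):
    """Return the user's row plus up to `window` rows on each side, by rank."""
    return conn.execute(
        """
        SELECT user_id, name, score, date, rnk
        FROM rankall
        WHERE rnk BETWEEN (SELECT rnk FROM rankall WHERE user_id = ?) - ?
                      AND (SELECT rnk FROM rankall WHERE user_id = ?) + ?
        ORDER BY rnk
        """,
        (user_id, window, user_id, window),
    ).fetchall()


if __name__ == "__main__":
    for row in neighbours(make_db(), 86476, window=1):
        print(row)
```

Note that the view still ranks every row each time it is queried, so on millions of records this is likely to be slow; it is meant to show the shape of the query rather than a tuned solution.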

Try this:

-- Selects users whose score is within 10 points of the chosen user's score.
-- The inner lookup (score = '100') assumes exactly one user has that score.
SELECT user_id, name, score FROM record
WHERE score BETWEEN
CONVERT((SELECT score FROM record WHERE user_id = (SELECT user_id FROM record WHERE score = '100') ), SIGNED) - 10
AND
CONVERT((SELECT score FROM record WHERE user_id = (SELECT user_id FROM record WHERE score = '100') ), SIGNED) + 10
ORDER BY score DESC, date DESC

This is one possible way to do it, but there are issues that will need work. It gets the records on either side (score-wise) of the chosen user, along with that user's rank, then orders them by score and calculates the ranking. This normally works, but it will break if the chosen user is the top or bottom scorer. That could be fixed, but I'm not sure it would be usable with real data anyway. It should give you some ideas, though.

This uses user id 86476 as the one you are interested in and gets just 1 record on either side, to suit the test data you have given:

SELECT user_id,
        name, 
        score, 
        date,
        def_rank - (@ranking := @ranking -1) AS rank
FROM
(
    SELECT *
    FROM
    (
        (SELECT r1.user_id,
                r1.name, 
                r1.score, 
                r1.date,
                sub0.def_rank
        FROM record r1
        INNER JOIN record r2 ON r2.user_id = 86476
        CROSS JOIN 
        (
            SELECT COUNT(*) def_rank
            FROM record r1
            INNER JOIN record r2 ON r2.user_id = 86476
            WHERE r1.score >= r2.score
        ) sub0
        WHERE r1.score >= r2.score
        ORDER BY score ASC
        LIMIT 2) 
        UNION
        (SELECT r1.user_id,
                r1.name, 
                r1.score, 
                r1.date,
                sub0.def_rank
        FROM record r1
        INNER JOIN record r2 ON r2.user_id = 86476
        CROSS JOIN 
        (
            SELECT COUNT(*) def_rank
            FROM record r1
            INNER JOIN record r2 ON r2.user_id = 86476
            WHERE r1.score >= r2.score
        ) sub0
        WHERE r1.score <= r2.score
        ORDER BY score DESC
        LIMIT 2)
    ) sub97
    ORDER BY score
) sub1
CROSS JOIN 
(
    SELECT @ranking := 2
) sub2

I finally got what I wanted, so I will just answer my own question:

-- your_user_id, your_score and your_date are placeholders; the last two come from the first query.
SELECT score, date FROM rank WHERE uid = your_user_id; -- your score and the date it was recorded
SELECT (COUNT(*) + 1) AS rank FROM rank WHERE score > your_score OR (score = your_score AND date < your_date); -- your rank
SELECT * FROM rank WHERE score > your_score OR (score = your_score AND date < your_date) ORDER BY score ASC, date DESC LIMIT 10; -- the 10 users above my rank; reverse the order in your output
SELECT * FROM rank WHERE score < your_score OR (score = your_score AND date > your_date) ORDER BY score DESC, date ASC LIMIT 10; -- the 10 users below my rank
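As a rough illustration of the four queries above working together, here is a minimal sketch that runs them against a tiny in-memory table. Several things here are assumptions rather than part of the answer: Python's sqlite3 stands in for MySQL, the table is called record with a user_id column (as in the question) instead of rank and uid, the helper names make_db() and rank_with_neighbours() are hypothetical, and the composite index is an addition that may or may not help with the OR condition (worth checking with EXPLAIN on real data). The sample rows are taken from the question's tables, with user 27486 standing in for my_uid.

```python
import sqlite3

# A few rows from the question's tables, chosen to include tied scores.
SAMPLE = [
    (21424, "namet", 77, "2017-01-04 02:41:33"),
    (19534, "nameg", 77, "2017-01-04 02:56:15"),
    (74912, "namez", 75, "2017-01-04 04:07:55"),
    (27486, "name5", 75, "2017-01-04 13:07:45"),
    (86766, "namen", 75, "2017-01-04 14:21:47"),
    (67976, "namey", 74, "2017-01-04 16:22:23"),
]

# Rows ranked higher than (score, date): better score, or same score earlier.
ABOVE = "score > ? OR (score = ? AND date < ?)"
# Rows ranked lower than (score, date): worse score, or same score later.
BELOW = "score < ? OR (score = ? AND date > ?)"


def make_db(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE record ("
        "user_id INTEGER PRIMARY KEY, name TEXT, score INTEGER, date TEXT)"
    )
    conn.executemany("INSERT INTO record VALUES (?, ?, ?, ?)", rows)
    # Assumption: an index on (score, date); the question states none exist.
    conn.execute("CREATE INDEX idx_score_date ON record (score, date)")
    return conn


def rank_with_neighbours(conn, user_id, n=10):
    """Return (rank, rows_above, rows_below), or None for an unknown user."""
    me = conn.execute(
        "SELECT score, date FROM record WHERE user_id = ?", (user_id,)
    ).fetchone()
    if me is None:
        return None
    score, date = me
    key = (score, score, date)
    # Dates are ISO-formatted strings, so text comparison matches time order.
    rank = conn.execute(
        f"SELECT COUNT(*) + 1 FROM record WHERE {ABOVE}", key
    ).fetchone()[0]
    above = conn.execute(
        f"SELECT user_id, name, score, date FROM record WHERE {ABOVE} "
        "ORDER BY score ASC, date DESC LIMIT ?",
        key + (n,),
    ).fetchall()
    above.reverse()  # closest-first from SQL; flip to best-rank-first
    below = conn.execute(
        f"SELECT user_id, name, score, date FROM record WHERE {BELOW} "
        "ORDER BY score DESC, date ASC LIMIT ?",
        key + (n,),
    ).fetchall()
    return rank, above, below


if __name__ == "__main__":
    rank, above, below = rank_with_neighbours(make_db(SAMPLE), 27486, n=2)
    print(rank, above, below)
```

Because the rows above and below are fetched with separate LIMIT queries, a top or bottom scorer simply gets an empty list on that side instead of breaking the query.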

Thank you to the other users who replied :)
