How to compare rows of the same table and apply a weight from another table to each column

I am trying to build a sort of 'personals' matching database in MySQL.

  • I need to compare a specific person in the Members table to all of the others in the same table.

  • Each row (person) in the table has a number of columns with their info (age, location, religion, etc).

  • I need the query to reference a table that holds my 'weights' for each column. In other words, I want to say "location is important at 75, age-range is super important at 100, and religion is not important at 10".

Members Table

+----+-------+----------+----------+-----+----------+
| ID | Name  | Location | Religion | Age | AgeRange |
+----+-------+----------+----------+-----+----------+
| 1  | Joe   | LA       | Atheist  | 40  | 30-40    |
+----+-------+----------+----------+-----+----------+
| 2  | Mary  | LA       | Agnostic | 35  | 35-45    |
+----+-------+----------+----------+-----+----------+
| 3  | Karen | NYC      | Atheist  | 45  | 30-35    |
+----+-------+----------+----------+-----+----------+
| 4  | Lisa  | LA       | Hindu    | 30  | 45-55    |
+----+-------+----------+----------+-----+----------+

Weights Table (how important a parameter is)

+----+-----+----------+----------+
| ID | Age | Location | Religion |
+----+-----+----------+----------+
| 1  | 100 | 75       | 10       |
+----+-----+----------+----------+

I have tried many things over the past 2 days, but the latest query I was playing with was this, which obviously is not of much use. It also doesn't specify the 'person' to whom these records would be compared.

SELECT  a.first_name,
        g.name, 
        a.age* g.age+
        a.location* g.location+
        a.religion * g.mwReligion AS metric
    FROM members a, weight g  
    ORDER BY metric DESC;

My intended output would be like this:

Joe Matches:

Mary - Score = 285
(100 because she's in his AgeRange + 100 because he is in her AgeRange + 75 for Location + 10 for religion)

Lisa - Score = 175 (100 because she is in his AgeRange + 75 for location)

Karen - Score = 10 (Only religion matches)
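The scoring rule implied by this example can be sketched outside the database to check the arithmetic. The Python below is only an illustration: the `members` and `weights` dictionaries, the `in_range` and `score` helpers, and the split of AgeRange into an inclusive `(low, high)` pair are all assumptions made for this sketch. Note that with strict equality on religion, Mary comes out at 275 rather than 285, because the sample data lists her as Agnostic and Joe as an atheist.

```python
# Hypothetical in-memory copy of the question's sample data.
# AgeRange is split into inclusive (low, high) bounds (an assumption).
members = {
    "Joe":   {"location": "LA",  "religion": "Atheist",  "age": 40, "age_range": (30, 40)},
    "Mary":  {"location": "LA",  "religion": "Agnostic", "age": 35, "age_range": (35, 45)},
    "Karen": {"location": "NYC", "religion": "Atheist",  "age": 45, "age_range": (30, 35)},
    "Lisa":  {"location": "LA",  "religion": "Hindu",    "age": 30, "age_range": (45, 55)},
}
weights = {"age": 100, "location": 75, "religion": 10}


def in_range(age, bounds):
    """Return True if age lies within the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= age <= high


def score(person, other):
    """Sum the weights of every criterion on which the two members match."""
    total = 0
    if in_range(other["age"], person["age_range"]):   # other fits person's range
        total += weights["age"]
    if in_range(person["age"], other["age_range"]):   # person fits other's range
        total += weights["age"]
    if person["location"] == other["location"]:
        total += weights["location"]
    if person["religion"] == other["religion"]:
        total += weights["religion"]
    return total


joe = members["Joe"]
scores = {name: score(joe, m) for name, m in members.items() if name != "Joe"}
for name, s in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name} - Score = {s}")
```

The age weight is applied once per direction, which is how the example arrives at 100 + 100 for Mary.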

I assume the age range is stored in two separate columns, min_age and max_age (instead of AgeRange), that the bounds are inclusive, and that both are of INT data type. The query you need should look like:

select 
  x.id,
  x.name,
  x.ma as match_age,
  x.ml as match_location,
  x.mr as match_religion,
  x.ma * w.age + x.ml * w.location + x.mr * w.religion as total_score
from (
  select
    o.id,
    o.name,
    case when o.age between p.min_age and p.max_age then 1 else 0 end as ma,
    case when o.location = p.location then 1 else 0 end as ml,
    case when o.religion = p.religion then 1 else 0 end as mr
  from (select * from members where id = 1) p -- selects Joe
  cross join (select * from members where id <> 1) o -- select other members
) x
cross join weights w
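The query above awards the age weight only when the other member falls inside Joe's range, whereas the question also awards it when Joe falls inside the other member's range. The following is a minimal sketch of that two-way variant, run against the sample data with Python's built-in `sqlite3` module as a stand-in for MySQL. The `mra` column, the `?` parameter for the member id, the `order by`, and the table definitions (including `min_age` and `max_age`) are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (
        id INTEGER PRIMARY KEY, name TEXT, location TEXT, religion TEXT,
        age INTEGER, min_age INTEGER, max_age INTEGER  -- assumed split of AgeRange
    );
    INSERT INTO members VALUES
        (1, 'Joe',   'LA',  'Atheist',  40, 30, 40),
        (2, 'Mary',  'LA',  'Agnostic', 35, 35, 45),
        (3, 'Karen', 'NYC', 'Atheist',  45, 30, 35),
        (4, 'Lisa',  'LA',  'Hindu',    30, 45, 55);
    CREATE TABLE weights (
        id INTEGER PRIMARY KEY, age INTEGER, location INTEGER, religion INTEGER
    );
    INSERT INTO weights VALUES (1, 100, 75, 10);
""")

QUERY = """
select
  x.name,
  (x.ma + x.mra) * w.age + x.ml * w.location + x.mr * w.religion as total_score
from (
  select
    o.name,
    -- other member's age inside the chosen person's range
    case when o.age between p.min_age and p.max_age then 1 else 0 end as ma,
    -- chosen person's age inside the other member's range (added here)
    case when p.age between o.min_age and o.max_age then 1 else 0 end as mra,
    case when o.location = p.location then 1 else 0 end as ml,
    case when o.religion = p.religion then 1 else 0 end as mr
  from members p
  cross join members o
  where p.id = ? and o.id <> p.id
) x
cross join weights w
order by total_score desc
"""

rows = conn.execute(QUERY, (1,)).fetchall()  # 1 = Joe
for name, total in rows:
    print(name, total)
```

Passing a different id as the parameter scores the others against that member instead of Joe.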

In MySQL, Boolean expressions become 0 or 1 in a numeric context. So you can use your comparisons as factors.
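As a small illustration of comparisons acting as 0/1 factors, the sketch below uses Python's built-in `sqlite3` module, which evaluates comparisons to integers in the same way. It is a stand-in, not MySQL itself, and the literal values are made up for the demonstration. It also shows one caveat worth keeping in mind: a comparison involving NULL yields NULL, so the product is NULL as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A true comparison yields 1, a false one yields 0, so multiplying by a
# weight either keeps the weight or drops it from the sum.
match_val, miss_val, weighted = conn.execute(
    "SELECT ('LA' = 'LA'), ('LA' = 'NYC'), ('LA' = 'LA') * 75 + ('LA' = 'NYC') * 10"
).fetchone()
print(match_val, miss_val, weighted)

# Comparing against NULL yields NULL, and NULL * 75 is still NULL.
null_case = conn.execute("SELECT (NULL = 'LA') * 75").fetchone()[0]
print(null_case)
```

If any of the compared columns can be NULL, wrapping each comparison in `COALESCE(..., 0)` is one way to keep a single missing value from turning the whole score into NULL.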

So self join the members, with the id of one being lower than that of the other (otherwise, i.e. when just checking for inequality, you would get each pair twice in the result). Then cross join the weights.

Now you can build your metric as the sum of the products of the comparisons and the weights.

I assume religion and location are compared by equality. The age of one person is compared to the age range of the other, and vice versa. Furthermore, I take the age range to be split into a lower and an upper bound and assume that the range's bounds are inclusive. Then this could look like the following:

SELECT m1.name,
       m2.name,
       (m1.age BETWEEN m2.agerangelower
                       AND m2.agerangeupper) * w1.age
       +
       (m2.age BETWEEN m1.agerangelower
                       AND m1.agerangeupper) * w1.age
       +
       (m1.location = m2.location) * w1.location
       +
       (m1.religion = m2.religion) * w1.religion metric
       FROM members m1
            INNER JOIN members m2
                       ON m1.id < m2.id
            CROSS JOIN weights w1;
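To see what this pairwise query returns for the sample data, here is a sketch that runs it through Python's built-in `sqlite3` module as a stand-in for MySQL. The table definitions, the `agerangelower` and `agerangeupper` values derived from the AgeRange column, and the added `ORDER BY` (for a stable row order) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (
        id INTEGER PRIMARY KEY, name TEXT, location TEXT, religion TEXT,
        age INTEGER, agerangelower INTEGER, agerangeupper INTEGER
    );
    INSERT INTO members VALUES
        (1, 'Joe',   'LA',  'Atheist',  40, 30, 40),
        (2, 'Mary',  'LA',  'Agnostic', 35, 35, 45),
        (3, 'Karen', 'NYC', 'Atheist',  45, 30, 35),
        (4, 'Lisa',  'LA',  'Hindu',    30, 45, 55);
    CREATE TABLE weights (
        id INTEGER PRIMARY KEY, age INTEGER, location INTEGER, religion INTEGER
    );
    INSERT INTO weights VALUES (1, 100, 75, 10);
""")

PAIR_QUERY = """
SELECT m1.name,
       m2.name,
       (m1.age BETWEEN m2.agerangelower AND m2.agerangeupper) * w1.age
       + (m2.age BETWEEN m1.agerangelower AND m1.agerangeupper) * w1.age
       + (m1.location = m2.location) * w1.location
       + (m1.religion = m2.religion) * w1.religion metric
  FROM members m1
       INNER JOIN members m2
               ON m1.id < m2.id   -- each unordered pair appears once
       CROSS JOIN weights w1
 ORDER BY m1.id, m2.id            -- added only to make the output order stable
"""

pairs = conn.execute(PAIR_QUERY).fetchall()
for first, second, metric in pairs:
    print(f"{first} / {second}: {metric}")
```

Because of the `m1.id < m2.id` condition, a given person can appear on either side of a pair. To list matches for one member only, a filter such as `WHERE 2 IN (m1.id, m2.id)` would keep every pair involving member 2.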
