Invalid result in Colley ranking algorithm implementation in Python
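I implemented the Colley ranking algorithm for sports matches, following this article: https://towardsdatascience.com/generate-sports-rankings-with-data-science-4dd1979571da
However, I am getting invalid results. The result r is supposed to be a win probability, so it must lie between 0 and 1. With a large input, though, I get values as high as 1.4 as well as negative values.
Is this a known problem with the Colley algorithm? Is there a fix, or an alternative algorithm that handles large amounts of data correctly?
My code: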
import json
import numpy as np

c = None  # Colley matrix, allocated in main()
b = None  # right-hand-side vector, allocated in main()


def iter_game(t1, t2, r1, r2):
    global c, b
    # Update vector b based on the result of each game (team ids are 1-based)
    if r1 > r2:
        b[t1 - 1] += 1
        b[t2 - 1] -= 1
    elif r1 < r2:
        b[t1 - 1] -= 1
        b[t2 - 1] += 1
    else:
        return  # a tie changes neither b nor c
    c[t1 - 1][t1 - 1] += 1  # Update diagonal element
    c[t2 - 1][t2 - 1] += 1  # Update diagonal element
    c[t1 - 1][t2 - 1] -= 1  # Update off-diagonal element
    c[t2 - 1][t1 - 1] -= 1  # Update off-diagonal element


def main():
    global c, b
    num_players = 2537
    # Initialize Colley matrix 'c' and vector 'b'
    c = np.zeros([num_players, num_players])
    b = np.zeros(num_players)
    with open("colley_games.json") as f:
        games = json.load(f)
    for game in games:
        iter_game(game["home"], game["away"], game["score_home"], game["score_away"])
    # Add 2 to the diagonal elements (total number of games) of the Colley matrix
    diag = c.diagonal() + 2
    np.fill_diagonal(c, diag)
    # Divide vector b by 2 and add one
    for i, value in enumerate(b):
        b[i] = b[i] / 2
        b[i] += 1
    # Solve the N-variable linear system
    r = np.linalg.solve(c, b)
    # Display ranking for the top 10 teams
    top_teams = r.argsort()[-10:][::-1]
    for i in top_teams:
        print(str(r[i]) + " " + str(i))
    print("----------------------------")
    # Display ranking for the bottom 10 teams
    bottom_teams = r.argsort()[:10][::-1]
    for i in bottom_teams:
        print(str(r[i]) + " " + str(i))


main()
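For comparison, here is a minimal sketch of the same construction on a schedule small enough to check by hand. The helper name `colley_ratings` and the `(winner, loser)` input format are assumptions made for this illustration; they are not part of the script above, and team indices here are 0-based.

```python
import numpy as np


def colley_ratings(num_teams, games):
    """Hypothetical helper: games is a list of (winner, loser) 0-based indices."""
    C = 2.0 * np.eye(num_teams)   # diagonal starts at 2
    b = np.ones(num_teams)        # right-hand side starts at 1
    for winner, loser in games:
        C[winner, winner] += 1    # one more game played by the winner
        C[loser, loser] += 1      # one more game played by the loser
        C[winner, loser] -= 1     # one more meeting between the two teams
        C[loser, winner] -= 1
        b[winner] += 0.5          # b_i = 1 + (wins_i - losses_i) / 2
        b[loser] -= 0.5
    return np.linalg.solve(C, b)


# Team 0 beats teams 1 and 2, team 1 beats team 2.
ratings = colley_ratings(3, [(0, 1), (1, 2), (0, 2)])
print(ratings)  # approximately [0.7, 0.5, 0.3]
```

On this input the ratings sum to half the number of teams, which can serve as a quick sanity check when comparing against the larger script.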
Output when run with the colley_games.json input:
# python3 rank_colley_ask.py
1.409508465374069 2135
1.1358580974322448 1759
1.134486271801534 2126
1.1314563266569193 1763
1.0930304236523831 2134
1.0809741214865278 1243
1.0633760655215825 2143
1.049467222041803 1748
1.0391031285438894 1470
1.0288821673935697 1453
----------------------------
-0.1304799893162797 1954
-0.15012844703440156 1929
-0.19462901224272772 2121
-0.20745023863077341 1930
-0.21188300405221577 890
-0.24910253479694192 968
-0.25265547797693333 2155
-0.34306068196974493 930
-0.3468485876254179 913
-0.3792348796324475 2151
Here is the fiddle: https://pyfiddle.io/fiddle/ffc0e2d7-3d47-4ee9-b9f8-9c28e6e3b500/?i=true (I had to gzip the input because the maximum upload size is 1 MB)
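One way to probe the question without the large input file is a small synthetic schedule. The sketch below assumes a hypothetical "chain" of five teams in which each team beats the next one in all of their 10 meetings; the function name `chain_colley` and this schedule are made up for illustration. It only suggests that the linear system itself can return values outside [0, 1] on a lopsided schedule; it does not show whether the script above has a separate bug.

```python
import numpy as np


def chain_colley(num_teams, games_per_link):
    """Hypothetical schedule: team i beats team i + 1 in every one of their meetings."""
    k = games_per_link
    wins = np.zeros(num_teams)
    losses = np.zeros(num_teams)
    C = 2.0 * np.eye(num_teams)
    for i in range(num_teams - 1):
        wins[i] += k
        losses[i + 1] += k
        C[i, i] += k            # games played by team i
        C[i + 1, i + 1] += k    # games played by team i + 1
        C[i, i + 1] -= k        # meetings between the two teams
        C[i + 1, i] -= k
    b = 1.0 + (wins - losses) / 2.0
    return np.linalg.solve(C, b)


r = chain_colley(5, 10)
print(r)  # approximately [1.171, 0.805, 0.5, 0.195, -0.171]
```

Here the top team never meets the weaker teams directly, yet it ends up above 1 and the bottom team below 0, while the ratings still average 0.5.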