How to create a list of N items with a budget constraint and multiple conditions in Python
I have the following df of Premier League players (ROI_top_players):
player team position cost_2223 total_points ROI
0 Mohamed Salah Liverpool FWD 13.0 259 29.77
1 Trent Alexander Liverpool DEF 8.4 206 24.52
2 Jarrod Bowen West Ham MID 8.5 204 23.56
3 Kevin De Bruyne Man City MID 12.0 190 15.70
4 Virgil van Dijk Liverpool DEF 6.5 183 14.91
... ... ... ... ... ... ... ... ... ...
151 Jamaal Lascelles Newcastle DEF 4.5 45 10.22
152 Ben Godfrey Everton GKP 4.5 45 9.57
153 Aaron Wan-Bissaka Man Utd DEF 4.5 41 8.03
154 Brandon Williams Norwich DEF 4.0 36 7.23
I want to create a list of 15 players (must be 15 - not more, not less), with the highest ROI possible, and it has to fulfill certain conditions:
Here's my current code:
def get_ideal_team_ROI(budget = 100, star_player_limit = 3, gk = 2, df = 5, md = 5, fwd = 3):
    money_team = []
    budget = budget
    positions = {'GK': gk, 'DEF': df, 'MID': md, 'FWD': fwd}
    for index, row in ROI_top_players.iterrows():
        if (budget >= row['cost_2223'] and positions[row['position']] > 0):
            money_team.append(row['player'])
            budget -= row['cost_2223']
            positions[row['position']] = positions[row['position']] - 1
    return money_team
This code has two problems:
How should I tackle this? I want my code to make sure that I always have enough budget to buy 15 players and that I always have at most 3 players per team.
**I do not need all possible combinations.** Just ONE team with the highest possible ROI.
This type of question pops up now and again, and it has been solved a couple of times. For example, you have the following (not a duplicate per se, but close enough to get inspired from): Python PuLP Optimization problem for Fantasy Football, how to add Certain Conditional Constraints? And you have this Medium article: https://medium.com/ml-everything/using-python-and-linear-programming-to-optimize-fantasy-football-picks-dc9d1229db81 And this: https://towardsdatascience.com/how-to-build-a-fantasy-premier-league-team-with-data-science-f01283281236
Now, mostly in an attempt to understand it myself, I will attempt a solution (this just builds on the solutions above, nothing uniquely brilliant here). As OP did not provide the data, I went and scraped the first 'Fantasy Football players list' I could find. There is no ROI in that data, but there are 'Points', which we will try to maximize, so I guess OP can apply the same approach to maximize the ROI in their data.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd
from pulp import LpMaximize, LpProblem, LpVariable, lpSum
## get some data approximating OP's data
chrome_options = Options()
chrome_options.add_argument("--no-sandbox")
webdriver_service = Service("chromedriver/chromedriver") ## path to where you saved chromedriver binary
browser = webdriver.Chrome(service=webdriver_service, options=chrome_options)
big_df = pd.DataFrame()
url = 'https://fantasy.premierleague.com/player-list/'
browser.get(url)
try:
    WebDriverWait(browser, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[text()='Accept All Cookies']"))).click()
    print('cookies accepted')
except Exception as e:
    print('no cookies for you!')
tables_divs = WebDriverWait(browser, 20).until(EC.presence_of_all_elements_located((By.XPATH, "//table/parent::div/parent::div")))
for t in tables_divs:
    category = t.find_element(By.TAG_NAME, 'h3')
    print(category.text)
    WebDriverWait(t, 20).until(EC.presence_of_all_elements_located((By.XPATH, "//table")))
    dfs = pd.read_html(t.get_attribute('outerHTML'))
    for df in dfs:
        df['Type'] = category.text
        big_df = pd.concat([big_df, df], axis=0, ignore_index=True)
big_df.to_json('f_footie.json')
browser.quit()
footie_df = pd.read_json('f_footie.json')
footie_df.columns = ['Player', 'Team', 'Points', 'Cost', 'Position']
footie_df['Player'] = footie_df.apply( lambda row: row.Player.replace(' ', '_').strip(), axis=1)
footie_df['Cost'] = footie_df.apply( lambda row: row.Cost.split('£')[1], axis=1)
footie_df['Cost'] = footie_df['Cost'].astype('float')
footie_df['Points'] = footie_df['Points'].astype('int')
print(footie_df)
## constraining variables
positions = footie_df.Position.unique()
clubs = footie_df.Team.unique()
budget = 100
available_roles = {
    'Goalkeepers': 2,
    'Defenders': 5,
    'Midfielders': 5,
    'Forwards': 3
}
names = [footie_df.Player[i] for i in footie_df.index]
teams = [footie_df.Team[i] for i in footie_df.index]
roles = [footie_df.Position[i] for i in footie_df.index]
costs = [footie_df.Cost[i] for i in footie_df.index]
points = [footie_df.Points[i] for i in footie_df.index]
players = [LpVariable("player_" + str(i), cat="Binary") for i in footie_df.index]
prob = LpProblem("Secret_Fantasy_Player_Choices", LpMaximize)  ## PuLP warns about spaces in problem names
## define the objective -> maximize the points
prob += lpSum(players[i] * points[i] for i in range(len(footie_df)))
## define budget constraint
prob += lpSum(players[i] * footie_df.Cost[footie_df.index[i]] for i in range(len(footie_df))) <= budget
## fill every role exactly (2 + 5 + 5 + 3 = 15 players), since OP needs exactly 15
for pos in positions:
    prob += lpSum(players[i] for i in range(len(footie_df)) if roles[i] == pos) == available_roles[pos]
## add max 3 per team constraint
for club in clubs:
    prob += lpSum(players[i] for i in range(len(footie_df)) if teams[i] == club) <= 3
prob.solve()
df_list = []
for variable in prob.variables():
    if variable.varValue != 0:
        idx = int(variable.name.split("_")[1])  ## row index encoded in the variable name
        name = footie_df.Player[idx]
        club = footie_df.Team[idx]
        role = footie_df.Position[idx]
        player_points = footie_df.Points[idx]
        cost = footie_df.Cost[idx]
        df_list.append((name, club, role, player_points, cost))
result_df = pd.DataFrame(df_list, columns = ['Name', 'Club', 'Role', 'Points', 'Cost'])
result_df.to_csv('win_at_fantasy_football.csv')
print(result_df)
This will display some control printouts, the data scraped, the long printout from pulp solver, and the result dataframe in the end, looking like this:
| | Name | Club | Role | Points | Cost |
|---|---|---|---|---|---|
| 0 | Alisson | Liverpool | Goalkeepers | 176 | 5.5 |
| 1 | Lloris | Spurs | Goalkeepers | 158 | 5.5 |
| 2 | Bowen | West Ham | Midfielders | 206 | 8.5 |
| 3 | Saka | Arsenal | Midfielders | 179 | 8 |
| 4 | Maddison | Leicester | Midfielders | 181 | 8 |
| 5 | Ward-Prowse | Southampton | Midfielders | 159 | 6.5 |
| 6 | Gallagher | Chelsea | Midfielders | 140 | 6 |
| 7 | Antonio | West Ham | Forwards | 140 | 7.5 |
| 8 | Toney | Brentford | Forwards | 139 | 7 |
| 9 | Mbeumo | Brentford | Forwards | 119 | 6 |
| 10 | Alexander-Arnold | Liverpool | Defenders | 208 | 7.5 |
| 11 | Robertson | Liverpool | Defenders | 186 | 7 |
| 12 | Cancelo | Man City | Defenders | 201 | 7 |
| 13 | Gabriel | Arsenal | Defenders | 146 | 5 |
| 14 | Cash | Aston Villa | Defenders | 147 | 5 |
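
As a sanity check, the squad in the table can be tested against the constraints by hand. The snippet below is only a sketch: the rows are copied from the table, while `check_squad` and its parameter names are made up for illustration and are not part of the solution code.

```python
from collections import Counter

# (name, club, role, points, cost) rows copied from the result table
squad = [
    ("Alisson", "Liverpool", "Goalkeepers", 176, 5.5),
    ("Lloris", "Spurs", "Goalkeepers", 158, 5.5),
    ("Bowen", "West Ham", "Midfielders", 206, 8.5),
    ("Saka", "Arsenal", "Midfielders", 179, 8.0),
    ("Maddison", "Leicester", "Midfielders", 181, 8.0),
    ("Ward-Prowse", "Southampton", "Midfielders", 159, 6.5),
    ("Gallagher", "Chelsea", "Midfielders", 140, 6.0),
    ("Antonio", "West Ham", "Forwards", 140, 7.5),
    ("Toney", "Brentford", "Forwards", 139, 7.0),
    ("Mbeumo", "Brentford", "Forwards", 119, 6.0),
    ("Alexander-Arnold", "Liverpool", "Defenders", 208, 7.5),
    ("Robertson", "Liverpool", "Defenders", 186, 7.0),
    ("Cancelo", "Man City", "Defenders", 201, 7.0),
    ("Gabriel", "Arsenal", "Defenders", 146, 5.0),
    ("Cash", "Aston Villa", "Defenders", 147, 5.0),
]


def check_squad(rows, budget=100.0, quotas=None, max_per_club=3):
    """Return a list of violated constraints (an empty list means the squad is valid)."""
    quotas = quotas or {"Goalkeepers": 2, "Defenders": 5, "Midfielders": 5, "Forwards": 3}
    problems = []

    # Squad size must match the sum of the role quotas (15 with the defaults)
    if len(rows) != sum(quotas.values()):
        problems.append(f"squad has {len(rows)} players, expected {sum(quotas.values())}")

    # Budget constraint
    total_cost = sum(cost for _, _, _, _, cost in rows)
    if total_cost > budget:
        problems.append(f"cost {total_cost} exceeds budget {budget}")

    # Exact number of players per role
    role_counts = Counter(role for _, _, role, _, _ in rows)
    for role, wanted in quotas.items():
        if role_counts.get(role, 0) != wanted:
            problems.append(f"{role}: {role_counts.get(role, 0)} picked, {wanted} required")

    # At most max_per_club players from the same club
    for club, count in Counter(club for _, club, _, _, _ in rows).items():
        if count > max_per_club:
            problems.append(f"{club}: {count} players, limit is {max_per_club}")

    return problems


problems = check_squad(squad)
total_cost = sum(row[4] for row in squad)
total_points = sum(row[3] for row in squad)
print(problems, total_cost, total_points)
```

For this table the list of problems comes back empty: the squad costs exactly 100.0, scores 2485 points in total, and no club (Liverpool is the largest with 3) exceeds the limit.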
In case there are better solutions, please, whoever is more seasoned in optimisations problems, chime in with critique.
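
For completeness, here is a rough sketch of how the same model might look on OP's own dataframe. It rests on several assumptions: the column names (`player`, `team`, `position`, `cost_2223`, `ROI`) and the position codes (`GKP`, `DEF`, `MID`, `FWD`) are taken from the sample shown in the question, "highest ROI possible" is interpreted as the highest sum of per-player ROI, and the function name `pick_team` and its parameters are invented for illustration. The role quotas use `==` so the squad has exactly the requested size, and the solver used is the CBC solver that ships with PuLP.

```python
import pandas as pd
from pulp import PULP_CBC_CMD, LpMaximize, LpProblem, LpStatus, LpVariable, lpSum


def pick_team(df, budget=100.0, quotas=None, max_per_team=3):
    """Return the rows of df with the highest total ROI, or None if no squad fits."""
    # Assumed position codes, as they appear in the sample in the question
    quotas = quotas or {"GKP": 2, "DEF": 5, "MID": 5, "FWD": 3}
    df = df.reset_index(drop=True)

    # Plain Python lists keep the PuLP expressions free of numpy types
    roi = df["ROI"].astype(float).tolist()
    cost = df["cost_2223"].astype(float).tolist()
    position = df["position"].tolist()
    team = df["team"].tolist()
    n = len(df)

    # One binary decision variable per player: 1 = picked, 0 = not picked
    pick = [LpVariable(f"pick_{i}", cat="Binary") for i in range(n)]

    prob = LpProblem("best_roi_squad", LpMaximize)
    prob += lpSum(pick[i] * roi[i] for i in range(n))              # objective: total ROI
    prob += lpSum(pick[i] * cost[i] for i in range(n)) <= budget   # budget constraint

    # Exact quota per position, so the squad size is fixed
    for pos, wanted in quotas.items():
        prob += lpSum(pick[i] for i in range(n) if position[i] == pos) == wanted

    # At most max_per_team players from the same team
    for club in set(team):
        prob += lpSum(pick[i] for i in range(n) if team[i] == club) <= max_per_team

    prob.solve(PULP_CBC_CMD(msg=False))
    if LpStatus[prob.status] != "Optimal":
        return None

    chosen = [i for i in range(n) if pick[i].value() is not None and pick[i].value() > 0.5]
    return df.loc[chosen]


# Tiny made-up pool: the two highest-ROI players (A and C) cost 12 together,
# so with a budget of 10 the best valid pair is B and C
toy = pd.DataFrame({
    "player": ["A", "B", "C", "D"],
    "team": ["X", "X", "Y", "Y"],
    "position": ["GKP", "GKP", "DEF", "DEF"],
    "cost_2223": [6.0, 4.0, 6.0, 4.0],
    "ROI": [10.0, 8.0, 9.0, 5.0],
})
team_df = pick_team(toy, budget=10.0, quotas={"GKP": 1, "DEF": 1})
print(team_df)
```

With OP's data the call would presumably be `pick_team(ROI_top_players)`. Unlike the greedy loop in the question, the solver weighs all players at once, so it does not spend the budget on the first expensive players and then run out of money before the squad is complete.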