How to create a list of N items with a budget constraint and multiple conditions in Python

I have the following df of Premier League players (ROI_top_players):

    player                 team         position    cost_2223   total_points  ROI   
0   Mohamed Salah          Liverpool    FWD         13.0        259           29.77 
1   Trent Alexander        Liverpool    DEF         8.4         206           24.52 
2   Jarrod Bowen           West Ham     MID         8.5         204           23.56
3   Kevin De Bruyne        Man City     MID         12.0        190           15.70
4   Virgil van Dijk        Liverpool    DEF         6.5         183           14.91 
... ... ... ... ... ... ... ... ... ...
151 Jamaal Lascelles       Newcastle    DEF         4.5         45            10.22
152 Ben Godfrey            Everton      GKP         4.5         45            9.57  
153 Aaron Wan-Bissaka      Man Utd      DEF         4.5         41            8.03
154 Brandon Williams       Norwich      DEF         4.0         36            7.23  

I want to create a list of 15 players (it must be exactly 15, no more and no fewer) with the highest ROI possible, and it has to fulfill certain conditions:

  • Position constraints: it must have 2 GKP, 5 DEF, 5 MID, and 3 FWD.
  • Budget constraint: I have a budget of $100, so for each player I add to the list, I must subtract the player's cost (cost_2223) from the budget.
  • Team constraint: it can't have more than 3 players per club.

Here's my current code:

def get_ideal_team_ROI(budget = 100, star_player_limit = 3, gk = 2, df = 5, md = 5, fwd = 3):
    money_team = []
    budget = budget
    positions = {'GK': gk, 'DEF': df, 'MID': md, 'FWD': fwd}
    for index, row in ROI_top_players.iterrows():
       if (budget >= row['cost_2223'] and positions[row['position']] > 0):
           money_team.append(row['player'])
           budget -= row['cost_2223']
           positions[row['position']] = positions[row['position']] - 1
    return money_team

This code has two problems:

  1. It creates the list, BUT the list does not end up with 15 players.
  2. It doesn't fulfill the team constraint (I end up with more than 3 players per team).

How should I tackle this? I want my code to make sure that I always have enough budget to buy 15 players and that I always have at most 3 players per team.

**I do not need all possible combinations.** Just ONE team with the highest possible ROI.
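
The first problem is typical of a single greedy pass: taking the best-ROI players first can use up the budget before every slot is filled. The sketch below illustrates this with a deliberately simplified version of the loop (no position counters) and a made-up three-player pool; `greedy_pick`, `toy_pool` and all the numbers are hypothetical.

```python
def greedy_pick(players, budget, squad_size):
    """Take players in the given order for as long as they still fit the budget."""
    picked = []
    for name, cost in players:
        if len(picked) < squad_size and cost <= budget:
            picked.append(name)
            budget -= cost
    return picked


# Hypothetical pool, already sorted by ROI (best first); costs are made up.
toy_pool = [("A", 9.0), ("B", 5.0), ("C", 5.0)]

squad = greedy_pick(toy_pool, budget=10.0, squad_size=2)
print(squad)  # only "A" fits: after paying 9.0, neither "B" nor "C" is affordable
```

Here a full squad of two does exist ("B" and "C" cost exactly 10.0 together), but the greedy pass never finds it because it commits to "A" first. Guaranteeing a complete squad requires looking at all constraints at once, which is what an optimisation model does.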

This type of question pops up now and again, and it has been solved a couple of times. For example, there is the following question (not a duplicate per se, but close enough to take inspiration from): "Python PuLP Optimization problem for Fantasy Football, how to add Certain Conditional Constraints?" There is also this Medium article: https://medium.com/ml-everything/using-python-and-linear-programming-to-optimize-fantasy-football-picks-dc9d1229db81 and this one: https://towardsdatascience.com/how-to-build-a-fantasy-premier-league-team-with-data-science-f01283281236

Now, mostly in an attempt to understand it myself, I will attempt a solution (it simply builds on the solutions above, nothing uniquely brilliant here). As OP did not provide the data, I went and scraped the first 'Fantasy Football players list' I could find. There is no ROI in that data, but there are 'Points', which we will try to maximize, so I guess OP can apply the same approach to maximize the ROI in their data.
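
For reference, the problem can be written as a 0-1 integer program. The notation is introduced here only for illustration and does not appear in the question: x_i is 1 if player i is picked and 0 otherwise, p_i is the player's points (or ROI), c_i the cost, P_r the set of players in position r, T_k the set of players at club k, and q_r the number of players required in position r.

```latex
\begin{aligned}
\max_{x}\quad & \sum_{i=1}^{n} p_i x_i \\
\text{s.t.}\quad
  & \sum_{i=1}^{n} c_i x_i \le 100, \\
  & \sum_{i \in P_r} x_i = q_r
    \quad \text{for } r \in \{\text{GKP}, \text{DEF}, \text{MID}, \text{FWD}\},
    \qquad (q_{\text{GKP}}, q_{\text{DEF}}, q_{\text{MID}}, q_{\text{FWD}}) = (2, 5, 5, 3), \\
  & \sum_{i \in T_k} x_i \le 3 \quad \text{for every club } k, \\
  & x_i \in \{0, 1\}, \quad i = 1, \dots, n.
\end{aligned}
```

The position quotas are written as equalities because the question asks for exactly 15 players. Using an upper bound per position instead gives the same squad whenever the optimum happens to fill all 15 slots.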

from io import StringIO

import pandas as pd
from pulp import LpMaximize, LpProblem, LpStatus, LpVariable, lpSum
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

## get some data approximating OP's data
chrome_options = Options()
chrome_options.add_argument("--no-sandbox")

webdriver_service = Service("chromedriver/chromedriver") ## path to where you saved chromedriver binary
browser = webdriver.Chrome(service=webdriver_service, options=chrome_options)

url = 'https://fantasy.premierleague.com/player-list/'
browser.get(url)
try:
    WebDriverWait(browser, 20).until(EC.element_to_be_clickable((By.XPATH, "//button[text()='Accept All Cookies']"))).click()
    print('cookies accepted')
except Exception:
    ## the cookie banner did not show up (or was not clickable); carry on regardless
    print('no cookies for you!')
## every matched div wraps an h3 category title and the table(s) belonging to it
tables_divs = WebDriverWait(browser, 20).until(EC.presence_of_all_elements_located((By.XPATH, "//table/parent::div/parent::div")))
frames = []
for table_div in tables_divs:
    category = table_div.find_element(By.TAG_NAME, 'h3')
    print(category.text)
    ## relative XPath (leading dot) so the wait only looks inside this div
    WebDriverWait(table_div, 20).until(EC.presence_of_all_elements_located((By.XPATH, ".//table")))
    ## recent pandas versions expect a file-like object rather than a raw HTML string
    dfs = pd.read_html(StringIO(table_div.get_attribute('outerHTML')))
    for df in dfs:
        df['Type'] = category.text
        frames.append(df)
## concatenate once at the end instead of growing a dataframe inside the loop
big_df = pd.concat(frames, axis=0, ignore_index=True)
big_df.to_json('f_footie.json')
browser.quit()
footie_df = pd.read_json('f_footie.json')
footie_df.columns = ['Player', 'Team', 'Points', 'Cost', 'Position']
footie_df['Player'] = footie_df['Player'].str.replace(' ', '_').str.strip()
## costs are scraped as strings such as '£5.5'; keep only the number
footie_df['Cost'] = footie_df['Cost'].str.split('£').str[1].astype('float')
footie_df['Points'] = footie_df['Points'].astype('int')
print(footie_df)
## constraining variables
positions = footie_df.Position.unique()
clubs = footie_df.Team.unique()
budget = 100
available_roles = {
    'Goalkeepers': 2,
    'Defenders': 5,
    'Midfielders': 5,
    'Forwards': 3,
}

names = footie_df.Player.tolist()
teams = footie_df.Team.tolist()
roles = footie_df.Position.tolist()
costs = footie_df.Cost.tolist()
points = footie_df.Points.tolist()
n_players = len(footie_df)
## one binary decision variable per player: 1 = picked, 0 = not picked
players = [LpVariable("player_" + str(i), cat="Binary") for i in range(n_players)]
## no spaces in the problem name, PuLP warns about them
prob = LpProblem("Secret_Fantasy_Player_Choices", LpMaximize)
## define the objective -> maximize the points
prob += lpSum(players[i] * points[i] for i in range(n_players))
## define budget constraint
prob += lpSum(players[i] * costs[i] for i in range(n_players)) <= budget
## add position constraints; '==' (not '<=') so the squad always has exactly 15 players
for pos in positions:
    prob += lpSum(players[i] for i in range(n_players) if roles[i] == pos) == available_roles[pos]
## add max 3 per team constraint
for club in clubs:
    prob += lpSum(players[i] for i in range(n_players) if teams[i] == club) <= 3
prob.solve()
## anything other than 'Optimal' means no valid squad was found
print('Solver status:', LpStatus[prob.status])
df_list = []
for variable in prob.variables():
    ## picked players have value 1; comparing against 0.5 is robust to float noise
    if variable.varValue is not None and variable.varValue > 0.5:
        idx = int(variable.name.split("_")[1])
        df_list.append((names[idx], teams[idx], roles[idx], points[idx], costs[idx]))

result_df = pd.DataFrame(df_list, columns = ['Name', 'Club', 'Role', 'Points', 'Cost'])
result_df.to_csv('win_at_fantasy_football.csv')
print(result_df)

This will display some control printouts, the scraped data, the long printout from the PuLP solver, and finally the result dataframe, which looks like this:

    Name              Club         Role         Points  Cost
0   Alisson           Liverpool    Goalkeepers  176     5.5
1   Lloris            Spurs        Goalkeepers  158     5.5
2   Bowen             West Ham     Midfielders  206     8.5
3   Saka              Arsenal      Midfielders  179     8.0
4   Maddison          Leicester    Midfielders  181     8.0
5   Ward-Prowse       Southampton  Midfielders  159     6.5
6   Gallagher         Chelsea      Midfielders  140     6.0
7   Antonio           West Ham     Forwards     140     7.5
8   Toney             Brentford    Forwards     139     7.0
9   Mbeumo            Brentford    Forwards     119     6.0
10  Alexander-Arnold  Liverpool    Defenders    208     7.5
11  Robertson         Liverpool    Defenders    186     7.0
12  Cancelo           Man City     Defenders    201     7.0
13  Gabriel           Arsenal      Defenders    146     5.0
14  Cash              Aston Villa  Defenders    147     5.0
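
As a sanity check, the squad in the table can be tested against the three conditions from the question. The snippet below is only a sketch: it hard-codes the rows of the table rather than reading `win_at_fantasy_football.csv`, and the variable names are illustrative.

```python
from collections import Counter

# (name, club, role, points, cost) rows copied from the result table
squad = [
    ("Alisson", "Liverpool", "Goalkeepers", 176, 5.5),
    ("Lloris", "Spurs", "Goalkeepers", 158, 5.5),
    ("Bowen", "West Ham", "Midfielders", 206, 8.5),
    ("Saka", "Arsenal", "Midfielders", 179, 8.0),
    ("Maddison", "Leicester", "Midfielders", 181, 8.0),
    ("Ward-Prowse", "Southampton", "Midfielders", 159, 6.5),
    ("Gallagher", "Chelsea", "Midfielders", 140, 6.0),
    ("Antonio", "West Ham", "Forwards", 140, 7.5),
    ("Toney", "Brentford", "Forwards", 139, 7.0),
    ("Mbeumo", "Brentford", "Forwards", 119, 6.0),
    ("Alexander-Arnold", "Liverpool", "Defenders", 208, 7.5),
    ("Robertson", "Liverpool", "Defenders", 186, 7.0),
    ("Cancelo", "Man City", "Defenders", 201, 7.0),
    ("Gabriel", "Arsenal", "Defenders", 146, 5.0),
    ("Cash", "Aston Villa", "Defenders", 147, 5.0),
]

role_counts = Counter(role for _, _, role, _, _ in squad)
club_counts = Counter(club for _, club, _, _, _ in squad)
total_cost = sum(cost for _, _, _, _, cost in squad)
total_points = sum(points for _, _, _, points, _ in squad)

# Position constraint: 2 goalkeepers, 5 defenders, 5 midfielders, 3 forwards
assert role_counts == {"Goalkeepers": 2, "Defenders": 5, "Midfielders": 5, "Forwards": 3}
# Team constraint: at most 3 players per club
assert max(club_counts.values()) <= 3
# Budget constraint
assert total_cost <= 100

print(len(squad), total_cost, total_points)
```

All three checks pass. The squad uses the budget in full (the costs add up to exactly 100) and scores 2485 points in total.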

If there are better solutions, anyone more seasoned in optimisation problems is welcome to chime in with a critique.
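
To carry the same model over to the ROI_top_players frame from the question, something along the following lines might work. It is a sketch under several assumptions: "highest ROI possible" is read as the highest summed ROI, the rows are passed as plain dicts keyed by the question's column names (`player`, `team`, `position`, `cost_2223`, `ROI`), and `pick_squad` is a hypothetical helper name, not part of PuLP.

```python
import pulp


def pick_squad(rows, budget=100, quotas=None, max_per_team=3):
    """Return the rows of the squad with the highest summed ROI, or None if no squad fits."""
    if quotas is None:
        quotas = {"GKP": 2, "DEF": 5, "MID": 5, "FWD": 3}
    n = len(rows)
    prob = pulp.LpProblem("squad_by_roi", pulp.LpMaximize)
    # One binary decision variable per player: 1 = picked, 0 = not picked
    x = [pulp.LpVariable(f"pick_{i}", cat="Binary") for i in range(n)]

    # Objective (assumption): maximise the sum of the picked players' ROI
    prob += pulp.lpSum(x[i] * rows[i]["ROI"] for i in range(n))
    # Budget constraint
    prob += pulp.lpSum(x[i] * rows[i]["cost_2223"] for i in range(n)) <= budget
    # Exact squad size, so players in positions without a quota are never picked
    prob += pulp.lpSum(x) == sum(quotas.values())
    # Exact number of players per position
    for pos, wanted in quotas.items():
        prob += pulp.lpSum(x[i] for i in range(n) if rows[i]["position"] == pos) == wanted
    # At most max_per_team players from any one club
    for team in {row["team"] for row in rows}:
        prob += pulp.lpSum(x[i] for i in range(n) if rows[i]["team"] == team) <= max_per_team

    prob.solve(pulp.PULP_CBC_CMD(msg=False))  # msg=False silences the solver log
    if pulp.LpStatus[prob.status] != "Optimal":
        return None
    return [rows[i] for i in range(n) if x[i].value() > 0.5]
```

With the dataframe from the question, the call would be `pick_squad(ROI_top_players.to_dict('records'))`. A return value of `None` means that no 15-player squad satisfies all constraints for the given pool and budget.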
