
Kernel keeps dying in Jupyter notebook with pulp solver

I created an LP solver in a Jupyter notebook, and it is giving me some trouble. Specifically, when I run the last line of code in the script below, I get an error message saying: The kernel appears to have died. It will restart automatically.

Edit: the final dataframe dfs_proj is a dataframe with 240 rows and 5 columns.

import pandas as pd
from pulp import *
from pulp import LpMaximize

dfs_proj = pd.read_csv("4for4_dfs_projections_120321.csv")
dfs_proj['count'] = 1
cols = ['Player', 'Pos', 'FFPts', 'DK ($)', 'count']
dfs_proj = dfs_proj[cols]
dfs_proj = dfs_proj[(dfs_proj['DK ($)'] >= 4000) | (dfs_proj['Pos'] == "DEF") | (dfs_proj['Pos'] == "TE")]

player_dict = dict(zip(dfs_proj['Player'], dfs_proj['count']))

# create a helper function to return the number of players assigned each position
def get_position_sum(player_vars, df, position):
    return pulp.lpSum([player_vars[i] * (position in df['Pos'].iloc[i]) for i in range(len(df))])

def get_optimals(site, data, num_lineups, optimize_on='FFPts'):
    """
    Generates x number of optimal lineups, based on the column to
    designate as the one to optimize on.
    :param str site: DK or FD. Used for salary constraints
    :param pd.DataFrame data: Pandas dataframe containing projections.
    :param int num_lineups: Number of lineups to generate.
    :param str optimize_on: Name of column in dataframe to use when optimizing
    """
    #global lineups
    lineups = []
    player_dict = dict(zip(data['Player'], data['count']))
    for i in range(1, num_lineups+1):
        prob = pulp.LpProblem('DK_NFL_weekly', pulp.const.LpMaximize)
        player_vars = []
        for row in data.itertuples():
            var = pulp.LpVariable(f'{row.Player}', cat='Binary')
            player_vars.append((row.Player, var))
        # total assigned players constraint
        prob += pulp.lpSum(player_var for player_var in player_vars) == 9
        # total salary constraint
        prob += pulp.lpSum(data['DK ($)'].iloc[i] * player_vars[i][1] for i in range(len(data))) <= 50000
        # for QB and DST, require 1 of each in the lineup
        prob += get_position_sum(player_vars, df, 'QB') == 1
        prob += get_position_sum(player_vars, df, 'DEF') == 1
        
        # to account for the FLEX position, we allow additional selections of the 3 FLEX-eligible positions: RB, WR, TE
        prob += get_position_sum(player_vars, df, 'RB') >= 2
        prob += get_position_sum(player_vars, df, 'WR') >= 3
        prob += get_position_sum(player_vars, df, 'TE') >= 1
        if i > 1:
            if optimize_on == 'Optimal Frequency':
                prob += pulp.lpSum([data['FFPts'].iloc[i] * player_vars[i][1] for i in range(len(data))]) <= (optimal - 0.001)
            else:
                prob += pulp.lpSum([data['FFPts'].iloc[i] * player_vars[i][1] for i in range(len(data))]) <= (optimal - 0.01)
        
        prob += pulp.lpSum([data['FFPts'].iloc[i] * player_vars[i][1] for i in range(len(data))])
        # solve and print the status
        prob.solve(PULP_CBC_CMD(msg=False))
        optimal = prob.objective.value()
        count = 1
        lineup = {}
        for i in range(len(data)):    
            if player_vars[i][1].value() == 1:
                row = data.iloc[i]
                lineup[f'G{count}'] = row['Player']
                count += 1
            lineup['Total Points'] = optimal
        
        lineups.append(lineup)
        players = list(lineup.values())
        for i in range(0, len(players)):
            if type(players[i]) == str:
                player_dict[players[i]] += 1
                if player_dict[players[i]] == 45:
                    data = data[data['Player'] != players[i]]
    return lineups

lineups = get_optimals(dfs_proj, 20, 'FFPts')

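I tried reinstalling all of the libraries used in the script, but I still get the same problem. Even running it as a plain Python script gives me the same error message. I think it may be memory related, but I am not sure how to check or adjust that.

Thanks in advance for your help!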

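You have some typos here... not sure if or how you got this to run.

You have a couple of problems:

  • You are mixing the variable names df and data inside the function, so who knows what df refers to. (One of the perils of working in a notebook.)
  • In several places where you use player_vars, you are not indexing the tuple to get the variable part. I suggest using LpVariable.dicts() for these; it is much easier to manage.
  • Your function call does not account for the site parameter in the function signature.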
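
To illustrate the LpVariable.dicts() suggestion, here is a minimal sketch on made-up data. The player names, points, salaries, the cap of 16000, and the roster size of 3 are all hypothetical and chosen only to keep the example small; it is not the asker's model.

```python
import pulp

# Hypothetical toy data: (name, position, projected points, salary).
players = [
    ("A", "QB", 20.0, 6000),
    ("B", "QB", 18.0, 5500),
    ("C", "RB", 15.0, 5000),
    ("D", "RB", 12.0, 4500),
    ("E", "WR", 14.0, 4800),
]
names = [p[0] for p in players]
pos = {p[0]: p[1] for p in players}
pts = {p[0]: p[2] for p in players}
salary = {p[0]: p[3] for p in players}

prob = pulp.LpProblem("toy_lineup", pulp.LpMaximize)

# One binary variable per player, keyed by player name.
# pick[name] is the variable itself, so there is no tuple to index.
pick = pulp.LpVariable.dicts("pick", names, cat="Binary")

# Objective: total projected points of the selected players.
prob += pulp.lpSum(pts[n] * pick[n] for n in names)

# Constraints (toy values, assumptions for this sketch only).
prob += pulp.lpSum(pick[n] for n in names) == 3                   # roster size
prob += pulp.lpSum(salary[n] * pick[n] for n in names) <= 16000   # salary cap
prob += pulp.lpSum(pick[n] for n in names if pos[n] == "QB") == 1 # exactly one QB

prob.solve(pulp.PULP_CBC_CMD(msg=False))
status = pulp.LpStatus[prob.status]
# Compare against 0.5 rather than 1 to tolerate floating-point solver output.
chosen = sorted(n for n in names if pick[n].value() > 0.5)
print(status, chosen)
```

Because the dictionary is keyed by player name, position filters can be written as a plain `if` inside the generator, which avoids the positional `iloc[i]` bookkeeping.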

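Other advice:

  • Don't turn off the messaging. You need to inspect the solver output to see the status. The first attempt came back "Infeasible", which is how I found the player_vars problem. If you do decide to turn off messages, figure out a way to assert that the status is optimal, or risk garbage results. I think that is doable in pulp, I just forget how. Edit: here is how. This works with the default CBC solver, after solving (obviously). Other solvers, not sure:

     status = LpStatus[prob.status]
     assert(status=='Optimal')
  • Print the problem a couple of times to see whether it passes the giggle test as you build it. If you had done so, you would have seen some of the construction problems.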
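
The status check above can be wrapped in a small helper so that a quiet solve still fails loudly. This is only a sketch: solve_checked is a hypothetical name, and the infeasible toy problem exists purely to show the failure path with the default CBC solver.

```python
import pulp

def solve_checked(prob):
    """Solve quietly, then raise unless the solver reports an optimal solution."""
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    status = pulp.LpStatus[prob.status]
    if status != "Optimal":
        raise RuntimeError(f"solver finished with status {status!r}")
    return pulp.value(prob.objective)

# Feasible toy problem: maximize x subject to x <= 1.
x = pulp.LpVariable("x", lowBound=0)
good = pulp.LpProblem("feasible_demo", pulp.LpMaximize)
good += x        # first bare expression becomes the objective
good += x <= 1
best = solve_checked(good)

# Deliberately infeasible toy problem: y cannot be both <= 1 and >= 2.
y = pulp.LpVariable("y", lowBound=0)
bad = pulp.LpProblem("infeasible_demo", pulp.LpMaximize)
bad += y
bad += y <= 1
bad += y >= 2

try:
    solve_checked(bad)
    outcome = "solved"
except RuntimeError as err:
    outcome = str(err)

print(best, outcome)
```

Raising an exception instead of using a bare assert keeps the check active even when Python runs with assertions disabled.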

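Anyhow, this works fine on fake data and churns through 1000+ players for 20 lineups in a few seconds.

Caveat emptor: I did not scrutinize all of the constraints or the conditional constraint, so you should do that.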

import pandas as pd
from pulp import *
# from pulp import LpMaximize
from random import randint, choice

num_players = 1000
positions = ['RB', 'WR', 'TE', 'DEF', 'QB']
players = [(i, choice(positions), randint(1,100), randint(3000,5000), 1) for i in range(num_players)]
cols = ['Player', 'Pos', 'FFPts', 'DK ($)', 'count']
dfs_proj = pd.DataFrame.from_records(players, columns = cols)
print(dfs_proj.head())


# dfs_proj = pd.read_csv("4for4_dfs_projections_120321.csv")
# dfs_proj['count'] = 1
# cols = ['Player', 'Pos', 'FFPts', 'DK ($)', 'count']
# dfs_proj = dfs_proj[cols]

dfs_proj = dfs_proj[(dfs_proj['DK ($)'] >= 4000) | (dfs_proj['Pos'] == "DEF") | (dfs_proj['Pos'] == "TE")]

# player_dict = dict(zip(dfs_proj['Player'], dfs_proj['count']))

print(dfs_proj.head())

# create a helper function to return the number of players assigned each position
def get_position_sum(player_vars, df, position):
    return pulp.lpSum([player_vars[i][1] * (position in df['Pos'].iloc[i]) for i in range(len(df))])  #player vars not indexed

#def get_optimals(site, data, num_lineups, optimize_on='FFPts'):   # site???  # data vs df ???
def get_optimals(data, num_lineups, optimize_on='FFPts'):
    """
    Generates x number of optimal lineups, based on the column to
    designate as the one to optimize on.
    :param pd.DataFrame data: Pandas dataframe containing projections.
    :param int num_lineups: Number of lineups to generate.
    :param str optimize_on: Name of column in dataframe to use when optimizing
    """
    #global lineups
    lineups = []
    player_dict = dict(zip(data['Player'], data['count']))
    for i in range(1, num_lineups+1):
        prob = pulp.LpProblem('DK_NFL_weekly', pulp.const.LpMaximize)
        player_vars = []
        for row in data.itertuples():
            var = pulp.LpVariable(f'P{row.Player}', cat='Binary')  # added 'P' to player name for clarity
            player_vars.append((row.Player, var))
        # total assigned players constraint
        prob += pulp.lpSum(player_var[1] for player_var in player_vars) == 9    # player var not indexed
        # total salary constraint
        prob += pulp.lpSum(data['DK ($)'].iloc[i] * player_vars[i][1] for i in range(len(data))) <= 50000
        # for QB and DST, require 1 of each in the lineup

        # !!!!  you had 'df' here which who knows what you were pulling in....  changed to data

        prob += get_position_sum(player_vars, data, 'QB') == 1
        prob += get_position_sum(player_vars, data, 'DEF') == 1
        
        # to account for the FLEX position, we allow additional selections of the 3 FLEX-eligible positions: RB, WR, TE
        prob += get_position_sum(player_vars, data, 'RB') >= 2
        prob += get_position_sum(player_vars, data, 'WR') >= 3
        prob += get_position_sum(player_vars, data, 'TE') >= 1
        if i > 1:
            if optimize_on == 'Optimal Frequency':
                prob += pulp.lpSum([data['FFPts'].iloc[i] * player_vars[i][1] for i in range(len(data))]) <= (optimal - 0.001)
            else:
                prob += pulp.lpSum([data['FFPts'].iloc[i] * player_vars[i][1] for i in range(len(data))]) <= (optimal - 0.01)
        
        prob += pulp.lpSum([data['FFPts'].iloc[i] * player_vars[i][1] for i in range(len(data))])
        print(prob)
        # solve and print the status
        prob.solve(PULP_CBC_CMD())
        optimal = prob.objective.value()
        count = 1
        lineup = {}
        for i in range(len(data)):    
            if player_vars[i][1].value() == 1:
                row = data.iloc[i]
                lineup[f'G{count}'] = row['Player']
                count += 1
            lineup['Total Points'] = optimal
        
        lineups.append(lineup)
        players = list(lineup.values())
        for i in range(0, len(players)):
            if type(players[i]) == str:
                player_dict[players[i]] += 1
                if player_dict[players[i]] == 45:
                    data = data[data['Player'] != players[i]]
    return lineups

lineups = get_optimals(dfs_proj, 10, 'FFPts')
for lineup in lineups:
    print(lineup)
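
Following up on the caveat about unreviewed constraints, one way to sanity-check the output is to validate each returned lineup against the roster rules stated in the code (9 players, salary at most 50000, exactly one QB and one DEF, at least 2 RB, 3 WR, and 1 TE). The sketch below is plain Python; lineup_violations and the sample players are hypothetical, and it assumes the lineup dictionaries have the 'G1'...'G9' and 'Total Points' keys built above.

```python
from collections import Counter

SALARY_CAP = 50000
ROSTER_SIZE = 9
EXACT = {"QB": 1, "DEF": 1}            # positions that need exactly this many
MINIMUM = {"RB": 2, "WR": 3, "TE": 1}  # FLEX-eligible positions: at least this many

def lineup_violations(lineup, position, salary):
    """Return a list of rule violations for one lineup (empty list means valid)."""
    names = [v for k, v in lineup.items() if k != "Total Points"]
    counts = Counter(position[n] for n in names)
    problems = []
    if len(names) != ROSTER_SIZE:
        problems.append(f"expected {ROSTER_SIZE} players, got {len(names)}")
    total = sum(salary[n] for n in names)
    if total > SALARY_CAP:
        problems.append(f"salary {total} exceeds cap {SALARY_CAP}")
    for pos, need in EXACT.items():
        if counts[pos] != need:
            problems.append(f"expected exactly {need} {pos}, got {counts[pos]}")
    for pos, need in MINIMUM.items():
        if counts[pos] < need:
            problems.append(f"expected at least {need} {pos}, got {counts[pos]}")
    return problems

# Hypothetical sample data: every player costs 5000.
position = {"q1": "QB", "q2": "QB", "d1": "DEF", "r1": "RB", "r2": "RB",
            "r3": "RB", "w1": "WR", "w2": "WR", "w3": "WR", "t1": "TE"}
salary = {name: 5000 for name in position}

good = {f"G{i + 1}": n for i, n in enumerate(
    ["q1", "d1", "r1", "r2", "r3", "w1", "w2", "w3", "t1"])}
good["Total Points"] = 123.4

# Same lineup, but the DEF slot is filled by a second QB.
bad = dict(good, G2="q2")

print(lineup_violations(good, position, salary))
print(lineup_violations(bad, position, salary))
```

With the real data, position and salary could be built from the dataframe, for example with dict(zip(data['Player'], data['Pos'])).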

