I am trying to optimize the selection of trial members under limited space, and I am stuck on one constraint. My pandas data frames are ready, and the linear optimization runs without problems apart from that constraint. I am selecting from a large list using binary variables (I am not tied to that approach, so I could switch if a different method would resolve this). I need to minimize the combined trial time of the subjects selected for the next round of trials, but some subjects already ran multiple trials due to the nature of the experiment. I would like to pick the best combination of subjects by minimizing time while letting a subject appear in the input list more than once, so that I do not have to remove the duplicates manually beforehand. For instance:
Name          Trial   ID      Time (ms)   Selected?
Mary Smith    A       11001   33          1
John Doe      A       11002   24          0
James Smith   B       11003   52          0
Stacey Doe    A       11004   21          1
John Doe      B       11002   19          1
Is there some way to allow 2 John Doe entries for the optimization but constrain the output to only one selection of him? Thanks for your time!
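One way to express this inside the optimization itself is an extra linear constraint per subject: the binaries of all rows that share a name must sum to at most 1. The sketch below is only an illustration under stated assumptions, not the asker's actual model. It uses scipy.optimize.milp (SciPy 1.9 or later) as the solver, and it assumes that "limited space" means a fixed number of slots, here a hypothetical n_slots = 3.

```python
import numpy as np
import pandas as pd
from scipy.optimize import Bounds, LinearConstraint, milp

# Sample data from the question (the "Selected?" column is what we want to compute)
df = pd.DataFrame({
    "Name": ["Mary Smith", "John Doe", "James Smith", "Stacey Doe", "John Doe"],
    "Trial": ["A", "A", "B", "A", "B"],
    "ID": [11001, 11002, 11003, 11004, 11002],
    "Time (ms)": [33, 24, 52, 21, 19],
})

n_slots = 3  # ASSUMPTION: the question does not say how many subjects fit
n_rows = len(df)

# Objective: minimize the total time of the selected rows
cost = df["Time (ms)"].to_numpy(dtype=float)

# One constraint row per distinct name, with a 1 for every data row of that name
name_matrix = pd.get_dummies(df["Name"]).to_numpy(dtype=float).T
n_names = name_matrix.shape[0]

# Stack the per-name rows with one extra row that counts all selections
A = np.vstack([name_matrix, np.ones((1, n_rows))])
lower = np.concatenate([np.zeros(n_names), [n_slots]])
upper = np.concatenate([np.ones(n_names), [n_slots]])  # each name at most once

result = milp(
    cost,
    integrality=np.ones(n_rows),  # every variable is an integer
    bounds=Bounds(0, 1),          # restricted to 0 or 1, i.e. binary
    constraints=LinearConstraint(A, lower, upper),
)

if result.success:
    df["Selected?"] = np.round(result.x).astype(int)
    print(df)
```

Both John Doe rows stay in the input; the per-name constraint is what prevents the solver from picking him twice. The same constraint can be written in other modelling libraries by summing the binary variables of each name group and bounding that sum by 1.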
If you need to keep a record of the rows you remove, you could use the DataFrame.duplicated function, like this:
# First sort your dataframe so that each person's fastest trial comes first
df.sort_values(['Name', 'Time (ms)'], inplace=True)
# Make a new column flagging every repeat of a name after its first row
df['duplicated'] = df.duplicated(subset=['Name'])
# You can then access the rows that are kept, and still have a log of the rejects
df.query('not duplicated')
#           Name Trial     ID  Time (ms)  Selected?  duplicated
# 2  James Smith     B  11003         52          0       False
# 4     John Doe     B  11002         19          1       False
# 0   Mary Smith     A  11001         33          1       False
# 3   Stacey Doe     A  11004         21          1       False
df.query('duplicated')
#        Name Trial     ID  Time (ms)  Selected?  duplicated
# 1  John Doe     A  11002         24          0        True
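If a separate flag column is not needed, a possible alternative to sorting and calling duplicated is to look up each person's fastest row directly with groupby and idxmin. This is a minimal sketch that rebuilds the sample data from the question; the variable names fastest and rejected are only illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Name": ["Mary Smith", "John Doe", "James Smith", "Stacey Doe", "John Doe"],
    "Trial": ["A", "A", "B", "A", "B"],
    "ID": [11001, 11002, 11003, 11004, 11002],
    "Time (ms)": [33, 24, 52, 21, 19],
})

# idxmin returns, for each name, the index label of the row with the lowest time
fastest = df.loc[df.groupby("Name")["Time (ms)"].idxmin()]

# Every remaining row is a slower repeat of someone already kept
rejected = df.drop(fastest.index)

print(fastest)
print(rejected)
```

This leaves the original frame unsorted and unmodified, which may be convenient if it is reused elsewhere. Ties in time are resolved by idxmin in favour of the first occurrence.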