I am trying to loop over a pandas DataFrame to build constraints that meet specific conditions in an optimization task.
Let me provide some background and what I have done so far.
The table below shows the first 11 rows of my input data (named df_long) after loading and melting it with pandas. My actual dataset has 150 rows.
    Hour  TypeofTask  TaskFrequency  TotalTaskatSpecificHour
0   08    A           5              50
1   09    D           8              30
2   08    D           7              50
3   10    C           4              20
4   09    B           6              30
5   08    B           9              50
6   10    A           2              20
7   09    D           1              30
8   08    C           3              50
9   08    E           2              50
10  09    A           7              30
I have also created one decision variable (x0, x1, x2, ..., xn) for each row of the input data above, using a loop:
import pulp

decision_variables = []
for rownum, row in df_long.iterrows():
    # one non-negative integer variable per row, named x0, x1, ...
    variable = pulp.LpVariable('x' + str(rownum), lowBound=0, cat='Integer')
    decision_variables.append(variable)
My actual question:
I want to loop through the DataFrame, find all the TaskFrequency values that occur at a specific hour, and multiply each one by the decision variable for its row. The sum of those products should be less than or equal to TotalTaskatSpecificHour for that hour. For example, the expression for Hour 10 would be:
4*x3 + 2*x6 <= 20
So far I have been able to do this:
to = ""
for rownum, row in df_long.iterrows():
    for i, wo in enumerate(decision_variables):
        if rownum == i:
            formula = row['TaskFrequency']*wo
            to += formula
prob += to
This gave me:
5*x0 + 8*x1 + 7*x2 + 4*x3 + 6*x4 + 9*x5 + 2*x6 + 1*x7 + 3*x8 + 2*x9 + 7*x10
I also tried this:
for rownum, row in df_long.iterrows():
    for i, wo in enumerate(decision_variables):
        for x, y, z in zip(df_long['Hour'], df_long['TypeofTask'], df_long['TaskFrequency']):
            if rownum == i:
                formula1 = row['TaskFrequency']*wo
With this I just get 7*x10.
What I want is the same kind of expression, but for one specific Hour rather than all rows combined. For example, for Hour 10 it should be:
4*x3 + 2*x6 <= 20
and for Hour 9 it should be:
8*x1 + 6*x4 + 1*x7 + 7*x10 <= 30
I look forward to your suggestions and help.
Regards
Diva
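One way to get a separate constraint per hour is to group the rows by Hour and sum the products within each group. The following is only a minimal sketch under some assumptions: Hour is stored as strings ("08", "09", "10"), TotalTaskatSpecificHour is the same for every row of a given hour, and the decision variables are kept in a dict keyed by row label rather than in a list. The problem name, the sense and the constraint names (hour_08 and so on) are hypothetical.

```python
import pandas as pd
import pulp

# Sample rows copied from the question; Hour is assumed to be a string column.
df_long = pd.DataFrame({
    "Hour": ["08", "09", "08", "10", "09", "08", "10", "09", "08", "08", "09"],
    "TypeofTask": ["A", "D", "D", "C", "B", "B", "A", "D", "C", "E", "A"],
    "TaskFrequency": [5, 8, 7, 4, 6, 9, 2, 1, 3, 2, 7],
    "TotalTaskatSpecificHour": [50, 30, 50, 20, 30, 50, 20, 30, 50, 50, 30],
})

# One non-negative integer variable per row, keyed by the row label.
decision_variables = {
    i: pulp.LpVariable(f"x{i}", lowBound=0, cat="Integer") for i in df_long.index
}

# Hypothetical problem name and sense; the question does not state them.
prob = pulp.LpProblem("task_allocation", pulp.LpMaximize)

for hour, group in df_long.groupby("Hour"):
    # Sum TaskFrequency * x_i over the rows that belong to this hour only.
    lhs = pulp.lpSum(
        int(freq) * decision_variables[i]
        for i, freq in group["TaskFrequency"].items()
    )
    # Assumption: the hourly total is constant within the group, so take the first.
    limit = int(group["TotalTaskatSpecificHour"].iloc[0])
    prob += lhs <= limit, f"hour_{hour}"

# Inspect the constraints that were added, one per hour.
for name, constraint in prob.constraints.items():
    print(name, ":", constraint)
```

Grouping keeps the loop over hours rather than over every row/variable pair, so each constraint only sees the rows that share its Hour value. For Hour 10 the constraint should involve only x3 and x6, with 20 as the limit.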
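You don't need to apply a function row by row over the whole frame. Instead, condense the DataFrame to the rows for one hour, either with groupby as in the answer above or by slicing with a boolean mask, then multiply each TaskFrequency by the decision variable for its row and sum the products. I think groupby is the standard way to do it, but slicing is a no-brainer. Here decision_variables is the list built in the question and prod is the limit for that hour (TotalTaskatSpecificHour):
def fun1(df, Hours, prod, decision_variables):
    rows = df[df['Hour'] == Hours]  # keep only the rows for this hour
    # multiply each TaskFrequency by the decision variable that shares its row label
    return pulp.lpSum(int(f) * decision_variables[i] for i, f in rows['TaskFrequency'].items()) <= prod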
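To see which rows a per-hour slice actually picks up, it can help to build the expected expression as plain text before involving the solver. This is a solver-free sketch, assuming Hour is stored as strings and that the hourly total is constant within an hour; expected_expression is a hypothetical helper name, not something from pandas or PuLP.

```python
import pandas as pd

# Sample rows copied from the question; Hour is assumed to be a string column.
df_long = pd.DataFrame({
    "Hour": ["08", "09", "08", "10", "09", "08", "10", "09", "08", "08", "09"],
    "TypeofTask": ["A", "D", "D", "C", "B", "B", "A", "D", "C", "E", "A"],
    "TaskFrequency": [5, 8, 7, 4, 6, 9, 2, 1, 3, 2, 7],
    "TotalTaskatSpecificHour": [50, 30, 50, 20, 30, 50, 20, 30, 50, 50, 30],
})


def expected_expression(df, hour):
    """Return the constraint for one hour as text (hypothetical helper)."""
    rows = df[df["Hour"] == hour]  # boolean-mask slice for this hour
    # The row label i doubles as the variable suffix, matching x0, x1, ...
    terms = " + ".join(
        f"{int(freq)}*x{label}" for label, freq in rows["TaskFrequency"].items()
    )
    # Assumption: the total is the same on every row of the hour.
    limit = int(rows["TotalTaskatSpecificHour"].iloc[0])
    return f"{terms} <= {limit}"


for hour in sorted(df_long["Hour"].unique()):
    print(hour, "->", expected_expression(df_long, hour))
# 08 -> 5*x0 + 7*x2 + 9*x5 + 3*x8 + 2*x9 <= 50
# 09 -> 8*x1 + 6*x4 + 1*x7 + 7*x10 <= 30
# 10 -> 4*x3 + 2*x6 <= 20
```

The text for Hour 9 and Hour 10 matches the expressions asked for in the question, which gives a quick reference to compare against whatever constraints the model ends up with.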