Conditional Iteration over a Pandas Dataframe

I am trying to loop over a pandas DataFrame to build constraints that meet specific conditions in an optimization task.

Let me provide some background and show what I have done so far.

The table below shows the first 11 rows of my input data (named df_long) after loading and melting it with pandas. My actual dataset has 150 rows.

    Hour  TypeofTask  TaskFrequency  TotalTaskatSpecificHour
0   08    A           5              50
1   09    D           8              30
2   08    D           7              50
3   10    C           4              20
4   09    B           6              30
5   08    B           9              50
6   10    A           2              20
7   09    D           1              30
8   08    C           3              50
9   08    E           2              50
10  09    A           7              30

I have also created one decision variable (x0, x1, x2, ..., xn) for each row of the input data above, using the loop below:

decision_variables = []
for rownum, row in df_long.iterrows():
    # Name each variable after its row number: x0, x1, ...
    variable = pulp.LpVariable('x' + str(rownum), lowBound=0, cat='Integer')
    decision_variables.append(variable)

My actual question:

I want to loop through the DataFrame, find every TaskFrequency that occurs at a specific hour, and multiply each one by the decision variable for its row. The sum of those products should be less than or equal to the TotalTaskatSpecificHour for that hour. For example, the expression for Hour 10 would be:

4*x3 + 2*x6 <= 20

So far I have been able to do this:

to = ""
for rownum, row in df_long.iterrows():
    for i, wo in enumerate(decision_variables):
        if rownum == i:
            formula = row['TaskFrequency']*wo
    to += formula
prob += to

This gave me:

5*x0 + 8*x1 + 7*x2 + 4*x3 + 6*x4 + 9*x5 + 2*x6 + 1*x7 + 3*x8 + 2*x9 + 7*x10

I also tried this:

for rownum, row in df_long.iterrows():
    for i, wo in enumerate(decision_variables):
        for x, y, z in zip(df_long['Hour'], df_long['TypeofTask'], df_long['TaskFrequency']):
            if rownum == i:
                formula1 = row['TaskFrequency']*wo

With this I just get 7*x10.

What I want is the same kind of expression, but built for one specific Hour at a time instead of everything combined. For example, for Hour 10 it should be:

4*x3 + 2*x6 <= 20

For Hour 09 it should be:

8*x1 + 6*x4 + 1*x7 + 7*x10 <= 30

I look forward to your suggestions and help.

Regards

Diva

What you want back is, for each hour, the TaskFrequency column multiplied out and summed. In essence you don't need to apply a function row by row over the whole DataFrame: either condense it with groupby, as the other answer does, or slice it by hour. I think groupby is the standard way to do it, but a lambda over the slice is a no-brainer:

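def fun1(df, Hours, prod, decision_variables):
    # row.name is the row label; use it to look up that row's decision variable
    terms = df[df['Hour'] == Hours].apply(lambda row: row['TaskFrequency'] * decision_variables[row.name], axis=1)
    return sum(terms) <= prod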
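
For comparison, here is a minimal sketch of the groupby route mentioned in this answer. It rebuilds the sample rows from the question and adds one constraint per hour. Several things are assumptions for illustration and not from the original post: the problem name task_allocation, the LpMaximize sense, the constraint names hour_08 etc., storing the variables in a dict keyed by row label, and treating Hour as a string column.

```python
import pandas as pd
import pulp

# Sample rows copied from the question; Hour is assumed to be a string column.
df_long = pd.DataFrame({
    "Hour": ["08", "09", "08", "10", "09", "08", "10", "09", "08", "08", "09"],
    "TypeofTask": ["A", "D", "D", "C", "B", "B", "A", "D", "C", "E", "A"],
    "TaskFrequency": [5, 8, 7, 4, 6, 9, 2, 1, 3, 2, 7],
    "TotalTaskatSpecificHour": [50, 30, 50, 20, 30, 50, 20, 30, 50, 50, 30],
})

# One integer decision variable per row, keyed by the row label (x0, x1, ...).
decision_variables = {
    rownum: pulp.LpVariable(f"x{rownum}", lowBound=0, cat="Integer")
    for rownum in df_long.index
}

# Hypothetical problem object; the name and the sense are assumptions.
prob = pulp.LpProblem("task_allocation", pulp.LpMaximize)

hour_constraints = {}
for hour, group in df_long.groupby("Hour"):
    # Sum of TaskFrequency * x_row over the rows that belong to this hour.
    lhs = pulp.lpSum([
        int(freq) * decision_variables[rownum]
        for rownum, freq in group["TaskFrequency"].items()
    ])
    # The hourly total is repeated on every row of the group, so take the first.
    limit = int(group["TotalTaskatSpecificHour"].iloc[0])
    constraint = lhs <= limit
    hour_constraints[hour] = constraint
    prob += (constraint, f"hour_{hour}")

for hour, constraint in hour_constraints.items():
    print(hour, constraint)
```

Grouping once avoids the nested loops from the question, and each group keeps its original row labels, so the lookup into decision_variables stays aligned with the x0 ... xn naming.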
