
Multiple for loops to select specific rows from a DataFrame in Python

I have a large DataFrame in Python and I want to select specific rows inside several nested for loops. Some columns contain lists. My final goal is to generate optimization constraints and pass them to another piece of software:

   T        S        W     Arrived    Departed     
  [1,2]    [4,2]     1        8          10
  [3,4,5]   [3]      1        12         18
  [6,7]    [1,2]     2        10         11
    .        .       .        .          .
    .        .       .        .          .

  import pandas as pd
  # tuplelist is not part of pandas; it presumably comes from the optimization package

  def Cons(row):
      # Flag rows whose W equals w and whose T and S lists contain t and s
      if row['W'] == w and t in row['T'] and s in row['S']:
          return 1
      return 0

  for w in range(50):
      for s in range(30):
          for t in range(12):
              df['Situation'] = df.apply(Cons, axis=1)
              A = df[df['Situation'] == 1]
              A1 = A['Arrived'].tolist()
              D1 = A['Departed'].tolist()
              Time = tuplelist(zip(A1, D1))

How can I do this efficiently? Running the nested for loops takes a long time.

Currently, you re-filter the dataframe on every iteration of the nested loops, and A is overwritten each time. Nothing accumulates, so after the loops finish A holds only the result of the very last iteration.

Instead, consider building a cross join of all the ranges and then checking the matching logic once per row:

from functools import reduce
import pandas as pd

wdf = pd.DataFrame({'w': range(50), 'key': 1})
sdf = pd.DataFrame({'s': range(30), 'key': 1})
tdf = pd.DataFrame({'t': range(12), 'key': 1})

dfs = [wdf, sdf, tdf]

# DATA FRAME OF CROSS PRODUCT w x s x t (N = 50 * 30 * 12 = 18,000)
rangedf = reduce(lambda left, right: pd.merge(left, right, on=['key']), dfs)[['w', 's', 't']]
#    w  s  t
# 0  0  0  0
# 1  0  0  1
# 2  0  0  2
# 3  0  0  3
# 4  0  0  4
# ...

def Cons(row):
    # row['T'] and row['S'] are lists, so pass them to isin() directly
    match = ((rangedf['w'] == row['W'])
             & rangedf['t'].isin(row['T'])
             & rangedf['s'].isin(row['S']))
    return 1 if match.any() else 0

df['Situation'] = df.apply(Cons, axis=1)
A = df[df['Situation'] == 1].reset_index(drop=True)
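
If what you ultimately need is the list of (Arrived, Departed) pairs for each (w, s, t) combination, one possible alternative is to expand the list columns so that each row carries a single T and a single S value, then group once. The sketch below is only an illustration under assumptions: the sample data copies the three rows shown in the question, and the names long_df and times are made up for this example. It also assumes a pandas version where explode accepts ignore_index.

```python
import pandas as pd

# Sample data copied from the three rows shown in the question
df = pd.DataFrame({
    "T": [[1, 2], [3, 4, 5], [6, 7]],
    "S": [[4, 2], [3], [1, 2]],
    "W": [1, 1, 2],
    "Arrived": [8, 12, 10],
    "Departed": [10, 18, 11],
})

# Expand the list columns: one row per (W, S element, T element)
long_df = (
    df.explode("T", ignore_index=True)
      .explode("S", ignore_index=True)
      .astype({"T": int, "S": int})
)

# Hypothetical result structure: (w, s, t) -> list of (Arrived, Departed)
times = {
    tuple(int(k) for k in key): list(
        zip(group["Arrived"].tolist(), group["Departed"].tolist())
    )
    for key, group in long_df.groupby(["W", "S", "T"])
}

# Combinations that match no row are simply absent from the dict
print(times.get((1, 4, 1), []))
print(times.get((0, 0, 0), []))
```

Each row is visited once per element of its lists rather than once per (w, s, t) combination, and the result for every combination is kept instead of being overwritten. The lists in times could then be passed to whatever constraint-building call your optimization package expects.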
