Multiple for loops to select specific rows from a DataFrame in Python

I have a large DataFrame in Python and I want to select specific rows based on multiple for loops. Some columns contain lists. My final goal is to generate some optimization constraints and pass them to another piece of software:

  T          S        W    Arrived    Departed
  [1,2]      [4,2]    1    8          10
  [3,4,5]    [3]      1    12         18
  [6,7]      [1,2]    2    10         11
  .          .        .    .          .
  .          .        .    .          .

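  import pandas as pd

  def Cons(row):
      # Relies on the loop variables w, s and t being visible as globals.
      # Returns 1 when W equals w and the T and S lists contain t and s.
      if row['W'] == w and sum(pd.Series(row['T']).isin([t])) != 0 and sum(pd.Series(row['S']).isin([s])) != 0:
          return 1

  for w in range(50):
      for s in range(30):
          for t in range(12):
              # One full pass over df for each of the 50 * 30 * 12 combinations
              df.Situation = df.apply(Cons, axis = 1)
              A = df[ (df.Situation == 1) ]
              A1 = pd.Series(A.Arrived).tolist()
              D1 = pd.Series(A.Departed).tolist()
              # tuplelist is not defined in this post (presumably gurobipy's tuplelist)
              Time = tuplelist(zip(A1,D1))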

How can I do this efficiently? Going through multiple for loops takes a long time to run.

Currently, every pass through the nested loops overwrites df.Situation and A, so nothing accumulates: once the loops finish, A holds only the result of the very last iteration.
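To make the overwriting problem concrete, here is a minimal sketch that keeps one result per (w, s, t) combination in a dictionary instead of reassigning A each time. The toy data, the helper `matches`, and the dictionary name `times` are assumptions made for illustration; they are not from the original post. It still loops over every combination, so it addresses correctness rather than speed.

```python
import pandas as pd

# Toy data shaped like the table in the question (values are illustrative).
df = pd.DataFrame({
    "T": [[1, 2], [3, 4, 5], [6, 7]],
    "S": [[4, 2], [3], [1, 2]],
    "W": [1, 1, 2],
    "Arrived": [8, 12, 10],
    "Departed": [10, 18, 11],
})

def matches(row, w, s, t):
    """Hypothetical helper: True when W equals w and the lists contain s and t."""
    return row["W"] == w and t in row["T"] and s in row["S"]

# Plain dicts are cheaper to scan repeatedly than calling df.apply per combination.
records = df.to_dict("records")

# One entry per (w, s, t), so earlier results are kept rather than overwritten.
times = {}
for w in range(50):
    for s in range(30):
        for t in range(12):
            pairs = [(r["Arrived"], r["Departed"])
                     for r in records if matches(r, w, s, t)]
            if pairs:
                times[(w, s, t)] = pairs
```

Looking up `times[(1, 4, 1)]` then returns the (Arrived, Departed) pairs for that combination, and combinations with no matching rows are simply absent.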

Instead, consider creating a cross join of all the ranges and then checking the membership logic once per row:

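from functools import reduce  # reduce is not a builtin in Python 3

import pandas as pd

# A constant 'key' column lets a plain merge act as a cross join
wdf = pd.DataFrame({'w': range(50), 'key': 1})
sdf = pd.DataFrame({'s': range(30), 'key': 1})
tdf = pd.DataFrame({'t': range(12), 'key': 1})

dfs = [wdf, sdf, tdf]

# DATA FRAME OF CROSS PRODUCT w X s X t (N = 50 * 30 * 12 = 18,000)
rangedf = reduce(lambda left, right: pd.merge(left, right, on=['key']), dfs)[['w','s','t']]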
#    w  s  t
# 0  0  0  0
# 1  0  0  1
# 2  0  0  2
# 3  0  0  3
# 4  0  0  4
# ...
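
As an aside, the same w x s x t grid can also be built without the helper key column. The sketch below uses `itertools.product` from the standard library; this is a swapped-in alternative, not part of the original answer. (Recent pandas versions also accept `how='cross'` in `pd.merge`, which is not shown here.)

```python
from itertools import product

import pandas as pd

# Every (w, s, t) combination, in the same order as the nested loops.
rangedf = pd.DataFrame(
    list(product(range(50), range(30), range(12))),
    columns=["w", "s", "t"],
)
```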

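def Cons(row):
    # row['T'] and row['S'] are already lists, so pass them to isin directly
    # (wrapping them in another list would compare against the list itself).
    match = (rangedf['w'].isin([row['W']])
             & rangedf['t'].isin(row['T'])
             & rangedf['s'].isin(row['S']))
    if match.any():
        return 1

# Bracket assignment creates the column; attribute assignment would not.
df['Situation'] = df.apply(Cons, axis=1)
A = df[df['Situation'] == 1].reset_index(drop=True)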
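
If the goal is one list of (Arrived, Departed) pairs per (w, s, t) combination, as in the question's loop, another option is to flatten the list columns and group. The following is a sketch under assumptions, not the answer's method: the toy data and the names `long_df`, `in_range` and `times` are made up for illustration, and it relies on `DataFrame.explode`.

```python
import pandas as pd

# Toy data shaped like the table in the question (values are illustrative).
df = pd.DataFrame({
    "T": [[1, 2], [3, 4, 5], [6, 7]],
    "S": [[4, 2], [3], [1, 2]],
    "W": [1, 1, 2],
    "Arrived": [8, 12, 10],
    "Departed": [10, 18, 11],
})

# One row per (W, S element, T element); the index is reset between the two
# explodes so that duplicated index labels cannot interfere.
long_df = (
    df.explode("T").reset_index(drop=True)
      .explode("S").reset_index(drop=True)
      .astype({"T": int, "S": int})
)

# Keep only values inside the ranges looped over in the question.
in_range = (
    long_df["W"].between(0, 49)
    & long_df["S"].between(0, 29)
    & long_df["T"].between(0, 11)
)

# Group once instead of scanning the frame for each of the 18,000 combinations.
times = {
    key: list(zip(group["Arrived"], group["Departed"]))
    for key, group in long_df[in_range].groupby(["W", "S", "T"])
}
```

Each key of `times` is a (w, s, t) tuple that matches at least one row, so combinations with no rows never have to be visited.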
