For loop repeats first iteration twice - python

I am having the following problem in a Python script: I have a simple for loop that iterates through a list of lists and passes two parameters to another function, which fetches some data.

Running it in the debugger, I see the loop work through all 6 items without any issues, but then, for some strange reason, it tries to repeat the first pair of parameters once again.

At that point I get a pandas error: "Can only compare identically-labeled Series objects". (The for loop passes parameters to a function that slices a bigger df, though I don't think that is relevant to this issue.) It is important to note that the first time the loop runs through that combination, it works fine.
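For context, pandas raises this error when two Series whose index labels differ are compared element-wise. A minimal sketch of that behaviour, using two made-up Series (`left` and `right` are illustrative names, not part of the original script):

```python
import pandas as pd

# Two hypothetical Series with the same values but different index labels.
left = pd.Series([1, 2, 3], index=[0, 1, 2])
right = pd.Series([1, 2, 3], index=[1, 2, 3])

try:
    left == right  # element-wise comparison requires identical labels
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Comparing the underlying arrays ignores the labels entirely.
values_equal = (left.to_numpy() == right.to_numpy()).all()
```

So the message points at a comparison between two differently indexed Series somewhere in the slicing code, which may or may not be related to the apparent extra iteration.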

Has anyone come across anything like this before?

Here is a schematic illustration:

# Schematic only: a..l stand for the actual parameter values
params = [[a, b], [c, d], [e, f], [g, h], [i, j], [k, l]]
for item in params:
    df_sliced = df.loc[df['A'] == item]

What I am saying is that the pair [a, b] goes through twice, throwing the pandas error on its second pass.

Adding more complete code as requested:

data is a pd.DataFrame with a datetime index of dates, a column called 'name' with values such as 'A', 'S', 'F' and 100 others, and a column called 'value' for each date and name. The code tries to slice it down to a leaner df containing a subset of 'name' and 'value' within a certain date range (start, end), so that I can use and manipulate it more easily elsewhere in my code.

pairs = [['A', 'S'], ['A', 'F'], ['S', 'A'], ['S', 'F'], ['F', 'A'], ['F', 'S']]

pairs contains all permutations of a subset of the 'name' values in data. In this case, 3 names are selected to go through call_pair_data, hence 6 permutations.
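The same list of ordered pairs can be generated rather than typed out. A small sketch using the standard library, assuming the three selected names are held in a list (here called `names`, a name introduced only for this illustration):

```python
from itertools import permutations

names = ['A', 'S', 'F']  # the three 'name' values chosen in the question

# All ordered pairs of distinct names: 3 * 2 = 6 of them.
pairs = [list(p) for p in permutations(names, 2)]
print(pairs)
```

Generating the list this way also makes it easy to confirm that it holds exactly six distinct pairs before the loop starts.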

The code itself:

for index, item in enumerate(pairs):
    x = item[0]
    y = item[1]
    df = call_pair_data(x, y, start, end)

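def call_pair_data(x, y, start, end):
    # Rows for name x within the date range
    df_x = data.loc[start:end]
    df_x = df_x.loc[df_x['name'] == x]
    # Rows for name y within the same date range
    df_y = data.loc[start:end]
    df_y = df_y.loc[df_y['name'] == y]
    # Join the two slices on Date, suffixing the overlapping columns
    pair_df = pd.merge(df_x, df_y, on=['Date'], suffixes=['_x', '_y'])
    return pair_df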
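One way to check whether the loop itself repeats, or whether the function is reached again from elsewhere, is to record every call. The sketch below does that on synthetic data; the small `data` frame, the `calls` list and the date values are assumptions made for illustration, since the real data is not shown.

```python
import pandas as pd

# Synthetic stand-in for `data`: a sorted 'Date' index plus 'name' and 'value'.
dates = pd.date_range('2020-01-01', periods=5)
rows = [(d, n) for d in dates for n in ['A', 'S', 'F']]
data = pd.DataFrame(rows, columns=['Date', 'name'])
data['value'] = range(len(data))
data = data.set_index('Date')

calls = []  # every (x, y) that reaches call_pair_data, in order


def call_pair_data(x, y, start, end):
    calls.append((x, y))
    window = data.loc[start:end]  # label slice, inclusive at both ends
    df_x = window.loc[window['name'] == x]
    df_y = window.loc[window['name'] == y]
    return pd.merge(df_x, df_y, on='Date', suffixes=('_x', '_y'))


pairs = [['A', 'S'], ['A', 'F'], ['S', 'A'], ['S', 'F'], ['F', 'A'], ['F', 'S']]
start, end = pd.Timestamp('2020-01-02'), pd.Timestamp('2020-01-04')

results = {}
for x, y in pairs:
    results[(x, y)] = call_pair_data(x, y, start, end)

print(len(calls), calls)
```

On this synthetic data the function is called once per pair. A seventh entry in `calls` in the real script would suggest that the function is being invoked again from outside this loop.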

Using the .isin() method would be much simpler and more efficient:

for pair in pairs:
    in_range = (df.index >= start) & (df.index <= end)
    res_df = df.loc[in_range & df['name'].isin(pair)]

Or:

for pair in pairs:
    window = df.loc[start:end]
    res_df = window.loc[window['name'].isin(pair)]

isin() accepts any list-like argument, such as a list or a tuple, so this would be valid too:

for pair in pairs:
    window = df.loc[start:end]
    res_df = window.loc[window['name'].isin([pair[0], pair[1]])]
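To see what the .isin() approach returns, here is a small self-contained sketch. The frame `df` and the dates are synthetic and assumed for illustration only; the real data has many more names.

```python
import pandas as pd

# Synthetic frame: a sorted 'Date' index plus 'name' and 'value' columns.
dates = pd.date_range('2020-01-01', periods=5)
rows = [(d, n) for d in dates for n in ['A', 'S', 'F']]
df = pd.DataFrame(rows, columns=['Date', 'name'])
df['value'] = range(len(df))
df = df.set_index('Date')

start, end = pd.Timestamp('2020-01-02'), pd.Timestamp('2020-01-04')
pair = ['A', 'S']

# Restrict to the date range first, then keep rows whose name is in the pair.
window = df.loc[start:end]
res_df = window.loc[window['name'].isin(pair)]

# A tuple works just as well as a list.
res_tuple = window.loc[window['name'].isin(tuple(pair))]

print(res_df)
```

Note that this gives a long-format result with one row per date and name, whereas the pd.merge version in the question places the two names side by side in `_x` and `_y` columns.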
