
For loop repeats first iteration twice - python

I am having the following problem in a Python script: I have a simple for loop that iterates through a list of lists and passes two parameters to another function, which fetches some data.

Running in debug mode, I see the loop work through all 6 items without any issues, but then, for some strange reason, it tries to repeat the first pair of parameters once again.

At that point I get a pandas error: "Can only compare identically-labeled Series objects" (the for loop passes parameters to a function that slices a bigger df, though I don't think it's relevant for this issue). It is important to note that the first time the loop runs through that combination, it works fine.

Has anyone come across anything like this before?
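For context, here is a minimal sketch of what usually triggers that error message in isolation. The two Series below are made-up examples, not the asker's data, and this is not necessarily the cause of the problem described here: pandas raises the error when two Series with different index labels are compared with `==`, whereas comparing a Series with a scalar works.

```python
import pandas as pd

# Two hypothetical Series: same length, different index labels.
left = pd.Series([1, 2, 3], index=[0, 1, 2])
right = pd.Series([1, 2, 3], index=[1, 2, 3])

# Comparing a Series with a scalar is fine and returns a boolean mask.
scalar_mask = (left == 2).tolist()

# Comparing two Series whose labels differ raises a ValueError.
try:
    left == right
    message = None
except ValueError as exc:
    message = str(exc)

print(scalar_mask)
print(message)
```

So if the error appears inside a comparison such as `df['name'] == x`, one thing worth checking is whether `x` is a plain string at that point or has become a Series.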

Trying a graphical explanation:

params = [[a, b], [c, d], [e, f], [g, h], [i, j], [k, l]]
for item in params:
    df_prime = df.loc[df['A'] == item]

What I am saying is that the pair [a, b] goes through twice, throwing the pandas error on its "second" pass.

Adding more complete code, as requested:

data is a pandas DataFrame with a datetime index of dates, a column called 'name' with values such as 'A', 'S', 'F' and 100 others, and a column called 'value' for each date and name. The code tries to slice it down to a leaner df containing a subset of 'name' and 'value' within a certain date range (start, end), so I can use and manipulate it more easily elsewhere in my code.

pairs = [['A', 'S'], ['A', 'F'], ['S', 'A'], ['S', 'F'], ['F', 'A'], ['F', 'S']]. pairs contains all permutations of a subset of the names in "data"; in this case, 3 names are selected to go through 'call_pair_data', hence 6 permutations.

Code itself:

for index, item in enumerate(pairs):
    x = item[0]
    y = item[1]
    df = call_pair_data(x, y, start, end)

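def call_pair_data(x, y, start, end):
    # Rows for name x within the date range
    df_x = data.loc[start:end]
    df_x = df_x.loc[df_x['name'] == x]
    # Rows for name y within the same date range
    df_y = data.loc[start:end]
    df_y = df_y.loc[df_y['name'] == y]
    # Join the two slices on the date so each row holds both names
    pair_df = pd.merge(df_x, df_y, on=['Date'], suffixes=['_x', '_y'])
    return pair_df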
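To make the snippet above easier to test, here is a self-contained sketch of the same loop and function. The synthetic `data` frame, the date strings, and the `calls` list are assumptions made up for illustration; only the shape (a 'Date' index plus 'name' and 'value' columns) follows the description above.

```python
from itertools import permutations

import pandas as pd

# Hypothetical stand-in for `data`: 4 dates, each with rows for 'A', 'S' and 'F'.
dates = pd.date_range("2020-01-01", periods=4, freq="D")
data = pd.DataFrame(
    {"name": ["A", "S", "F"] * len(dates), "value": list(range(3 * len(dates)))},
    index=dates.repeat(3).rename("Date"),
)


def call_pair_data(x, y, start, end):
    # Slice the date range once, then pick the rows for each name.
    window = data.loc[start:end]
    df_x = window.loc[window["name"] == x]
    df_y = window.loc[window["name"] == y]
    # Join on the 'Date' index level so each row holds both names.
    return pd.merge(df_x, df_y, on="Date", suffixes=("_x", "_y"))


# All ordered pairs of the three names, in the same order as `pairs` above.
pairs = [list(p) for p in permutations(["A", "S", "F"], 2)]

# Record every call so a repeated iteration would be visible.
calls = []
for x, y in pairs:
    pair_df = call_pair_data(x, y, "2020-01-01", "2020-01-03")
    calls.append((x, y, len(pair_df)))

print(calls)
```

With this synthetic data the loop body runs exactly six times, once per pair. If the real script shows a seventh call, recording the calls in a list like this can help tell whether the loop itself repeats or whether `call_pair_data` is invoked again from somewhere else.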

Using the .isin() method would be much simpler and more efficient:

for pair in pairs:
    res_df = df.loc[(df.index >= start) & (df.index <= end) & df['name'].isin(pair)]

Or

for pair in pairs:
    window = df.loc[start:end]
    res_df = window.loc[window['name'].isin(pair)]

This method accepts a list or tuple as its argument, so this would be valid too:

for pair in pairs:
    window = df.loc[start:end]
    res_df = window.loc[window['name'].isin([pair[0], pair[1]])]
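As a rough, runnable illustration of the .isin() approach, here is a sketch on made-up data. The frame `df`, the dates, and the `results` dictionary are assumptions for the example and do not come from the question.

```python
import pandas as pd

# Hypothetical frame: a 'Date' index with one row per date and name.
dates = pd.date_range("2020-01-01", periods=4, freq="D")
df = pd.DataFrame(
    {"name": ["A", "S", "F"] * len(dates), "value": list(range(12))},
    index=dates.repeat(3).rename("Date"),
)

start, end = "2020-01-02", "2020-01-03"
pairs = [["A", "S"], ["A", "F"]]

results = {}
for pair in pairs:
    # Restrict to the date range first, then build the mask on that same slice
    # so the boolean mask and the rows being filtered share one index.
    window = df.loc[start:end]
    results[tuple(pair)] = window[window["name"].isin(pair)]

print(results[("A", "S")])
```

Note that, unlike the merge in the question, this keeps one row per date and name (long format) rather than placing the two names side by side in `_x` and `_y` columns.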


 