Try/Except issue Python 2.7

This is the first question I've asked here, so I hope I'll be clear enough :)

So I'm trying to write an outlier function, which takes 3 arguments:

-df: a Pandas dataframe

-L: a list containing some of this dataframe's columns

-threshold: a threshold we can choose, knowing that I'm using the z_score method in this function.

Here is the function I'm trying to implement:

def out1(df,L,threshold):
    liste=[]
    for i in L:
        dico={}
        try:
            dico['Column Name']=i
            dico['Number of outliers']=len(np.where(np.abs(stats.zscore(df[L[i]])>threshold))[0])
            dico['Top 10 outliers']='a' #I'll fill this later
            dico['Exception']=None
        except Exception as e:
            dico['Exception']=str(e)
        liste.append(dico)
    return(liste)

I have to use an exception here because not all the columns of df are necessarily numerical (so L can contain column names that are not numerical), and it would be nonsense to use the z_score method and look for outliers in those columns.

However, I tried to run this code with:

-df: a simple dataframe I have

-L=['Terminations'] (a numerical column of my dataframe df)

-threshold=2

And this is what Python 2.7 returns:

Out[8]: 
[{'Column Name': 'Terminations',
  'Exception': 'list indices must be integers, not str'}]

I'm not even sure this has anything to do with the Try...Except, but I could really use some help solving my problem!

Thank you in advance,

Alex

EDIT: I haven't really made clear what I was expecting as an output.

Let's say the argument L only contains 1 element:

So L=['One column name of df']

Either this column is numerical (so I want to apply the z_score method), or it is not (so I want to raise an exception).

If this column is numerical, the output would be:

[{'Column Name': 'One column name of df', 'Number of outliers': xxx, 'Top 10 outliers': [I'll make it a list later], 'Exception': None}]

If the column is not numerical, it would be:

[{'Column Name': 'One column name of df', 'Number of outliers': None, 'Top 10 outliers': None, 'Exception': 'The column you chose is not numerical'}]

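for i in L: puts each column name (a string) into i, not an index. Later you write L[i], which indexes the list with a string. That is both redundant and wrong, and it is the cause of the "list indices must be integers, not str" exception. The try/except itself works as intended: it simply caught that error and stored its message. Use df[i] instead of df[L[i]].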
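To see the difference in isolation, here is a minimal sketch that needs neither pandas nor SciPy. The column names are hypothetical, and the exact wording of the error message depends on the Python version, so it is captured rather than quoted.

```python
column_names = ['Terminations', 'Hires']  # hypothetical column names

# Iterating over a list yields its elements, not their positions.
seen = [name for name in column_names]

# Indexing the list with one of its own elements fails, because a list
# accepts only integer (or slice) indices.
try:
    column_names[column_names[0]]
    error_message = None
except TypeError as e:
    error_message = str(e)

# When positions really are needed, enumerate() provides both.
pairs = list(enumerate(column_names))
```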

As a teachable moment, this is a good time to suggest better variable naming: if you had written for column_name in column_names: instead, it would likely not have occurred to you to write column_names[column_name]. :)
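Putting both points together, the function might look like the sketch below. The names find_outliers, column_names and entry are my own choices, not the asker's, and the sample dataframe is made up. One change goes beyond the indexing fix: in the original, np.abs wraps the whole comparison (a boolean array), so only high outliers would be counted. Here it wraps just the z-scores, which I assume is what was intended. 'Top 10 outliers' is left as None, since the asker planned to fill it in later.

```python
import numpy as np
import pandas as pd
from scipy import stats


def find_outliers(df, column_names, threshold):
    """Count z-score outliers per column; record the error for unusable columns."""
    results = []
    for column_name in column_names:
        # Start from the "failure" shape so every dict has the same keys.
        entry = {
            'Column Name': column_name,
            'Number of outliers': None,
            'Top 10 outliers': None,
            'Exception': None,
        }
        try:
            # Index the dataframe with the name itself, not column_names[column_name].
            z_scores = stats.zscore(df[column_name].values)
            # abs() wraps only the z-scores, so low outliers are counted too.
            entry['Number of outliers'] = int(np.sum(np.abs(z_scores) > threshold))
        except Exception as e:
            entry['Exception'] = str(e)
        results.append(entry)
    return results


# Made-up data: one numerical column and one text column.
df = pd.DataFrame({
    'Terminations': [50, 50, 50, 50, 50, 50, 50, 50, 0, 100],
    'Name': list('abcdefghij'),
})
result = find_outliers(df, ['Terminations', 'Name'], 2)
```

For the text column, the stored message is whatever exception SciPy or NumPy happens to raise. If a fixed message such as 'The column you chose is not numerical' is wanted, an explicit dtype check before calling zscore would be one way to get it.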
