How can I apply a function to a dataframe which needs a row index in Pandas?

Question

I need to use survey data from IPUMS to compute the average number of people who are unemployed in two successive periods. I wrote a function that takes an index and a dataframe as input:

def u1(x, df):
    if df.loc[x]['LABFORCE']==2 and df.loc[x]['CPSIDP']==df.loc[x+1]['CPSIDP']:
        if df.loc[x]['EMPSTAT']==21 or df.loc[x]['EMPSTAT']==22:
            return True
    else:
        return False

Here x is the index label and df is the dataframe. CPSIDP identifies the survey respondent, LABFORCE indicates whether the respondent is in the labor force, and EMPSTAT gives the respondent's employment status.

I then planned to use apply as follows:

result = df.apply(u1, axis=1)

It is not clear to me what arguments I should pass to my function (and please let me know if this approach is just fundamentally wrong). Passing a number or a variable for the index gives me a "'bool' object is not callable" error.

The smallest dataframe subset that generates the error (the leftmost column is the observation number; it is the x I need to pass to u1):

          YEAR  MONTH          CPSIDP  EMPSTAT  LABFORCE
15285896  2018      7  20180707096701       10         2
15285926  2018      7  20180707098301       10         2
15285927  2018      7  20180707098302       10         2
15285928  2018      7  20180707098303        0         0
15285929  2018      7  20180707098304        0         0
15285930  2018      7  20180707098305       10         2
15286095  2018      7  20180707108203       21         2

Answer 1

IIUC it would be more efficient to create a boolean Series using the logic from your function.

Here & is the AND operator.

result = (df['LABFORCE'].eq(2) & 
           df['CPSIDP'].eq(df['CPSIDP'].shift()) & 
           df['EMPSTAT'].isin([21,22]))

result

15285896    False
15285926    False
15285927    False
15285928    False
15285929    False
15285930    False
15286095    False
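
To make the comparison concrete, here is a minimal, self-contained sketch that writes the same rule two ways: as a boolean mask, and as a row-wise function that reads its own index label from `row.name`. Several things in it are assumptions rather than part of the original post: the last two rows of the frame are hypothetical (added so that one respondent appears on consecutive rows), the helper name `u1_row` is made up, and `shift(-1)` is used on the assumption that each row should be compared with the following row, as `df.loc[x+1]` in the question suggests. The snippet above uses `shift()`, which compares with the preceding row instead.

```python
import pandas as pd

# The first two rows reuse values from the question's sample; the last two
# rows (labels and IDs) are hypothetical, added for illustration only.
df = pd.DataFrame(
    {
        "CPSIDP": [20180707098305, 20180707108203, 20180707108203, 20180707108204],
        "EMPSTAT": [10, 21, 21, 22],
        "LABFORCE": [2, 2, 2, 2],
    },
    index=[15285930, 15286095, 15286096, 15286097],
)

# Vectorized version: shift(-1) lines each row up with the row below it.
same_person_next = df["CPSIDP"].eq(df["CPSIDP"].shift(-1))
mask = df["LABFORCE"].eq(2) & same_person_next & df["EMPSTAT"].isin([21, 22])


def u1_row(row, frame):
    """Row-wise version of the same rule (hypothetical helper)."""
    # With axis=1, `row` is a Series for one row and `row.name` is its
    # index label. get_loc returns an integer here only if labels are unique.
    pos = frame.index.get_loc(row.name)
    if pos + 1 >= len(frame):
        return False  # the last row has no following row to compare with
    next_row = frame.iloc[pos + 1]
    return bool(
        row["LABFORCE"] == 2
        and row["CPSIDP"] == next_row["CPSIDP"]
        and row["EMPSTAT"] in (21, 22)
    )


# Pass the function object itself; extra positional arguments go in `args`.
via_apply = df.apply(u1_row, axis=1, args=(df,))

print(mask.tolist())       # [False, True, False, False]
print(via_apply.tolist())  # [False, True, False, False]
print(mask.mean())         # 0.25, the share of rows that satisfy the rule
```

Writing `df.apply(u1(x, df), axis=1)` calls the function first and hands `apply` its return value, which is probably where the `'bool' object is not callable` message comes from. The mask version avoids a Python-level loop over rows, so it is likely to scale better on a full extract.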

solution1
1 ACCEPTED 2018-09-28 21:08:29
