Looping Columns in Dataframes Python3

Question

I am wondering whether a loop can handle the following scenario (I tried to write one, but couldn't figure it out).

My dataframe has the following headers:

female2['fiscal_year','ratio_loanofficers', 'ratio_female_borrowers', 'ratio_male_borrowers']

What I'm trying to do is get the mean of each ratio for each year under fiscal_year. So I might need three loops, one each for loan officers, female borrowers and male borrowers. Under fiscal_year there are multiple entries for each year from 2010 to 2019. What I actually did to get my answer was create a dataframe subset, group it by year and then take the mean. It worked, but I wanted to see if a loop would work too (I'm very new to Python).

This was my initial code:

for y in range(2010, 2020):
    if y == 2010:
        loan2010 += round(female2.ratio_floanofficers.mean(), 3)
    elif y == 2011:
        loan2011 += round(female2.ratio_floanofficers.mean(), 3)
    elif y == 2012:
        loan2012 += round(female2.ratio_floanofficers.mean(), 3)
    elif y == 2013:
        loan2013 += round(female2.ratio_floanofficers.mean(), 3)
    elif y == 2014:
        loan2014 += round(female2.ratio_floanofficers.mean(), 3)
    elif y == 2015:
        loan2015 += round(female2.ratio_floanofficers.mean(), 3)
    elif y == 2016:
        loan2016 += round(female2.ratio_floanofficers.mean(), 3)
    elif y == 2017:
        loan2017 += round(female2.ratio_floanofficers.mean(), 3)
    elif y == 2018:
        loan2018 += round(female2.ratio_floanofficers.mean(), 3)
    else:
        loan2019 += round(female2.ratio_floanofficers.mean(), 3)

print(loan2010, loan2011, loan2012, loan2013, loan2014, loan2015, loan2016, loan2017, loan2018, loan2019)

What I got, however, was the same result for each year, which indicated to me that the loop wasn't working as I wanted it to.

Thanks!

Answer 1

round(female2.ratio_floanofficers.mean(), 3) evaluates to the same value on every iteration, so every year gets the same result. y is never used to select the rows for a particular year.
- If a for-loop is required, replace round(female2.ratio_floanofficers.mean(), 3) with the following, which filters on the year first (use the column name your dataframe actually has; the question shows both ratio_loanofficers and ratio_floanofficers):
- round(female2[female2.fiscal_year == y]['ratio_loanofficers'].mean(), 3)
When many similar variables are needed, use a dict instead of separate names such as loan2010, loan2011 and so on:
- f'loan{year}' is an f-string, a newer and improved way to format strings in Python
- PEP 498 - Literal String Interpolation
- {year: 'some value' for year in range(2010, 2020)} is a dictionary comprehension
- female2[female2.fiscal_year == year] is Boolean indexing
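
To make the Boolean indexing step concrete, here is a minimal sketch on a small made-up dataframe (the name df and its values are illustrative, not taken from the question's data). It compares the mean of the whole column, which is what the original loop computed each time, with the mean of one year's rows.

```python
import pandas as pd

# Made-up sample data for illustration only
df = pd.DataFrame({'fiscal_year': [2018, 2018, 2019, 2019],
                   'ratio_loanofficers': [0.1, 0.3, 0.5, 0.7]})

# Boolean Series: True where the row belongs to 2018
mask = df['fiscal_year'] == 2018

# Mean over every row, regardless of year
overall_mean = round(df['ratio_loanofficers'].mean(), 3)

# Mean over the 2018 rows only
mean_2018 = round(df.loc[mask, 'ratio_loanofficers'].mean(), 3)

print(mask.tolist())
print(overall_mean, mean_2018)
```

The mask is just a column of True/False values, so the same pattern works for any year by changing the value it is compared against.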

import pandas as pd

# dataframe
female2 = pd.DataFrame({'fiscal_year': [2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019],
                        'ratio_female_borrowers': [1, 2, 3, 4, 5, 6, 7, 8]})

   fiscal_year  ratio_female_borrowers
0         2018                       1
1         2018                       2
2         2018                       3
3         2018                       4
4         2019                       5
5         2019                       6
6         2019                       7
7         2019                       8

# calculate the mean for each year and store it in a dict
loans = {
    f'loan{year}': round(female2[female2.fiscal_year == year]['ratio_female_borrowers'].mean(), 3)
    for year in range(2010, 2020)
}

print(loans)

{'loan2010': nan,
 'loan2011': nan,
 'loan2012': nan,
 'loan2013': nan,
 'loan2014': nan,
 'loan2015': nan,
 'loan2016': nan,
 'loan2017': nan,
 'loan2018': 2.5,
 'loan2019': 6.5}

print(loans['loan2019'])

>>> 6.5
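
Years with no rows in the sample data come out as nan, because the mean of an empty selection is undefined. If that is not wanted, one option is to loop over only the years that occur in the column. This is a sketch under that assumption; the names years_present and loans_present are made up for illustration.

```python
import pandas as pd

female2 = pd.DataFrame({'fiscal_year': [2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019],
                        'ratio_female_borrowers': [1, 2, 3, 4, 5, 6, 7, 8]})

# Only the years that actually appear in the data, in ascending order
years_present = sorted(female2['fiscal_year'].unique())

# Same dict comprehension as before, but without the empty years
loans_present = {
    f'loan{year}': round(female2[female2.fiscal_year == year]['ratio_female_borrowers'].mean(), 3)
    for year in years_present
}

print(loans_present)
```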

Equivalent `for-loop` for the `dict comprehension`

loans = dict()

for year in range(2010, 2020):
    loans[f'loan{year}'] = round(female2[female2.fiscal_year == year]['ratio_female_borrowers'].mean(), 3)
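
The question mentions three ratio columns. A possible extension of the loop, sketched here with made-up values, nests a second loop over the column names and stores the results in a dict of dicts. The column names follow the headers listed in the question; the names ratio_columns and means are hypothetical.

```python
import pandas as pd

# Made-up values; column names follow the headers listed in the question
female2 = pd.DataFrame({
    'fiscal_year': [2018, 2018, 2019, 2019],
    'ratio_loanofficers': [0.2, 0.4, 0.6, 0.8],
    'ratio_female_borrowers': [0.5, 0.7, 0.1, 0.3],
    'ratio_male_borrowers': [0.5, 0.3, 0.9, 0.7],
})

ratio_columns = ['ratio_loanofficers', 'ratio_female_borrowers', 'ratio_male_borrowers']

# Outer key: column name; inner key: year; value: rounded mean for that year
means = {}
for column in ratio_columns:
    means[column] = {}
    for year in sorted(female2['fiscal_year'].unique()):
        rows = female2[female2.fiscal_year == year]
        means[column][int(year)] = round(float(rows[column].mean()), 3)

print(means['ratio_loanofficers'])
```

A value is then looked up as means['ratio_male_borrowers'][2019], which avoids creating one variable per column and year.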

Use `pandas.DataFrame.groupby`

# group the rows by year, then take the mean of the ratio column within each group
ratio_female_borrowers_mean = female2.groupby('fiscal_year')['ratio_female_borrowers'].agg(['mean'])

print(ratio_female_borrowers_mean)

             mean
fiscal_year      
2018          2.5
2019          6.5
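
The same groupby idea can cover all three ratio columns from the question in one call, with no explicit loop. The following is a sketch with made-up values; the names ratio_columns and yearly_means are illustrative.

```python
import pandas as pd

# Made-up values; column names follow the headers listed in the question
female2 = pd.DataFrame({
    'fiscal_year': [2018, 2018, 2019, 2019],
    'ratio_loanofficers': [0.2, 0.4, 0.6, 0.8],
    'ratio_female_borrowers': [0.5, 0.7, 0.1, 0.3],
    'ratio_male_borrowers': [0.5, 0.3, 0.9, 0.7],
})

ratio_columns = ['ratio_loanofficers', 'ratio_female_borrowers', 'ratio_male_borrowers']

# One row per fiscal year, one column per ratio, rounded to 3 decimals
yearly_means = female2.groupby('fiscal_year')[ratio_columns].mean().round(3)

print(yearly_means)
```

A single value can be read with yearly_means.loc[2018, 'ratio_loanofficers'].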

Answer 1: 0 ACCEPTED 2020-05-12 20:08:43
