How to add columns and data of a new dataframe to an already existing one in Python

Question

I have a dataframe called "df" with a column called "Year_Birth". Instead of that column, I want several columns for specific age categories. For each row, I calculate the age from Year_Birth and then put "True" or "1" in the column for the age category that row belongs to.

I am doing this manually as you can see:

# Splitting the Year and Income attributes into categories

from datetime import date

import pandas as pd

df_year = pd.DataFrame(columns=['18_29','30_39','40_49','50_59','60_plus'])
temp = df.Year_Birth
current_year = date.today().year
for x in temp:
  l = [0,0,0,0,0]  # one 0/1 flag per age category
  age = current_year - x
  if (age<=29): l[0] = 1
  elif (age<=39): l[1] = 1
  elif (age<=49): l[2] = 1
  elif (age<=59): l[3] = 1
  else: l[4] = 1
  df_length = len(df_year)
  df_year.loc[df_length] = l  # append the flags as a new row

If there's an automatic or simpler way to do this, please tell me. In any case, I now want to replace the "Year_Birth" column with the whole "df_year" dataframe. Can you help me with that?

Answer 1

You can definitely do this using vectorized operations on each column. You can start by creating an age column from the year of birth:

In [15]: age = date.today().year - df.Year_Birth

Now, this can be combined with comparison and boolean operators to create columns of True/False values, which can be coerced to 0/1 with .astype(int):

In [20]: df_year = pd.DataFrame({
    ...:      '18_29': (age >= 18) & (age <= 29),
    ...:      '30_39': (age >= 30) & (age <= 39),
    ...:      '40_49': (age >= 40) & (age <= 49),
    ...:      '50_59': (age >= 50) & (age <= 59),
    ...:      '60_plus': (age >= 60),
    ...: }).astype(int)

In [21]: df_year
Out[21]:
    18_29  30_39  40_49  50_59  60_plus
0       0      0      0      0        0
1       0      0      0      0        0
2       0      0      0      0        0
3       0      0      0      0        0
4       0      0      0      0        0
..    ...    ...    ...    ...      ...
77      0      0      0      0        1
78      0      0      0      0        1
79      0      0      0      0        1
80      0      0      0      0        1
81      0      0      0      0        1

[82 rows x 5 columns]
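
To replace the "Year_Birth" column with these category columns, one option is to drop the original column and join the new frame back on the shared index. The following is a minimal sketch, not a definitive implementation: the sample data, the "Income" values, and the helper name replace_birth_year_with_age_bands are hypothetical, and current_year is pinned to 2022 only so the example is reproducible.

```python
from datetime import date

import pandas as pd

# Hypothetical sample data; only the Year_Birth column name comes from the question.
df = pd.DataFrame({
    "Year_Birth": [1950, 1975, 1988, 2000],
    "Income": [40000, 52000, 61000, 30000],
})


def replace_birth_year_with_age_bands(frame, current_year=None):
    """Return a copy of frame with Year_Birth replaced by 0/1 age-band columns."""
    if current_year is None:
        current_year = date.today().year
    age = current_year - frame["Year_Birth"]

    # Same boolean masks as in the answer above, coerced to 0/1.
    bands = pd.DataFrame({
        "18_29": (age >= 18) & (age <= 29),
        "30_39": (age >= 30) & (age <= 39),
        "40_49": (age >= 40) & (age <= 49),
        "50_59": (age >= 50) & (age <= 59),
        "60_plus": (age >= 60),
    }).astype(int)

    # bands has the same index as frame, so join lines the rows up one to one.
    return frame.drop(columns="Year_Birth").join(bands)


# Pin the year (an assumption for this example) so the result does not change over time.
result = replace_birth_year_with_age_bands(df, current_year=2022)
print(result)
```

Note that these masks leave anyone younger than 18 with 0 in every column, whereas the loop in the question puts every age up to 29 in the first category. If that behaviour is wanted, the first mask can be changed to age <= 29.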
