How can I group continuous data (such as a tenure column with a different month in each row) into categories shown in a separate column using pandas?
Are you looking for something like this?
import datetime

import pandas as pd

# Make fake data
dates = {"tenure": [datetime.date(2020, 1, 31), datetime.date(2020, 1, 24), datetime.date(2020, 5, 13),
                    datetime.date(2021, 5, 23), datetime.date(2022, 5, 5), datetime.date(2020, 3, 16),
                    datetime.date(2020, 5, 28), datetime.date(2020, 9, 23), datetime.date(2020, 12, 28),
                    datetime.date(2021, 10, 12)]}
df = pd.DataFrame(data=dates)
tenure |
---|
2020-01-31 |
2020-01-24 |
2020-05-13 |
2021-05-23 |
2022-05-05 |
2020-03-16 |
2020-05-28 |
2020-09-23 |
2020-12-28 |
2021-10-12 |
# Make months to group by
df["tenure"] = pd.to_datetime(df.tenure)
df["month"] = df.tenure.dt.month_name()
tenure | month |
---|---|
2020-01-31 00:00:00 | January |
2020-01-24 00:00:00 | January |
2020-05-13 00:00:00 | May |
2021-05-23 00:00:00 | May |
2022-05-05 00:00:00 | May |
2020-03-16 00:00:00 | March |
2020-05-28 00:00:00 | May |
2020-09-23 00:00:00 | September |
2020-12-28 00:00:00 | December |
2021-10-12 00:00:00 | October |
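The month name above is one way to derive a category column. If the tenure column instead holds a number of months (another possible reading of the question), `pd.cut` can bin it into labelled ranges in a new column. This is only a minimal sketch: the sample values, the bin edges, the labels, and the `tenure_months` / `tenure_group` column names are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical numeric tenure values in months (not taken from the data above)
tenure_df = pd.DataFrame({"tenure_months": [1, 5, 12, 13, 24, 30, 47, 60, 72]})

# Assumed bin edges; with the default right=True the bins are
# (0, 12], (12, 24], (24, 48], (48, 72]
bins = [0, 12, 24, 48, 72]
labels = ["0-12", "13-24", "25-48", "49-72"]

# New categorical column holding the label of the bin each value falls into
tenure_df["tenure_group"] = pd.cut(tenure_df["tenure_months"], bins=bins, labels=labels)
```

Values outside the outermost edges (for example 0 or 80 here) would become NaN, so the edges should cover the real range of the data.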
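# Group by month and spread each month's dates across columns (one row per month).
# The result gets its own name so that df keeps its "tenure" and "month" columns.
wide = (df
        .sort_values("tenure")
        .groupby("month")["tenure"]
        .apply(lambda x: x.reset_index(drop=True))
        .unstack()
        .reset_index())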
month | 0 | 1 | 2 | 3 |
---|---|---|---|---|
December | 2020-12-28 00:00:00 | NaT | NaT | NaT |
January | 2020-01-24 00:00:00 | 2020-01-31 00:00:00 | NaT | NaT |
March | 2020-03-16 00:00:00 | NaT | NaT | NaT |
May | 2020-05-13 00:00:00 | 2020-05-28 00:00:00 | 2021-05-23 00:00:00 | 2022-05-05 00:00:00 |
October | 2021-10-12 00:00:00 | NaT | NaT | NaT |
September | 2020-09-23 00:00:00 | NaT | NaT | NaT |
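Note that the rows above come out in alphabetical order (December first), because the month names are grouped and sorted as plain strings. If calendar order is wanted, one option is to turn the month column into an ordered categorical before grouping. A sketch under that assumption, using a shortened copy of the sample data; `month_order` and `counts` are names introduced here, and the groups are simply counted to show the resulting order.

```python
import pandas as pd

df = pd.DataFrame({"tenure": pd.to_datetime(
    ["2020-01-31", "2020-01-24", "2020-05-13", "2020-12-28", "2020-03-16"])})

# English month names in calendar order
month_order = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

# Ordered categorical: groupby now sorts by calendar position, not alphabetically
df["month"] = pd.Categorical(df["tenure"].dt.month_name(),
                             categories=month_order, ordered=True)

# observed=True keeps only the months that actually occur in the data
counts = df.groupby("month", observed=True)["tenure"].count()
```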
Or perhaps something like this?
# Starting again from the df with "tenure" and "month" columns:
# group by month and show one column per month
df = (df.sort_values("tenure")
        .groupby("month")["tenure"]
        .apply(lambda x: x.reset_index(drop=True))
        .unstack()
        .reset_index()
        .T)
# Use the first row (the month names) as column headers, then drop that row
df = df.rename(columns=df.iloc[0]).drop(df.index[0]).reset_index(drop=True)
December | January | March | May | October | September |
---|---|---|---|---|---|
2020-12-28 00:00:00 | 2020-01-24 00:00:00 | 2020-03-16 00:00:00 | 2020-05-13 00:00:00 | 2021-10-12 00:00:00 | 2020-09-23 00:00:00 |
NaT | 2020-01-31 00:00:00 | NaT | 2020-05-28 00:00:00 | NaT | NaT |
NaT | NaT | NaT | 2021-05-23 00:00:00 | NaT | NaT |
NaT | NaT | NaT | 2022-05-05 00:00:00 | NaT | NaT |
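The transpose-and-rename step above can also be avoided. One alternative sketch: number the dates within each month with `groupby().cumcount()` and then `pivot` so that each month becomes a column. The `slot` column and the `by_month` name are introduced here for illustration, and the data is a shortened copy of the sample above.

```python
import pandas as pd

df = pd.DataFrame({"tenure": pd.to_datetime(
    ["2020-01-31", "2020-01-24", "2020-05-13", "2021-05-23", "2020-03-16"])})
df["month"] = df["tenure"].dt.month_name()

df = df.sort_values("tenure")
# Position of each date within its month: 0 for the earliest, 1 for the next, ...
df["slot"] = df.groupby("month").cumcount()

# One column per month, one row per slot; months with fewer dates get NaT
by_month = df.pivot(index="slot", columns="month", values="tenure")
```

Because no transpose is involved, the columns should keep their datetime dtype rather than falling back to generic objects.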