I need to create a new column in dataframe based on intervals from another dataframe. For example, I have a dataframe where in the time column I have values and I want to create column in another dataframe based on the intervals in that time column.
I think a practical example is simpler to understand:
Dataframe with intervals
df1
time value var2
0 1.0 34.0 35.0
1 4.0 754.0 755.0
2 9.0 768.0 769.0
3 12.0 65.0 66.0
Dataframe that I need to filter
df2
time value var2
0 1.0 23.0 23.0
1 2.0 43.0 43.0
2 3.0 76.0 12.0
3 4.0 88.0 22.0
4 5.0 64.0 45.0
5 6.0 98.0 33.0
6 7.0 76.0 11.0
7 8.0 56.0 44.0
8 9.0 23.0 22.0
9 10.0 54.0 44.0
10 11.0 65.0 22.0
11 12.0 25.0 25.0
should result
df3
time value var2 interval
0 1.0 23.0 23.0 1
1 2.0 43.0 43.0 1
2 3.0 76.0 12.0 1
3 4.0 88.0 22.0 1
4 5.0 64.0 45.0 2
5 6.0 98.0 33.0 2
6 7.0 76.0 11.0 2
7 8.0 56.0 44.0 2
8 9.0 23.0 22.0 2
9 10.0 54.0 44.0 3
10 11.0 65.0 22.0 3
11 12.0 25.0 25.0 3
EDIT: As Shubham Sharma said, it's not a filter, I want to add a new column based on intervals in other dataframe.
You can use pd.cut
to categorize the time
in df2
into discrete intervals based on the time in df1
then use Series.factorize
to obtain a numeric array identifying distinct ordered values.
df2['interval'] = pd.cut(df2['time'], df1['time'], include_lowest=True)\
.factorize(sort=True)[0] + 1
Result:
time value var2 interval
0 1.0 23.0 23.0 1
1 2.0 43.0 43.0 1
2 3.0 76.0 12.0 1
3 4.0 88.0 22.0 1
4 5.0 64.0 45.0 2
5 6.0 98.0 33.0 2
6 7.0 76.0 11.0 2
7 8.0 56.0 44.0 2
8 9.0 23.0 22.0 2
9 10.0 54.0 44.0 3
10 11.0 65.0 22.0 3
11 12.0 25.0 25.0 3
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.