I have the following pandas DataFrame:
import pandas as pd
mission_df = pd.DataFrame(
{'mission': [1, 2, 3],
'type': ['lift', 'talk', 'run'],
'boundary_low': [2, 3, 3],
'boundary_high': [3, 8, 12]})
I would like to expand each mission into several rows, one for every integer value between its boundaries (inclusive). For example, mission 1 has boundaries 2 and 3, so it needs two rows with the values 2 and 3, as follows:
desired_df = pd.DataFrame(
{'mission': [1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3],
'amount': [2, 3, 3, 4, 5, 6, 7, 8, 3, 4, 5, 6],
'type': ['lift', 'lift', 'talk', 'talk', 'talk', 'talk', 'talk', 'talk', 'run', 'run', 'run', 'run'],
'boundary_low': [2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
'boundary_high': [3, 3, 8, 8, 8, 8, 8, 8, 6, 6, 6, 6]})
Thanks in advance!
Try:
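# Repeat each row once per integer in its inclusive [boundary_low, boundary_high] range.
mission_df = mission_df.loc[mission_df.index.repeat(mission_df["boundary_high"]-mission_df["boundary_low"] + 1)]
# Number the repeated rows 1, 2, ... within each mission, then shift by boundary_low - 1.
mission_df['amount'] = mission_df.assign(amount=1).groupby(['mission', 'type'])['amount'].cumsum() + mission_df['boundary_low'].sub(1)
# Optional: replace the repeated index labels with a fresh 0..n-1 index.
mission_df.reset_index(drop=True, inplace=True)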
The key function here is pd.Index.repeat(n), which repeats each index label n times; n can also be an array with one count per label, as in the code above.
Source: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Index.repeat.html
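To make the repeat step easier to follow in isolation, here is a minimal sketch that computes the per-row counts and shows what the repeated index looks like. The names counts, repeated_index and expanded are introduced only for illustration and do not appear in the answer.

```python
import pandas as pd

mission_df = pd.DataFrame(
    {'mission': [1, 2, 3],
     'type': ['lift', 'talk', 'run'],
     'boundary_low': [2, 3, 3],
     'boundary_high': [3, 8, 12]})

# Rows needed per mission: high - low + 1, because both ends are inclusive.
counts = mission_df['boundary_high'] - mission_df['boundary_low'] + 1

# Each index label is repeated according to its own count.
repeated_index = mission_df.index.repeat(counts)

# Selecting with the repeated labels duplicates the corresponding rows.
expanded = mission_df.loc[repeated_index]

print(counts.tolist())
print(repeated_index.tolist())
print(len(expanded))
```

The expanded frame still carries the original labels 0, 1, 2 repeated, which is why the answer optionally resets the index afterwards.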
Outputs:
    mission  type  boundary_low  boundary_high  amount
0         1  lift             2              3       2
1         1  lift             2              3       3
2         2  talk             3              8       3
3         2  talk             3              8       4
4         2  talk             3              8       5
5         2  talk             3              8       6
6         2  talk             3              8       7
7         2  talk             3              8       8
8         3   run             3             12       3
9         3   run             3             12       4
10        3   run             3             12       5
11        3   run             3             12       6
12        3   run             3             12       7
13        3   run             3             12       8
14        3   run             3             12       9
15        3   run             3             12      10
16        3   run             3             12      11
17        3   run             3             12      12
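For comparison, the same rows can probably also be produced by building the full list of values per mission and exploding it. This is an alternative sketch, not the method used in the answer; it assumes a pandas version that provides DataFrame.explode (0.25 or later), and the names amounts and expanded are illustrative.

```python
import pandas as pd

mission_df = pd.DataFrame(
    {'mission': [1, 2, 3],
     'type': ['lift', 'talk', 'run'],
     'boundary_low': [2, 3, 3],
     'boundary_high': [3, 8, 12]})

# One list per mission holding every integer from boundary_low to boundary_high inclusive.
amounts = pd.Series(
    [list(range(low, high + 1))
     for low, high in zip(mission_df['boundary_low'], mission_df['boundary_high'])],
    index=mission_df.index)

expanded = (
    mission_df.assign(amount=amounts)
    .explode('amount')             # one row per list element
    .astype({'amount': 'int64'})   # explode leaves an object column
    .reset_index(drop=True)
)

print(expanded)
```

The explode version keeps the intent (one row per value in the range) visible in a single expression, while the repeat-based version avoids creating intermediate Python lists.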