I have a pandas DataFrame with the following columns:
id start end
1 101 101
2 102 104
3 108 109
I want to fill the gaps between start and end with additional rows, so the output may look like this:
id number
1 101
2 102
2 103
2 104
3 108
3 109
Is there any way to do this in pandas? Thanks.
Use a nested list comprehension with range to build a flat list of (id, number) tuples, then pass that list to the DataFrame constructor:
zipped = zip(df['id'], df['start'], df['end'])
df = pd.DataFrame([(i, y) for i, s, e in zipped for y in range(s, e + 1)],
                  columns=['id', 'number'])
print(df)
   id  number
0   1     101
1   2     102
2   2     103
3   2     104
4   3     108
5   3     109
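For anyone who wants to run the approach above end to end, here is a minimal self-contained sketch. The helper name expand_ranges and the sample frame are only for illustration and are not part of the original answer; it assumes start and end are integers with end >= start (a row with end < start simply produces no output rows).

```python
import pandas as pd


def expand_ranges(df):
    """Return one (id, number) row per integer in each [start, end] range.

    expand_ranges is a hypothetical helper name used for this sketch.
    """
    rows = [
        (i, n)
        for i, s, e in zip(df["id"], df["start"], df["end"])
        for n in range(s, e + 1)  # end is inclusive, hence e + 1
    ]
    return pd.DataFrame(rows, columns=["id", "number"])


# Sample data copied from the question
sample = pd.DataFrame(
    {"id": [1, 2, 3], "start": [101, 102, 108], "end": [101, 104, 109]}
)
result = expand_ranges(sample)
print(result)
```

Returning a new frame instead of overwriting df keeps the original start/end columns available for checking the result.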
Here is a pure pandas solution, although performance-wise @jaezrael's solution would be better:
df.set_index('id').apply(lambda x: pd.Series(np.arange(x.start, x.end + 1)), axis=1)\
    .stack().astype(int).reset_index()\
    .drop(columns='level_1')\
    .rename(columns={0: 'Number'})
   id  Number
0   1     101
1   2     102
2   2     103
3   2     104
4   3     108
5   3     109
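If the row-wise apply is too slow, one possible alternative (not from the answers above) is to repeat each row with numpy and add a per-row offset. This is only a sketch: the function name expand_ranges_repeat is made up for illustration, it assumes integer columns with end >= start on every row, and it has not been benchmarked here.

```python
import numpy as np
import pandas as pd


def expand_ranges_repeat(df):
    """Expand [start, end] ranges without a Python-level loop.

    expand_ranges_repeat is a hypothetical name; assumes end >= start
    for every row (np.repeat rejects negative counts).
    """
    start = df["start"].to_numpy()
    lengths = df["end"].to_numpy() - start + 1   # rows to emit per id
    first_pos = lengths.cumsum() - lengths       # output position where each id begins
    # 0, 1, 2, ... within each id's block of output rows
    offsets = np.arange(lengths.sum()) - np.repeat(first_pos, lengths)
    return pd.DataFrame(
        {
            "id": np.repeat(df["id"].to_numpy(), lengths),
            "number": np.repeat(start, lengths) + offsets,
        }
    )


# Sample data copied from the question
sample = pd.DataFrame(
    {"id": [1, 2, 3], "start": [101, 102, 108], "end": [101, 104, 109]}
)
expanded = expand_ranges_repeat(sample)
print(expanded)
```

For the sample data the lengths are 1, 3 and 2, so the offsets come out as 0, 0, 1, 2, 0, 1 and are added to the repeated start values.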