Repeating rows of a dataframe based on a column value

Question

I have a data frame like this:

import pandas as pd

df1 = pd.DataFrame({'a': [1, 2],
                    'b': [3, 4],
                    'c': [6, 5]})
df1
Out[150]: 
   a  b  c
0  1  3  6
1  2  4  5

Now I want to create a new DataFrame that repeats each row based on the difference between columns b and c, plus 1. For the first row the difference is 6-3 = 3, so I want to repeat that row 3+1 = 4 times. For the second row the difference is 5-4 = 1, so I want to repeat it 1+1 = 2 times. A new column d should then run from b up to c within each group of repeated rows (for the first row, from 3 to 6). So I want to get this df:

   a  b  c  d
0  1  3  6  3
0  1  3  6  4
0  1  3  6  5
0  1  3  6  6
1  2  4  5  4
1  2  4  5  5

Answer 1

Use reindex with repeat to duplicate the rows, then use groupby with cumcount to assign the new column d:

df1.reindex(df1.index.repeat(df1.eval('c-b').add(1))).\
      assign(d=lambda x : x.c-x.groupby('a').cumcount(ascending=False))
Out[572]: 
   a  b  c  d
0  1  3  6  3
0  1  3  6  4
0  1  3  6  5
0  1  3  6  6
1  2  4  5  4
1  2  4  5  5
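
For readers who want to see the same idea step by step, here is a minimal sketch that splits the one-liner into its parts. It is a variation, not the code from the answer: it groups on the repeated index labels (`groupby(level=0)`) instead of column a, on the assumption that a might not be unique per row in real data, and it counts up from b rather than down from c. The names `counts` and `out` are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2],
                    'b': [3, 4],
                    'c': [6, 5]})

# Number of copies per row: (c - b) + 1.
counts = df1['c'] - df1['b'] + 1

# Repeat each index label the required number of times and select those rows.
out = df1.loc[df1.index.repeat(counts)].copy()

# Within each original row (identified by its index label), count 0, 1, 2, ...
# and add b, so that d runs from b up to c.
out['d'] = out['b'] + out.groupby(level=0).cumcount()

print(out)
```

On the sample frame this should give the same six rows as above. If the index of the original frame is not unique, calling `reset_index(drop=True)` first would be needed for the grouping to identify rows correctly.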

solution1
1 ACCEPTED 2018-10-17 17:57:46
