pandas - Duplicate each row 'n' times based on a column value

Question

I'd like to duplicate the rows of a DataFrame based on the value of a column, in this case orig_qty. So if I have the following DataFrame, using pandas==0.24.2:

import pandas as pd

d = {'a': ['2019-04-08', 4, 115.00], 'b': ['2019-04-09', 2, 103.00]}

df = pd.DataFrame.from_dict(
        d, 
        orient='index', 
        columns=['date', 'orig_qty', 'price']
    )

Input

>>> print(df)
         date  orig_qty   price
a  2019-04-08         4   115.0
b  2019-04-09         2   103.0

So in the example above, the row with orig_qty=4 should be duplicated 4 times and the row with orig_qty=2 should be duplicated 2 times. After this transformation I'd like a DataFrame that looks like:

Desired Output

>>> print(new_df)
         date  orig_qty  price  fifo_qty
1  2019-04-08         4  115.0         1
2  2019-04-08         4  115.0         1
3  2019-04-08         4  115.0         1
4  2019-04-08         4  115.0         1
5  2019-04-09         2  103.0         1
6  2019-04-09         2  103.0         1

Note that I do not really care about the index after the transformation. I can elaborate on the use case, but essentially I'm doing some FIFO accounting where important changes can occur between values of orig_qty.

Answer 1

Use Index.repeat, DataFrame.loc, DataFrame.assign and DataFrame.reset_index:

new_df = df.loc[df.index.repeat(df['orig_qty'])].assign(fifo_qty=1).reset_index(drop=True)

[output]

         date  orig_qty  price  fifo_qty
0  2019-04-08         4  115.0         1
1  2019-04-08         4  115.0         1
2  2019-04-08         4  115.0         1
3  2019-04-08         4  115.0         1
4  2019-04-09         2  103.0         1
5  2019-04-09         2  103.0         1
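The chained one-liner above can be easier to follow when split into its three steps. The following is a minimal sketch of that breakdown, reusing the DataFrame from the question; the intermediate names repeated_labels and expanded are introduced here purely for illustration and are not part of the original answer.

```python
import pandas as pd

d = {'a': ['2019-04-08', 4, 115.00], 'b': ['2019-04-09', 2, 103.00]}
df = pd.DataFrame.from_dict(d, orient='index', columns=['date', 'orig_qty', 'price'])

# Step 1: repeat each index label orig_qty times, giving a, a, a, a, b, b
repeated_labels = df.index.repeat(df['orig_qty'])

# Step 2: selecting by the repeated labels returns one row per label occurrence
expanded = df.loc[repeated_labels]

# Step 3: add the constant fifo_qty column and replace the duplicated labels
# with a fresh 0..n-1 index
new_df = expanded.assign(fifo_qty=1).reset_index(drop=True)

print(new_df)
```

A quick sanity check is that the number of rows in new_df equals df['orig_qty'].sum(). This approach assumes orig_qty holds non-negative integers and that the original index labels are unique.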

Answer 2

Use np.repeat:

import numpy as np
new_df = pd.DataFrame({col: np.repeat(df[col], df.orig_qty) for col in df.columns})
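Note that the snippet above does not add the fifo_qty column from the desired output. As a possible variant (not taken from either answer), np.repeat can also be applied to row positions instead of to each column, which should remain safe when the original index contains duplicate labels, a case where label-based selection may return more rows than intended. The sketch below assumes a deliberately non-unique index ('x', 'x') for illustration; the name new_df_positional is hypothetical.

```python
import numpy as np
import pandas as pd

# Same values as the question, but with a non-unique index (assumption for illustration)
df = pd.DataFrame(
    {
        'date': ['2019-04-08', '2019-04-09'],
        'orig_qty': [4, 2],
        'price': [115.0, 103.0],
    },
    index=['x', 'x'],
)

# Repeat row positions (0, 1, ...) rather than index labels
positions = np.repeat(np.arange(len(df)), df['orig_qty'].to_numpy())

# Positional selection ignores the labels, so duplicates in the index do not matter
new_df_positional = (
    df.iloc[positions]
    .assign(fifo_qty=1)
    .reset_index(drop=True)
)

print(new_df_positional)
```

DataFrame.to_numpy is available from pandas 0.24 onward, which matches the version mentioned in the question.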

2 answers

Answer 1: score 6, accepted, 2019-04-08 16:02:54

Answer 2: score 2, 2019-04-08 16:02:48
