pandas - Copy each row 'n' times depending on column value
I'd like to copy or duplicate the rows of a DataFrame based on the value of a column, in this case orig_qty. So if I have the following DataFrame, using pandas==0.24.2:
import pandas as pd
d = {'a': ['2019-04-08', 4, 115.00], 'b': ['2019-04-09', 2, 103.00]}
df = pd.DataFrame.from_dict(
d,
orient='index',
columns=['date', 'orig_qty', 'price']
)
>>> print(df)
date orig_qty price
a 2019-04-08 4 115.0
b 2019-04-09 2 103.0
So in the example above, the row with orig_qty=4 should be duplicated 4 times and the row with orig_qty=2 should be duplicated 2 times. After this transformation I'd like a DataFrame that looks like:
>>> print(new_df)
date orig_qty price fifo_qty
1 2019-04-08 4 115.0 1
2 2019-04-08 4 115.0 1
3 2019-04-08 4 115.0 1
4 2019-04-08 4 115.0 1
5 2019-04-09 2 103.0 1
6 2019-04-09 2 103.0 1
Note that I do not really care about the index after the transformation. I can elaborate more on the use case for this, but essentially I'm doing some FIFO accounting where important changes can occur between values of orig_qty.
Use Index.repeat, DataFrame.loc, DataFrame.assign and DataFrame.reset_index:
new_df = df.loc[df.index.repeat(df['orig_qty'])].assign(fifo_qty=1).reset_index(drop=True)
[output]
date orig_qty price fifo_qty
0 2019-04-08 4 115.0 1
1 2019-04-08 4 115.0 1
2 2019-04-08 4 115.0 1
3 2019-04-08 4 115.0 1
4 2019-04-09 2 103.0 1
5 2019-04-09 2 103.0 1
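To make the one-liner above easier to follow, here is a sketch that splits the same chain into its three steps. The intermediate names (repeated_labels, expanded) are illustrative only, and the frame is rebuilt with a plain dict so the block runs on its own; it assumes orig_qty holds non-negative integers and that the index labels are unique.

```python
import pandas as pd

# Same data as in the question, built column-wise for brevity.
df = pd.DataFrame(
    {
        "date": ["2019-04-08", "2019-04-09"],
        "orig_qty": [4, 2],
        "price": [115.0, 103.0],
    },
    index=["a", "b"],
)

# Step 1: repeat each index label orig_qty times.
repeated_labels = df.index.repeat(df["orig_qty"])

# Step 2: select rows by the repeated labels, giving one copy per label.
expanded = df.loc[repeated_labels]

# Step 3: add the per-unit quantity column and replace the repeated
# labels with a fresh 0..n-1 index.
new_df = expanded.assign(fifo_qty=1).reset_index(drop=True)
print(new_df)
```

Since every expanded row carries fifo_qty=1, the sum of fifo_qty in new_df should equal the sum of orig_qty in df, which is a quick sanity check for the FIFO use case.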
Use np.repeat:
import numpy as np
new_df = pd.DataFrame({col: np.repeat(df[col], df.orig_qty) for col in df.columns})
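Label-based selection with df.loc can return more rows than intended if the original index contains duplicate labels. A possible variant, sketched below, repeats row positions with np.repeat and selects them with DataFrame.iloc instead. The helper name expand_rows is hypothetical (not a pandas function), and the sketch assumes the quantity column contains non-negative integers with no missing values.

```python
import numpy as np
import pandas as pd


def expand_rows(frame, qty_col):
    """Return one row per unit of `qty_col` (hypothetical helper, not a pandas API)."""
    # Assumption: the column holds non-negative integers and no NaN.
    counts = frame[qty_col].to_numpy()
    # Repeat each row position counts[i] times, e.g. [0, 0, 0, 0, 1, 1].
    positions = np.repeat(np.arange(len(frame)), counts)
    # Positional selection ignores index labels, so duplicate labels are safe.
    return frame.iloc[positions].assign(fifo_qty=1).reset_index(drop=True)


df = pd.DataFrame(
    {
        "date": ["2019-04-08", "2019-04-09"],
        "orig_qty": [4, 2],
        "price": [115.0, 103.0],
    },
    index=["a", "b"],
)
new_df = expand_rows(df, "orig_qty")
print(new_df)
```

A row whose quantity is 0 is repeated zero times and therefore does not appear in the result.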