Repeating items in a data frame using pandas
I have the following DataFrame:
id z2 z3 z4
1 2 a fine
2 7 b good
3 9 c delay
4 30 d cold
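To make the example reproducible, the frame above can be rebuilt as follows. This is a minimal sketch that assumes id is an ordinary column rather than the index, and that z2 holds integers and the other two columns hold strings.

```python
import pandas as pd

# Rebuild the example frame from the table above.
# Assumption: "id" is a regular column, not the index.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "z2": [2, 7, 9, 30],
    "z3": ["a", "b", "c", "d"],
    "z4": ["fine", "good", "delay", "cold"],
})
```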
I want to generate a new data frame by repeating each row twice, except that the value in column z4 should not be repeated. How can I do this using Python and pandas?
The output should look like this:
id z2 z3 z4
1 2 a fine
1 2 a
1 2 a
2 7 b good
2 7 b
2 7 b
3 9 c delay
3 9 c
3 9 c
4 30 d cold
4 30 d
4 30 d
Another way to do this is to use indexing. Notice that
df.iloc[[0, 1, 2, 3]*2, :3]
will give you two copies of each row, restricted to the first three columns. These rows can then be appended to the original df, which leaves NA in z4 for the new rows. Replace the NA values with empty strings, then sort on the index values and reset the index (dropping the old index). All of this can be chained:
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job here
pd.concat([df, df.iloc[[0, 1, 2, 3]*2, :3]]).fillna('').sort_index().reset_index(drop=True)
which produces:
id z2 z3 z4
0 1 2 a fine
1 1 2 a
2 1 2 a
3 2 7 b good
4 2 7 b
5 2 7 b
6 3 9 c delay
7 3 9 c
8 3 9 c
9 4 30 d cold
10 4 30 d
11 4 30 d
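The indexing approach can also be written so that it does not hard-code the row positions. The sketch below is self-contained and makes a few assumptions of its own: the example frame is rebuilt inline, the names extra and result are illustrative, and a stable sort (mergesort) is requested explicitly so that the original row is guaranteed to come before its blank-z4 copies.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "z2": [2, 7, 9, 30],
    "z3": ["a", "b", "c", "d"],
    "z4": ["fine", "good", "delay", "cold"],
})

# Two extra copies of every row, keeping only the first three columns
extra = df.iloc[list(range(len(df))) * 2, :3]

result = (
    pd.concat([df, extra])           # z4 is missing in the extra rows
    .fillna("")                      # replace the missing z4 values with ""
    .sort_index(kind="mergesort")    # stable: original row stays first
    .reset_index(drop=True)
)
print(result)
```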
groupby and apply will do the trick:
import pandas as pd

def func(group):
    # Blank out z4 in the copy so only the original row keeps its value
    copy = group.copy()
    copy['z4'] = ""
    return pd.concat((group, copy, copy))

df.groupby('id').apply(func).reset_index(drop=True)
id z2 z3 z4
0 1 2 a fine
1 1 2 a
2 1 2 a
3 2 7 b good
4 2 7 b
5 2 7 b
6 3 9 c delay
7 3 9 c
8 3 9 c
9 4 30 d cold
10 4 30 d
11 4 30 d
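If calling a Python function once per group is a concern on larger frames, the same result can probably be reached without groupby. The following is only a sketch of an alternative, not part of either answer above: repeat_rows, times and keep_once are hypothetical names, and it assumes the columns listed in keep_once hold strings, so that writing "" into them is safe.

```python
import numpy as np
import pandas as pd

def repeat_rows(df, times=3, keep_once=("z4",)):
    """Hypothetical helper: emit each row `times` times, blanking
    the `keep_once` columns in every copy after the first."""
    # Row positions 0,0,0,1,1,1,... so copies of a row sit together
    positions = np.repeat(np.arange(len(df)), times)
    out = df.iloc[positions].reset_index(drop=True)
    # Every row except the first in each block of `times` is a repeat
    is_repeat = np.arange(len(out)) % times != 0
    out.loc[is_repeat, list(keep_once)] = ""
    return out

df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "z2": [2, 7, 9, 30],
    "z3": ["a", "b", "c", "d"],
    "z4": ["fine", "good", "delay", "cold"],
})
result = repeat_rows(df)
print(result)
```

Because the rows are selected by position, this version does not depend on the frame having a unique index.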