Repeat elements of a DataFrame for each unique element in a given column
My question is quite straightforward, and there is probably a really simple way to solve it, but finding a solution has taken more of my patience than I expected.
I have the following data, which I made up just to illustrate:
x1 = ['a','b','c']
x2 = [1,2,3,4]
x3 = ['y1','y2','y3','y4']
For each unique element of the first column, I want to repeat the rest of the DataFrame for that value, obtaining the following:
    0  1   2
0   a  1  y1
1   a  2  y2
2   a  3  y3
3   a  4  y4
4   b  1  y1
5   b  2  y2
6   b  3  y3
7   b  4  y4
8   c  1  y1
9   c  2  y2
10  c  3  y3
11  c  4  y4
Any ideas on how to achieve this?
Use itertools.product with zipped columns:
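from itertools import product
import pandas as pd

# zip keeps each (x2, x3) pair together; product repeats the pairs for every value in x1
df = pd.DataFrame([(a, b, c) for a, (b, c) in product(x1, zip(x2, x3))])
print(df)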
    0  1   2
0   a  1  y1
1   a  2  y2
2   a  3  y3
3   a  4  y4
4   b  1  y1
5   b  2  y2
6   b  3  y3
7   b  4  y4
8   c  1  y1
9   c  2  y2
10  c  3  y3
11  c  4  y4
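To make the role of zip clearer, here is a minimal self-contained sketch of the same approach, split into steps. The column names key, num and label are only illustrative assumptions and are not part of the question.

```python
from itertools import product

import pandas as pd

x1 = ['a', 'b', 'c']
x2 = [1, 2, 3, 4]
x3 = ['y1', 'y2', 'y3', 'y4']

# zip pairs x2 and x3 position by position, so each (number, label) row stays intact
pairs = list(zip(x2, x3))

# product emits every pair once for each element of x1, in x1 order
rows = [(a, b, c) for a, (b, c) in product(x1, pairs)]

# Column names below are hypothetical, chosen only for readability
df_named = pd.DataFrame(rows, columns=['key', 'num', 'label'])
print(df_named)
```

Without the zip, product(x1, x2, x3) would also combine every x2 value with every x3 value and give 48 rows instead of 12.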
If the inputs are DataFrames, use a cross join:
df1 = pd.DataFrame({'a': x1})
df2 = pd.DataFrame({'b': x2, 'c': x3})
# A temporary constant key makes every row of df1 match every row of df2
df = df1.assign(val=1).merge(df2.assign(val=1), on='val').drop('val', axis=1)
print(df)
    a  b   c
0   a  1  y1
1   a  2  y2
2   a  3  y3
3   a  4  y4
4   b  1  y1
5   b  2  y2
6   b  3  y3
7   b  4  y4
8   c  1  y1
9   c  2  y2
10  c  3  y3
11  c  4  y4
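On pandas 1.2 or newer, merge accepts how='cross', which should give the same result without the temporary val column. A minimal sketch, assuming that pandas version is available (the name df_cross is just for illustration):

```python
import pandas as pd

x1 = ['a', 'b', 'c']
x2 = [1, 2, 3, 4]
x3 = ['y1', 'y2', 'y3', 'y4']

df1 = pd.DataFrame({'a': x1})
df2 = pd.DataFrame({'b': x2, 'c': x3})

# how='cross' builds the Cartesian product of the two frames (requires pandas >= 1.2)
df_cross = df1.merge(df2, how='cross')
print(df_cross)
```

If the first column may contain repeated values, calling df1.drop_duplicates() before the merge keeps one block per unique value.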