繁体   English   中英

在python熊猫中合并两个大小不同的数据帧

[英]merge Two data frames of different size in python pandas

我有这样的数据框:df1:

Id   name Checksum
2001  A   e882
2002  B   2884
2002  C   ee12,ee84
2003  D   ee23
2004  E   ee42,ee43
2006  F   2884,2993,3884,3855
2006  G   344,122,288

我想要这样的输出:

Id   name Checksum
2001  A   e882
2002  B   2884
2002  C   ee12
2002  C   ee84
2003  D   ee23
2004  E   ee42
2004  E   ee43
2006  F   2884
2006  F   2993
2006  F   3884
2006  F   3855
2006  G   344
2006  G   122
2006  G   288

我想像上面一样创建新的数据框

我如何在python熊猫中做到这一点?

您可以将str.split用于list ,然后将len作为length

最后创建新的DataFrameconstructornumpy.repeatnumpy.concatenate

s = df['Checksum'].str.split(',')
print (s)
0                      [e882]
1                      [2884]
2                [ee12, ee84]
3                      [ee23]
4                [ee42, ee43]
5    [2884, 2993, 3884, 3855]
6             [344, 122, 288]
Name: Checksum, dtype: object

l = s.str.len()
print (l)
0    1
1    1
2    2
3    1
4    2
5    4
6    3
Name: Checksum, dtype: int64

cols = ['Id','name']
df = pd.DataFrame({x : np.repeat(df[x].values, l) for x in cols})
df['Checksum'] = np.concatenate(s)
df = df.reindex_axis(df.columns, axis=1)
print (df)

      Id name Checksum
0   2001    A     e882
1   2002    B     2884
2   2002    C     ee12
3   2002    C     ee84
4   2003    D     ee23
5   2004    E     ee42
6   2004    E     ee43
7   2006    F     2884
8   2006    F     2993
9   2006    F     3884
10  2006    F     3855
11  2006    G      344
12  2006    G      122
13  2006    G      288

chain.from_iterable交替使用:

from  itertools import chain

s = df['Checksum'].str.split(',')
l = s.str.len()
cols = ['Id','name']
df = pd.DataFrame({x : np.repeat(df[x].values, l) for x in cols})
df['Checksum'] = list(chain.from_iterable(s))
df = df.reindex_axis(df.columns, axis=1)
print (df)
         Id name Checksum
0   2001    A     e882
1   2002    B     2884
2   2002    C     ee12
3   2002    C     ee84
4   2003    D     ee23
5   2004    E     ee42
6   2004    E     ee43
7   2006    F     2884
8   2006    F     2993
9   2006    F     3884
10  2006    F     3855
11  2006    G      344
12  2006    G      122
13  2006    G      288

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM