
How Best to Unpack a Pandas Dataframe of Tuples?

Probably really straightforward but I'm having no luck with Google. I have a 2 column dataframe of tuples, and I'm looking to unpack each tuple then pair up the contents from the same position in each column. For example:

Col1     Col2
(a,b,c)  (d,e,f)

my desired output is

a d
b e
c f

I have a solution using loops but I would like to know a better way to do it - firstly because I am trying to eradicate loops from my life and secondly because it's potentially not as flexible as I may need it to be.

import pandas as pd

l1 = [('a','b'),('c','d'),('e','f','g'),('h','i')]
l2 = [('j','k'),('l','m'),('n','o','p'),('q','r')]

df = pd.DataFrame(list(zip(l1,l2)), columns=['Col1','Col2'])

df
Out[547]: 
        Col1       Col2
0     (a, b)     (j, k)
1     (c, d)     (l, m)
2  (e, f, g)  (n, o, p)
3     (h, i)     (q, r)

for i in range(len(df)):
    for j in range(len(df.iloc[i, 1])):
        print(df.iloc[i, 0][j], df.iloc[i, 1][j])
    
a j
b k
c l
d m
e n
f o
g p
h q
i r

All pythonic suggestions and guidance hugely appreciated. Many thanks.

Addition: an example including a row with differing length tuples, per Ch3steR's request below - my loop would not work in this instance ('d2' would not be included, where I would want it to be outputted paired with a null).

l1=[('a','b'),('c','d','d2'),('e','f','g'),('h','i')]
l2=[('j','k'),('l','m'),('n','o','p'),('q','r')]

df  = pd.DataFrame(list(zip(l1,l2)),columns=['Col1','Col2'])

Convert each Series to a list with tolist, rebuild a DataFrame from it, and stack the result. Then concat the stacked columns back together. This leaves you with a MultiIndex whose first level is the original DataFrame index and whose second level is the position within the tuple.

This works on older versions of pandas (pd.__version__ < '1.3.0') and also when the tuples have an unequal number of elements (where explode will fail).

import pandas as pd

df1 = pd.concat([pd.DataFrame(df[col].tolist()).stack().rename(col) 
                 for col in df.columns], axis=1)

    Col1 Col2
0 0    a    j
  1    b    k
1 0    c    l
  1    d    m
2 0    e    n
  1    f    o
  2    g    p
3 0    h    q
  1    i    r
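
For the unequal-length case added to the question, the same idea might look like the sketch below. It reuses the question's second example, where row 1 of Col1 has an extra element 'd2'. The explicit dropna() and sort_index() calls are my additions, not part of the answer above: dropna() is there on the assumption that some pandas versions keep missing values when stacking, and sort_index() only makes the row order predictable.

```python
import pandas as pd

# Example from the question: row 1 of Col1 has three elements, Col2 only two.
l1 = [('a', 'b'), ('c', 'd', 'd2'), ('e', 'f', 'g'), ('h', 'i')]
l2 = [('j', 'k'), ('l', 'm'), ('n', 'o', 'p'), ('q', 'r')]
df = pd.DataFrame(list(zip(l1, l2)), columns=['Col1', 'Col2'])

# Expand each column of tuples into a wide frame, stack it into a Series
# indexed by (original row, position in tuple), and drop the padding values.
stacked = [
    pd.DataFrame(df[col].tolist()).stack().dropna().rename(col)
    for col in df.columns
]

# Align the two Series on the shared MultiIndex; positions that exist in
# only one column are filled with NaN in the other.
df1 = pd.concat(stacked, axis=1).sort_index()
print(df1)
```

In the result, the row at index (1, 2) should hold 'd2' in Col1 and a missing value in Col2, which is the "paired with a null" behaviour asked for.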

If the tuple lengths always match and you don't have a newer version of pandas that lets you pass a list of columns to explode, do something like this:

import pandas as pd
pd.concat([df.Col1.explode(), df.Col2.explode()], axis=1).reset_index(drop=True)

  Col1 Col2
0    a    j
1    b    k
2    c    l
3    d    m
4    e    n
5    f    o
6    g    p
7    h    q
8    i    r
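
For comparison, on pandas 1.3.0 or later the list-of-columns form of explode mentioned above can be sketched as follows. This assumes a recent pandas install; the ragged frame at the end is only there to illustrate the failure on mismatched lengths, and the variable names are mine.

```python
import pandas as pd

l1 = [('a', 'b'), ('c', 'd'), ('e', 'f', 'g'), ('h', 'i')]
l2 = [('j', 'k'), ('l', 'm'), ('n', 'o', 'p'), ('q', 'r')]
df = pd.DataFrame(list(zip(l1, l2)), columns=['Col1', 'Col2'])

# Explode both columns in one call (needs pandas >= 1.3.0).
# ignore_index=True gives a fresh 0..n-1 index instead of repeated row labels.
out = df.explode(['Col1', 'Col2'], ignore_index=True)
print(out)

# With mismatched tuple lengths in a row, multi-column explode raises
# instead of padding with nulls.
ragged = pd.DataFrame({'Col1': [('c', 'd', 'd2')], 'Col2': [('l', 'm')]})
try:
    ragged.explode(['Col1', 'Col2'])
    explode_failed = False
except ValueError as err:
    explode_failed = True
    print('explode failed:', err)
```

So the multi-column explode is the shortest option when lengths are guaranteed to match, while the tolist/stack approach is the one to reach for when they may differ.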

