
Unpack lists with different numbers of tuples into a DataFrame

I have a Series that groups cases by customer ID. The index holds (customer ID, case count) pairs, and each value is a list of (case ID, status) tuples.

Index
(Cust-1,2) [(Case-11,Open),(Case-12,Closed)]
(Cust-2,3) [(Case-21,Open),(Case-22,Closed),(Case-23,Open)]

The expected output looks like this:

Cust ID  Count  Case ID  Case Status  Case ID  Case Status  Case ID  Case Status
Cust-1   2      Case-11  Open         Case-12  Closed
Cust-2   3      Case-21  Open         Case-22  Closed       Case-23  Open

Try this:

import pandas as pd

# Example DataFrame: both columns hold the tuples as plain strings
df = pd.DataFrame({
    'col1': ['(Cust-1,2)', '(Cust-2,3)'],
    'col2': ['[(Case-11,Open),(Case-12,Closed)]',
             '[(Case-21,Open),(Case-22,Closed),(Case-23,Open)]']})

# Split on commas (n is passed by keyword; recent pandas versions reject it
# positionally), then strip the leftover brackets and parentheses
a = df['col1'].str.split(",", n=2, expand=True).replace(
    to_replace=r"[,()\[\]]", value="", regex=True)
b = df['col2'].str.split(",", n=5, expand=True).replace(
    to_replace=r"[,()\[\]]", value="", regex=True)

# Column names are hard-coded for up to three cases per customer
cols = ['Cust ID', 'Count', 'Case ID', 'Case Status',
        'Case ID', 'Case Status', 'Case ID', 'Case Status']
new_df = pd.concat([a, b], axis=1)
new_df.columns = cols

Result:

  Cust ID Count  Case ID Case Status  Case ID Case Status  Case ID Case Status
0  Cust-1     2  Case-11        Open  Case-12      Closed     None        None
1  Cust-2     3  Case-21        Open  Case-22      Closed  Case-23        Open
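
The approach above treats the values as strings and hard-codes room for three cases per customer. If the Series actually holds Python tuples (an assumption, since the question does not say), a sketch along these lines might avoid the string parsing and size the columns from the data. The names `s`, `rows`, and `wide` are made up for illustration:

```python
import pandas as pd

# Assumed input: real tuples in both the index and the values.
s = pd.Series(
    [[("Case-11", "Open"), ("Case-12", "Closed")],
     [("Case-21", "Open"), ("Case-22", "Closed"), ("Case-23", "Open")]],
    index=[("Cust-1", 2), ("Cust-2", 3)],
)

rows = []
for (cust_id, count), cases in s.items():
    # Flatten [(id, status), ...] into [id, status, id, status, ...]
    flat = [value for case in cases for value in case]
    rows.append([cust_id, count, *flat])

# Building a DataFrame from ragged rows pads the shorter ones with missing values
wide = pd.DataFrame(rows)

# Derive the repeated column names from the widest row
max_cases = (wide.shape[1] - 2) // 2
wide.columns = ["Cust ID", "Count"] + ["Case ID", "Case Status"] * max_cases

print(wide)
```

Because the column names repeat, positional access such as `wide.iloc[:, 2]` is safer than label access here; numbering the columns (for example `Case ID 1`, `Case ID 2`) would be an alternative if labels need to be unique.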
