DataFrame Pandas - Flatten column of lists to multiple columns

Question

Here's my problem. I have a dataframe with x columns and y rows. Some columns contain lists. I want to expand each of those columns into multiple columns holding single values.

An example speaks for itself:

My dataframe:

            ans_length ans_unigram_numbers  ...  levenshtein_dist  que_entropy
0             [19, 14]             [12, 8]  ...              9.00     3.189898
1                 [19]                [12]  ...              4.00     3.189898
2                  [0]                 [0]  ...            170.00     4.299996
3                  [0]                 [0]  ...            170.00     4.303341
4                  [0]                 [0]  ...            170.00     4.304335
5                  [0]                 [0]  ...            170.00     4.311820
28                [56]                [23]  ...             24.00     4.110291
29                 [0]                 [0]  ...             56.00     4.181720
...                ...                 ...  ...               ...          ...
1976              [24]                [11]  ...             24.00     3.084963
1977              [24]                [11]  ...             24.00     3.084963
1992  [31, 24, 32, 28]    [14, 15, 17, 11]  ...             18.75     3.292770
1993  [31, 24, 32, 28]    [14, 15, 17, 11]  ...             18.75     3.292770

[1998 rows x 9 columns]

What I expect:

    ans_length_0    ans_length_1    ans_length_2    ans_length_3    \
0             19              14            
1             19                
2              0                
3              0                
4              0                
5              0                
28            56                
29             0                
1976          24                
1977          24                
1992          31              24               32             28    
1993          31              24               32             28    

ans_unigram_numbers_0   ans_unigram_numbers_1   ans_unigram_numbers_2   ans_unigram_numbers_3   \
                   12                       8           
                   12               
                   0                
                   0                
                   0                
                   0                
                   23               
                   0                
                   11               
                   11               
                   14                      15                      17                      11   
                   14                      15                      17                      11   

levenshtein_dist    que_entropy
               9       3.189898
               4       3.189898
             170       4.299996
             170       4.303341
             170       4.304335
             170        4.31182
              24       4.110291
              56        4.18172
              24       3.084963
              24       3.084963
            18.75       3.29277
            18.75       3.29277

Each newly generated column should take the name of the original column, with the position in the list appended as an index (for example ans_length_0).

Answer 1

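I think you can use the following. Passing index=df.index keeps the rows aligned when the index is not a plain 0..n-1 range (as in your data), and the '_' in the prefix gives names like ans_length_0:

import pandas as pd

cols = ['ans_length', 'ans_unigram_numbers']

# one frame per list column; shorter lists are padded with NaN
df1 = pd.concat([pd.DataFrame(df[x].values.tolist(), index=df.index).add_prefix(x + '_') for x in cols], axis=1)
df = pd.concat([df1, df.drop(cols, axis=1)], axis=1)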
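To make the idea concrete, here is a minimal, self-contained sketch on made-up data. It assumes, as in the question, that the row index has gaps, so the original index is passed through explicitly; the sample values and the names `parts` and `flat` are illustrative only.

```python
import pandas as pd

# Hypothetical sample data mimicking the question's layout; the index has gaps.
df = pd.DataFrame(
    {
        "ans_length": [[19, 14], [19], [31, 24, 32, 28]],
        "ans_unigram_numbers": [[12, 8], [12], [14, 15, 17, 11]],
        "levenshtein_dist": [9.0, 4.0, 18.75],
    },
    index=[0, 1, 1992],
)

cols = ["ans_length", "ans_unigram_numbers"]

# Expand each list column into its own frame, keeping the original row index
# so the final concat aligns row by row; shorter lists are padded with NaN.
parts = [
    pd.DataFrame(df[c].tolist(), index=df.index).add_prefix(c + "_")
    for c in cols
]
flat = pd.concat(parts + [df.drop(columns=cols)], axis=1)
print(flat)
```

The number of generated columns per source column equals the length of its longest list, so rows with shorter lists simply have missing values in the trailing columns.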

Answer 2

Based on @jezrael's answer, I wrote a function that does what is asked, given a dataframe and a list of columns:

import pandas as pd

def flattencolumns(df1, cols):
    # index=df1.index keeps the expanded rows aligned with the original rows
    df = pd.concat([pd.DataFrame(df1[x].values.tolist(), index=df1.index).add_prefix(x + '_') for x in cols], axis=1)
    return pd.concat([df, df1.drop(cols, axis=1)], axis=1)
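One side effect worth knowing: when the lists have different lengths, the padded cells are NaN, which turns the affected columns into floats. The following is only a sketch of one way around that, assuming the lists hold whole numbers: it casts the expanded columns to pandas' nullable Int64 dtype. The function name `flatten_columns_int` and the sample data are hypothetical, not part of the answers above.

```python
import pandas as pd

def flatten_columns_int(df, cols):
    """Hypothetical variant: expand list columns and keep integers via Int64."""
    parts = []
    for c in cols:
        expanded = pd.DataFrame(df[c].tolist(), index=df.index).add_prefix(c + "_")
        # Padded cells are NaN (floats); nullable Int64 stores them as <NA>.
        parts.append(expanded.astype("Int64"))
    return pd.concat(parts + [df.drop(columns=cols)], axis=1)

# Made-up sample with a non-contiguous index, as in the question.
sample = pd.DataFrame(
    {"ans_length": [[19, 14], [56]], "que_entropy": [3.189898, 4.110291]},
    index=[0, 28],
)
result = flatten_columns_int(sample, ["ans_length"])
print(result)
```

If the lists can contain non-integer values, the cast should be dropped and the NaN padding kept as is.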

2 answers

solution1
4 ACCEPTED 2017-06-29 09:31:27

solution2
0 2017-06-29 09:44:52
