Python: How to split a column into multiple columns in a DataFrame with dynamic column naming

I have a sample dataset:

id           value
[10,10]     ["apple","orange"]  
[15,67]      ["banana","orange"] 
[12,34,45]   ["apple","banana","orange"] 

I want to convert this into:

id1 id2 id3            value1 value2 value3
10  10  nan           apple  orange   nan
15  67  nan           banana orange   nan
12  34  45            apple  banana  orange

  • I solved this problem earlier using if/else conditions.
  • But the data can be dynamic, so a row may contain more than 3 values.
  • How can I split each column into multiple columns, named as shown above?

We can reconstruct your data with tolist and pd.DataFrame, then concat everything back together:

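import pandas as pd
# Expand each list column into its own frame, prefix the new columns with the original name
d = [pd.DataFrame(df[col].tolist()).add_prefix(col) for col in df.columns]
df = pd.concat(d, axis=1)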

   id0  id1   id2  value0  value1  value2
0   10   10   NaN   apple  orange    None
1   15   67   NaN  banana  orange    None
2   12   34  45.0   apple  banana  orange
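
The output above is numbered from zero (id0, id1, ...), while the question asks for id1, id2, ... . A minimal sketch of one way to get one-based names, assuming the sample data from the question; the names parts, part and result are only illustrative:

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "id": [[10, 10], [15, 67], [12, 34, 45]],
    "value": [["apple", "orange"], ["banana", "orange"],
              ["apple", "banana", "orange"]],
})

parts = []
for col in df.columns:
    # Shorter lists are padded with missing values automatically.
    part = pd.DataFrame(df[col].tolist(), index=df.index)
    # Default labels are 0, 1, 2, ...; shift them to col1, col2, col3, ...
    part.columns = [f"{col}{i + 1}" for i in part.columns]
    parts.append(part)

result = pd.concat(parts, axis=1)
print(result)
```

Because the new labels are derived from however many columns pandas creates, this should keep working when a row holds more than three values.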

Try this code.

import pandas as pd

df = pd.DataFrame({"id": [[10, 10], [15, 67], [12, 34, 45]],
                   "value": [['a', 'o'], ['b', 'o'], ['a', 'b', 'o']]})

output = pd.DataFrame()
for col in df.columns:
    # Expand the lists in this column and name the new columns col1, col2, ...
    output = pd.concat([output,
                        pd.DataFrame(df[col].tolist(), columns = [col + str(i+1) for i in range(df[col].apply(len).max())])],
                       axis = 1)

The key expression is pd.DataFrame(df[col].tolist(), columns = [col + str(i+1) for i in range(df[col].apply(len).max())]).

Here, df[col].apply(len).max() is the length of the longest list in the column, which determines how many new column names are generated. df[col].tolist() converts df[col] into a nested list, which pd.DataFrame then turns back into a DataFrame with one column per list position.
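
To see what each piece evaluates to, here is a small step-by-step sketch on a single column. The data is made up for illustration (one row has four elements, to show that the column count follows the data), and the variable names are not from the answer:

```python
import pandas as pd

# Illustrative data: the last row is longer than the others.
s = pd.Series([[10, 10], [15, 67], [12, 34, 45, 56]], name="id")

# Length of the longest list in the column.
max_len = s.apply(len).max()

# One name per list position: id1, id2, ...
names = [s.name + str(i + 1) for i in range(max_len)]

# Plain nested Python list; rows may have different lengths.
nested = s.tolist()

# pandas pads the shorter rows with missing values.
expanded = pd.DataFrame(nested, columns=names)
print(expanded)
```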
