
How to make different columns for each element in a list?

I have a pandas DataFrame column that contains lists of strings (of varying lengths), like df['category'] below:

category                                                                           | ...
---------
['Grocery & Gourmet Food', 'Cooking & Baking', 'Lard & Shortening', 'Shortening']  | ...
['Grocery & Gourmet Food', 'Candy & Chocolate', 'Mints']                           | ...
['Grocery & Gourmet Food', 'Soups, Stocks & Broths', 'Broths', 'Chicken']          | ...

Now I want to split this category column into separate columns, one for each string element in the list. Is it possible to do this with pandas? How should I handle the column names?

I have gone through the answers to this question, but the difference is that my lists are not always the same length.

My expected output would be something like this:

category_1             | category_2       |  category_n  | other_columns 
------------------------------------------------------------------
Grocery & Gourmet Food | Cooking & Baking | Lard & Shortening | ...
...                    | ...              | ...               | ...

I would do something like this:

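# Size the columns by the longest list; .max() on the lists themselves compares them lexicographically
n = df['category'].str.len().max()
df2 = pd.DataFrame(df['category'].to_list(), columns=[f"category_{i+1}" for i in range(n)])
df = pd.concat([df.drop('category', axis=1), df2], axis=1)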

Output:

               category_1              category_2         category_3  \
0  Grocery & Gourmet Food        Cooking & Baking  Lard & Shortening   
1  Grocery & Gourmet Food       Candy & Chocolate              Mints   
2  Grocery & Gourmet Food  Soups, Stocks & Broths             Broths   

   category_4  
0  Shortening  
1        None  
2     Chicken 
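The number of generated columns has to match the longest list in the column. Here is a minimal sketch of how that count can be computed, using a small made-up Series (the data and variable names are illustrative, not from the question). It also shows why taking the maximum of the lists themselves is not a reliable substitute: Python orders lists lexicographically, not by length.

```python
import pandas as pd

# Hypothetical data: the shorter list sorts *after* the longer one
s = pd.Series([["b"], ["a", "x", "y"]])

# Length of the longest list: this is the number of columns needed
longest = int(s.str.len().max())

# Lexicographic maximum of the lists: ["b"] wins because "b" > "a"
lexicographic = len(max(s))

print(longest, lexicographic)
```

In the question's sample data the two values happen to coincide, but that is not guaranteed in general.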

Edit:

As @mozway suggested, it is better to create the columns with their default names and then rename them:

df2 = pd.DataFrame(df['category'].to_list())
df2.columns = df2.columns.map(lambda x: f'category_{x+1}')
df = pd.concat([df.drop('category', axis=1), df2], axis=1)
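For reference, here is a self-contained sketch of the same idea on the sample data from the question. The `price` column and the non-default index are assumptions added for illustration. Because `pd.concat(..., axis=1)` aligns rows by index label, the sketch passes `index=df.index` when building the expanded frame; without it, a DataFrame whose index is not 0..n-1 would end up with misaligned rows.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "category": [
            ["Grocery & Gourmet Food", "Cooking & Baking", "Lard & Shortening", "Shortening"],
            ["Grocery & Gourmet Food", "Candy & Chocolate", "Mints"],
            ["Grocery & Gourmet Food", "Soups, Stocks & Broths", "Broths", "Chicken"],
        ],
        "price": [3.5, 1.2, 2.8],  # hypothetical extra column
    },
    index=[10, 20, 30],  # deliberately non-default index
)

# Expand the lists into columns; shorter lists are padded with missing values
expanded = pd.DataFrame(df["category"].to_list(), index=df.index)

# Rename the default integer column labels 0, 1, 2, ... to category_1, category_2, ...
expanded.columns = [f"category_{i + 1}" for i in expanded.columns]

# Replace the original list column with the expanded columns
result = pd.concat([df.drop(columns="category"), expanded], axis=1)
print(result)
```

Rows with fewer categories get a missing value in the trailing columns, as in the output shown earlier.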
