How do I split a Pandas array into columns?

Question

I'm trying to split array values into columns.

I've created a Google Colab notebook, and you can find my code here.

Here is a screenshot of the data (Hashtags):

Here is a representation of the data.

    codes
1   [71020]
2   [77085]
3   [36415]
4   [99213, 99287]
5   [99233, 99233, 99233]

I want to split these arrays into different columns.

To something like this (screenshot - Hashtags split to columns):

Here is a representation of it.

                   code_1      code_2      code_3   
1                  71020
2                  77085
3                  36415
4                  99213       99287
5                  99233       99233       99233

I tried the following code, which I got from this Stack Overflow post, but it doesn't give the expected results:

df_hashtags_splitted = pd.DataFrame(df['hashtags'].tolist())

What am I doing wrong?

Answer 1

The reason is that the lists are still stored as strings in the hashtags column when you read them with read_csv. You can convert them while reading the data (the following code is taken from the Colab notebook):

import pandas as pd
from ast import literal_eval

url = "https://raw.githubusercontent.com/hashimputhiyakath/datasets/main/hashtags10.csv"

# Notice the added converter to turn strings into lists.
df = pd.read_csv(url, converters={'hashtags': literal_eval})

And then the solution you mentioned will work as expected.

df_hashtags_splitted = pd.DataFrame(df['hashtags'].tolist(), index=df.index).add_prefix('hashtag_')
print(df_hashtags_splitted.head(10))

          hashtag_0     hashtag_1         hashtag_2       hashtag_3           hashtag_4       hashtag_5    hashtag_6         hashtag_7  hashtag_8       hashtag_9 hashtag_10 hashtag_11
0         longcovid     covidhelp              None            None                None            None         None              None       None            None       None       None
1            mumbai         covid      hospitalbeds  covidemergency           mahacovid       oxygenbed  mumbaicovid  covid19indiahelp  covidhelp  covidresources       None       None
2   kawahcoffeeshop   coffeelover             kawah       costarica            puravida         heredia       oxygen              None       None            None       None       None
3           lucknow        mumbai         hyderabad           delhi            verified  covidresources    covidhelp  covid19indiahelp       None            None       None       None
4            oxygen          None              None            None                None            None         None              None       None            None       None       None
5  covid19indiahelp        mahara              None            None                None            None         None              None       None            None       None       None
6            oxygen       amadoda              None            None                None            None         None              None       None            None       None       None
7  plasmadonordelhi  plasmamumbai  covid19indiahelp       covidhelp  covidemergency2021            None         None              None       None            None       None       None
8            oxygen  conservation           wilding       rewilding         environment  sustainability  restorative       agriculture   wildlife    biodiversity      water   wildswim
9             covid      verified            mumbai          oxygen  covidemergency2021         covid19    covidhelp    covidresources       None            None       None       None
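
The same approach can be sketched on the `codes` example from the question. This is only an illustration under stated assumptions: the in-memory CSV text, the `codes_split` name, and the `code_1`, `code_2`, ... renaming are made up here to mirror the layout the question asks for, and are not taken from the Colab notebook.

```python
import io
from ast import literal_eval

import pandas as pd

# Hypothetical in-memory CSV mirroring the `codes` example from the question.
csv_text = '''codes
"[71020]"
"[77085]"
"[36415]"
"[99213, 99287]"
"[99233, 99233, 99233]"
'''

# Without a converter, each cell is a plain string such as "[71020]".
raw = pd.read_csv(io.StringIO(csv_text))
print(type(raw.loc[0, "codes"]))

# With the converter, each cell becomes a real Python list.
df = pd.read_csv(io.StringIO(csv_text), converters={"codes": literal_eval})

# Expand the lists into columns; shorter lists are padded with missing values.
codes_split = pd.DataFrame(df["codes"].tolist(), index=df.index)

# Rename 0, 1, 2 to code_1, code_2, code_3 to match the layout in the question.
codes_split.columns = [f"code_{i + 1}" for i in range(codes_split.shape[1])]
print(codes_split)
```

Because the shorter rows are padded with missing values, the padded columns will likely come out as floats rather than integers; converting them with `astype("Int64")` is one option if integer codes matter.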

Alternatively, to convert the strings to lists after you have read the CSV, you can do:

df['hashtags'] = df['hashtags'].map(literal_eval)
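
One caveat with mapping `literal_eval` directly: it expects a string, so a missing value in the column would typically raise an error. A minimal sketch of a more defensive variant follows, assuming you want missing cells treated as empty lists. The `parse_list` helper and the three-row sample data are hypothetical and not part of the original dataset.

```python
from ast import literal_eval

import pandas as pd

# Hypothetical data: two stringified lists and one missing value.
df = pd.DataFrame({"hashtags": ["['oxygen']", None, "['covid', 'mumbai']"]})


def parse_list(cell):
    """Parse a stringified list; treat anything that is not a string as empty."""
    return literal_eval(cell) if isinstance(cell, str) else []


df["hashtags"] = df["hashtags"].map(parse_list)

# Same expansion as above; the row that had no hashtags is entirely missing.
split = pd.DataFrame(df["hashtags"].tolist(), index=df.index).add_prefix("hashtag_")
print(split)
```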

Answer 1 · 1 · 2022-05-22 05:36:42
