How can I split Pandas arrays into columns?

I'm trying to split array values into columns.

I've created a Google Colab notebook and you can find my code here.

Here is a screenshot of the data (Hashtags):

Here is a representation of the data.

    codes
1   [71020]
2   [77085]
3   [36415]
4   [99213, 99287]
5   [99233, 99233, 99233]

I want to split these arrays into separate columns.

To something like this (screenshot - Hashtags split to columns):

Here is a representation of it.

                   code_1      code_2      code_3   
1                  71020
2                  77085
3                  36415
4                  99213       99287
5                  99233       99233       99233

I tried the following code, which I got from this Stack Overflow post, but it doesn't give the expected results:

df_hashtags_splitted = pd.DataFrame(df['hashtags'].tolist())

What am I doing wrong?

The reason is that the lists are still stored as strings in the hashtags column when you read them with read_csv. You can convert them while reading the data (the following code is taken from the Colab notebook):

import pandas as pd
from ast import literal_eval

url = "https://raw.githubusercontent.com/hashimputhiyakath/datasets/main/hashtags10.csv"

# Notice the added converter to turn strings into lists.
df = pd.read_csv(url, converters={'hashtags': literal_eval})
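A quick way to confirm the diagnosis is to compare the element type with and without the converter. The sketch below is only an illustration: it uses a small in-memory CSV with made-up rows (csv_text, raw and parsed are names chosen here, not taken from the notebook) instead of the real dataset.

```python
import io
from ast import literal_eval

import pandas as pd

# Made-up sample rows standing in for the real CSV (an assumption for illustration).
csv_text = 'hashtags\n"[\'oxygen\', \'covidhelp\']"\n"[\'mumbai\']"\n'

# Without a converter, each cell is read as a plain string.
raw = pd.read_csv(io.StringIO(csv_text))
print(type(raw['hashtags'].iloc[0]))

# With the converter, each cell becomes a real Python list.
parsed = pd.read_csv(io.StringIO(csv_text), converters={'hashtags': literal_eval})
print(type(parsed['hashtags'].iloc[0]))

# A column of strings collapses into a single column,
# while a column of lists gets one column per list position.
print(pd.DataFrame(raw['hashtags'].tolist()).shape)
print(pd.DataFrame(parsed['hashtags'].tolist()).shape)
```

If the first type printed is str, the column has not been parsed yet, which is why the DataFrame constructor produces only one column.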

And then the solution you mentioned will work as expected.

df_hashtags_splitted = pd.DataFrame(df['hashtags'].tolist(), index=df.index).add_prefix('hashtag_')
print(df_hashtags_splitted.head(10))
          hashtag_0     hashtag_1         hashtag_2       hashtag_3           hashtag_4       hashtag_5    hashtag_6         hashtag_7  hashtag_8       hashtag_9 hashtag_10 hashtag_11
0         longcovid     covidhelp              None            None                None            None         None              None       None            None       None       None
1            mumbai         covid      hospitalbeds  covidemergency           mahacovid       oxygenbed  mumbaicovid  covid19indiahelp  covidhelp  covidresources       None       None
2   kawahcoffeeshop   coffeelover             kawah       costarica            puravida         heredia       oxygen              None       None            None       None       None
3           lucknow        mumbai         hyderabad           delhi            verified  covidresources    covidhelp  covid19indiahelp       None            None       None       None
4            oxygen          None              None            None                None            None         None              None       None            None       None       None
5  covid19indiahelp        mahara              None            None                None            None         None              None       None            None       None       None
6            oxygen       amadoda              None            None                None            None         None              None       None            None       None       None
7  plasmadonordelhi  plasmamumbai  covid19indiahelp       covidhelp  covidemergency2021            None         None              None       None            None       None       None
8            oxygen  conservation           wilding       rewilding         environment  sustainability  restorative       agriculture   wildlife    biodiversity      water   wildswim
9             covid      verified            mumbai          oxygen  covidemergency2021         covid19    covidhelp    covidresources       None            None       None       None
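The same approach can be applied to the codes example from the question. This is a minimal sketch, assuming the data sits in a CSV with a single codes column whose cells are list literals; the CSV text, the 1-based index, and the code_N column names are reconstructed here to mirror the question's tables and are not taken from the notebook.

```python
import io
from ast import literal_eval

import pandas as pd

# The 'codes' sample from the question, written as CSV text (the CSV layout is assumed).
csv_text = (
    'codes\n'
    '"[71020]"\n'
    '"[77085]"\n'
    '"[36415]"\n'
    '"[99213, 99287]"\n'
    '"[99233, 99233, 99233]"\n'
)

df_codes = pd.read_csv(io.StringIO(csv_text), converters={'codes': literal_eval})
df_codes.index = range(1, len(df_codes) + 1)  # match the 1-based row labels in the question

# One column per list position; shorter lists are padded with missing values.
codes_split = pd.DataFrame(df_codes['codes'].tolist(), index=df_codes.index)

# Rename the positional columns 0, 1, 2 to code_1, code_2, code_3.
codes_split.columns = [f'code_{i + 1}' for i in range(codes_split.shape[1])]
print(codes_split)
```

Positions that a row does not have show up as missing values rather than blanks, so columns containing gaps will usually be stored as floats.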

Alternatively, to convert the strings to lists after you have read the CSV, you can do:

df['hashtags'] = df['hashtags'].map(literal_eval)
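One caveat with this variant: literal_eval raises a ValueError when it receives something that is not a string, such as a missing value (NaN) or a list that was already parsed, which can happen if the cell is run twice. A more tolerant version might look like the sketch below; parse_list is a hypothetical helper written for this example, and treating missing values as empty lists is an assumption about what you want.

```python
from ast import literal_eval

import pandas as pd


def parse_list(value):
    """Hypothetical helper: parse a list literal, tolerating lists and missing values."""
    if isinstance(value, list):
        return value  # already parsed, leave untouched
    if isinstance(value, str):
        return literal_eval(value)  # the normal case: a string such as "['a', 'b']"
    return []  # NaN or any other non-string: treated here as "no hashtags"


# Made-up column mixing the three cases.
s = pd.Series(["['oxygen', 'covid']", ['mumbai'], float('nan')])
parsed = s.map(parse_list)
print(parsed.tolist())
```

With the real data this would be used as df['hashtags'] = df['hashtags'].map(parse_list).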
