
Split a list into pandas DataFrame columns and rows

I wrote a little crawler for a website and obtained a list in the following structure:

'DRAFT ACT: OPEN\nSome Information \nTopic\nJustice\nType\nImplementing\nPeriod\n12.11.2020 - 10.12.2020', 'DRAFT ACT: OPEN\nSome other Information\nTopic\nJustice\nType\nImplementing\nPeriod\n12.11.2020 - 10.12.2020', ...

Now I would like to separate this list into a pandas DataFrame, splitting columns on \n and rows on the commas between the list items. Unfortunately, I don't know how to approach this. Could someone please help me? Is there an easy way to split this list using pandas or another package?

The result should thus look like this:

     Column1          Column2                 Column3  Column4  Column5  Column6       Column7  Column8
Row1 DRAFT ACT: OPEN  Some Information        Topic    Justice  Type     Implementing  Period   12.11.2020 - 10.12.2020
Row2 DRAFT ACT: OPEN  Some other Information  Topic    Justice  Type     Implementing  Period   12.11.2020 - 10.12.2020

Thank you very much in advance!

Let's say you have a list of strings like this:

list1=['DRAFT ACT: OPEN\nSome Information \nTopic\nJustice\nType\nImplementing\nPeriod\n12.11.2020 - 10.12.2020', 'DRAFT ACT: OPEN\nSome other Information\nTopic\nJustice\nType\nImplementing\nPeriod\n12.11.2020 - 10.12.2020']

You can iterate over the list and split each item on \n, like this:

list1=[x.split('\n') for x in list1]

or like this:

for idx,item in enumerate(list1):
    list1[idx]=item.split('\n')
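As a minimal sketch of what the split produces, the snippet below reuses the sample list1 from above. The extra strip() call is an addition that is not part of the original answer; it assumes you want to drop stray whitespace such as the trailing space in 'Some Information '. The name rows is chosen here for illustration.

```python
# Sample data copied from the answer above
list1 = [
    'DRAFT ACT: OPEN\nSome Information \nTopic\nJustice\nType\nImplementing\nPeriod\n12.11.2020 - 10.12.2020',
    'DRAFT ACT: OPEN\nSome other Information\nTopic\nJustice\nType\nImplementing\nPeriod\n12.11.2020 - 10.12.2020',
]

# Split each record on newlines; strip() is an optional extra step (assumption)
# that removes leading/trailing whitespace from every field
rows = [[field.strip() for field in item.split('\n')] for item in list1]

print(len(rows))     # number of records
print(len(rows[0]))  # number of fields in the first record
print(rows[0])       # the fields of the first record as a list
```

Each inner list becomes one row later on, so every record should yield the same number of fields.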

Now you can create a DataFrame from list1:

import pandas as pd
df=pd.DataFrame(list1,columns=['Column1','Column2','Column3','Column4','Column5','Column6','Column7','Column8'])
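Since the question asks whether pandas alone can do the splitting, here is a hedged alternative sketch that uses Series.str.split with expand=True instead of a list comprehension. It starts from the original, unsplit list of strings. The generated column and row labels (Column1..., Row1...) simply mirror the desired output in the question, and the whitespace stripping is an assumption about what you want.

```python
import pandas as pd

# Original, unsplit sample data from the answer above
list1 = [
    'DRAFT ACT: OPEN\nSome Information \nTopic\nJustice\nType\nImplementing\nPeriod\n12.11.2020 - 10.12.2020',
    'DRAFT ACT: OPEN\nSome other Information\nTopic\nJustice\nType\nImplementing\nPeriod\n12.11.2020 - 10.12.2020',
]

# expand=True spreads the split parts across columns instead of returning lists
df = pd.Series(list1).str.split('\n', expand=True)

# Optional (assumption): strip stray whitespace from every cell
df = df.apply(lambda col: col.str.strip())

# Generate labels from the actual shape rather than hard-coding eight names
df.columns = [f'Column{i + 1}' for i in range(df.shape[1])]
df.index = [f'Row{i + 1}' for i in range(len(df))]

print(df)
```

Deriving the column names from df.shape avoids a length-mismatch error if a record ever has more or fewer than eight fields; shorter records are padded with missing values.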

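Alternatively, if the data is one long string, you can swap the delimiters and read the result back as CSV:

import pandas as pd

# Raw text: fields are separated by newlines, records by commas
x = "'DRAFT ACT: OPEN\nSome Information \nTopic\nJustice\nType\nImplementing\nPeriod\n12.11.2020 - 10.12.2020', 'DRAFT ACT: OPEN\nSome other Information\nTopic\nJustice\nType\nImplementing\nPeriod\n12.11.2020 - 10.12.2020'"

# Swap the delimiters: newlines become commas (columns), commas become newlines (rows).
# "_" is a temporary placeholder, so this assumes the text itself contains no underscores or commas.
x = x.replace("\n", "_")
x = x.replace(",", "\n")
x = x.replace("_", ",")

with open("output.csv", 'w') as file:
    file.write(x)

# header=None keeps the first record as data instead of using it as column names.
# The single quotes and the space after each comma in the raw text are left as they are.
with open('output.csv', 'r') as file:
    z = pd.read_csv(file, header=None)

print(z, type(z))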
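If the input really is one long string like x above, a sketch that avoids the temporary file and the placeholder character is to pull out each quoted record with a regular expression and then split it on newlines. This is an alternative, not the approach of the answer above, and it assumes the text never contains an apostrophe inside a record. The names records and rows are chosen here for illustration.

```python
import re

# Same raw string as in the answer above
x = "'DRAFT ACT: OPEN\nSome Information \nTopic\nJustice\nType\nImplementing\nPeriod\n12.11.2020 - 10.12.2020', 'DRAFT ACT: OPEN\nSome other Information\nTopic\nJustice\nType\nImplementing\nPeriod\n12.11.2020 - 10.12.2020'"

# Capture everything between each pair of single quotes;
# [^'] also matches newlines, so multi-line records are captured whole
records = re.findall(r"'([^']*)'", x)

# Split every record into fields and strip surrounding whitespace
rows = [[field.strip() for field in record.split('\n')] for record in records]

for row in rows:
    print(row)
```

The resulting rows can be passed to pd.DataFrame(rows) in the same way as the split list1 in the first answer.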


 