
Making a dataframe where a new row is created after every nth column, using only semicolons as delimiters

I have the following string in a column within a row in a pandas dataframe. You could just treat it as a string.

;2;613;12;1;Ajc hw EEE;13;.387639;1;EXP;13;2;128;12;1;NNN XX Ajc;13;.208966;1;SGX;13;..

It goes on like that.

I want to convert it into a table using the semicolon (;) as the delimiter. The problem is that there is no newline delimiter, so I have to assume a new row starts every 10 items.

So, it should look something like this.

;2;613;12;1;Ajc hw EEE;13;.387639;1;EXP;13;
 2;128;12;1;NNN XX Ajc;13;.208966;1;SGX;13;..

How do I convert that string into a new dataframe in pandas? After every 10 semicolon delimiters, a new row should be created.

I have no idea how to do this. Any help in terms of tools or ideas would be greatly appreciated.

This should work:

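import pandas as pd

# drop the leading semicolon so the first field is not an empty string
data = ';2;613;12;1;Ajc hw EEE;13;.387639;1;EXP;13;2;128;12;1;NNN XX Ajc;13;.208966;1;SGX;13;'[1:]
data = data.split(';')
# number of complete 10-field rows (the empty field after the last ';' is ignored)
row_count = len(data) // 10

# slice the flat list into consecutive groups of 10 fields, one group per row
data = [data[x*10:(x+1)*10] for x in range(row_count)]
pd.DataFrame(data)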

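I used floor division (the double slash) so that row_count is an integer, which range() requires; a single slash would produce a float and raise a TypeError. Floor division also discards any incomplete group at the end, such as the empty string left behind by the trailing semicolon.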
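For reuse, the same steps can be wrapped in a small function. This is only a sketch: the name split_into_rows and its n_cols and sep parameters are made up for illustration, and it assumes that every record has exactly n_cols fields, that no field contains the separator, and that the first and last fields are never empty (strip would swallow them).

```python
import pandas as pd


def split_into_rows(text, n_cols=10, sep=";"):
    """Hypothetical helper: split a flat delimited string into rows of n_cols fields."""
    # remove leading/trailing separators so no empty edge fields are produced
    fields = text.strip(sep).split(sep)
    # floor division keeps only complete rows
    row_count = len(fields) // n_cols
    rows = [fields[i * n_cols:(i + 1) * n_cols] for i in range(row_count)]
    return pd.DataFrame(rows)


raw = ";2;613;12;1;Ajc hw EEE;13;.387639;1;EXP;13;2;128;12;1;NNN XX Ajc;13;.208966;1;SGX;13;"
df = split_into_rows(raw)
print(df)
```

If the string lives in a dataframe cell, pass that cell's value as text. All values come back as strings, so numeric columns would still need converting afterwards.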

Here's a screenshot of my output. [image]
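If NumPy is available, reshaping the flat list is another way to get the same table. This is an alternative to the list comprehension above, not part of the original answer; the variable names (fields, usable, table) are illustrative, and it again assumes exactly 10 fields per record.

```python
import numpy as np
import pandas as pd

raw = ";2;613;12;1;Ajc hw EEE;13;.387639;1;EXP;13;2;128;12;1;NNN XX Ajc;13;.208966;1;SGX;13;"
n_cols = 10

fields = raw.strip(";").split(";")
# keep only as many fields as fill complete rows
usable = len(fields) - len(fields) % n_cols
# reshape(-1, n_cols) lets NumPy work out the number of rows
table = pd.DataFrame(np.array(fields[:usable], dtype=object).reshape(-1, n_cols))
print(table)
```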
