I have a pandas DataFrame similar to table A and I would like to get table B. What would be the easiest way to do this using pandas?
Thanks
Table A (ColofInt holds a varying number of semicolon-delimited strings to parse out):
ColA ColB ColofInt ColD
A B StrA;StrB;StrC; 1
A B StrD;StrB;StrC;StrD; 3
A B StrC;StrB; 2
A B StrB; 5
Table B:
ColA ColB ColofInt1 ColofInt2 ColofInt3 ColofInt4 ColD
A B StrA StrB StrC 1
A B StrD StrB StrC StrD 3
A B StrC StrB 2
A B StrB 5
Assuming a file 'tableA.csv' containing the following:
ColA,ColB,ColofInt,ColD
A,B,StrA;StrB;StrC;,1
A,B,StrD;StrB;StrC;StrD;,3
A,B,StrC;StrB;,2
A,B,StrB;,5
Then:
import pandas as pd
tableA = pd.read_csv('tableA.csv')
This generates a DataFrame with your new columns:
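# Split on ';' and drop the empty string left by the trailing ';'
data_aux = pd.DataFrame(list(tableA.ColofInt.str.split(';').apply(lambda x: x[:-1])))
# Name the new columns ColofInt1, ColofInt2, ...
cols = []
for e in data_aux.columns:
    cols.append('ColofInt' + str(e + 1))
data_aux.columns = cols
Here's 'data_aux':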
ColofInt1 ColofInt2 ColofInt3 ColofInt4
0 StrA StrB StrC None
1 StrD StrB StrC StrD
2 StrC StrB None None
3 StrB None None None
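As a possible alternative to building 'data_aux' from a list of lists, the split can be expanded into columns directly. The following is only a sketch: it reads the same sample data from an in-memory string instead of 'tableA.csv', and the name `parts` is made up for illustration. It assumes every value ends in a single trailing ';', which is stripped before splitting.

```python
import io
import pandas as pd

# Same sample data as 'tableA.csv', inlined so the sketch is self-contained
csv_text = """ColA,ColB,ColofInt,ColD
A,B,StrA;StrB;StrC;,1
A,B,StrD;StrB;StrC;StrD;,3
A,B,StrC;StrB;,2
A,B,StrB;,5
"""
tableA = pd.read_csv(io.StringIO(csv_text))

# Strip the trailing ';' first, then split into one column per piece;
# shorter rows are padded with missing values
parts = tableA["ColofInt"].str.rstrip(";").str.split(";", expand=True)

# Rename 0, 1, 2, ... to ColofInt1, ColofInt2, ColofInt3, ...
parts.columns = [f"ColofInt{i + 1}" for i in range(parts.shape[1])]
print(parts)
```

This should yield the same four columns as 'data_aux', without the explicit loop over column names.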
And this joins the dataframes, dropping the original column.
tableB = pd.concat([tableA,data_aux],axis=1).drop('ColofInt',axis=1)
Here's the resulting 'tableB':
ColA ColB ColD ColofInt1 ColofInt2 ColofInt3 ColofInt4
0 A B 1 StrA StrB StrC None
1 A B 3 StrD StrB StrC StrD
2 A B 2 StrC StrB None None
3 A B 5 StrB None None None
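Note that the result above puts 'ColD' before the new columns and shows None for missing pieces, whereas table B in the question has 'ColD' last and blank cells. If that layout matters, something along these lines might work. It is a sketch under the same assumptions as before (sample data inlined, a single trailing ';' per value); `parts` and `pos` are illustrative names.

```python
import io
import pandas as pd

# Same sample data as 'tableA.csv', inlined so the sketch is self-contained
csv_text = """ColA,ColB,ColofInt,ColD
A,B,StrA;StrB;StrC;,1
A,B,StrD;StrB;StrC;StrD;,3
A,B,StrC;StrB;,2
A,B,StrB;,5
"""
tableA = pd.read_csv(io.StringIO(csv_text))

# One column per ';'-separated piece, with blanks instead of missing values
parts = tableA["ColofInt"].str.rstrip(";").str.split(";", expand=True).fillna("")
parts.columns = [f"ColofInt{i + 1}" for i in range(parts.shape[1])]

# Put the new columns where 'ColofInt' used to be, so 'ColD' stays last
pos = tableA.columns.get_loc("ColofInt")
tableB = pd.concat(
    [tableA.iloc[:, :pos], parts, tableA.iloc[:, pos + 1:]],
    axis=1,
)
print(tableB)
```

Slicing around the original column position keeps the column order of table A intact regardless of how many columns come before or after 'ColofInt'.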