
How to split one string column with regular format to multiple columns in Pandas

If I have a dataframe like this:

id  s             scene
1   a   kitchen: 0.297, living: 0.515, degree: A
2   b   kitchen: 0.401, study: 0.005, degree: A
3   c   study: 0.913, degree: B
4   d   living: 0.515, degree: B
5   e   others: 0.1, degree: C

How can I get a new dataframe like the following with Pandas?

id   s   kitchen living  study   others    degree
1    a    0.297   0.515   0       0           A
2    b    0.401   0       0.005   0           A
3    c    0       0       0.913   0           B
4    d    0       0.515   0       0           B
5    e    0       0       0       0.1         C

So far, I have tried df[['id', 's', 'kitchen', 'living', 'study', 'others', 'degree']] = df['scene'].str.split(',', expand=True), but that does not give the result I want.

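You can parse each scene string into a dict of key/value pairs, build a dataframe from those dicts, and fill the missing entries with 0: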

In [763]: dff = pd.DataFrame(
              dict(y.split(': ') for y in x.split(', ')) for x in df.scene).fillna(0)

In [764]: dff
Out[764]:
  degree kitchen living others  study
0      A   0.297  0.515      0      0
1      A   0.401      0      0  0.005
2      B       0      0      0  0.913
3      B       0  0.515      0      0
4      C       0      0    0.1      0
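Note that the values parsed this way are still strings, so the score columns come out as object dtype. Below is a minimal, self-contained sketch of the same idea that also converts the scores to floats. The sample frame is rebuilt from the question, and the names `records` and `score_cols` are only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "s": list("abcde"),
    "scene": [
        "kitchen: 0.297, living: 0.515, degree: A",
        "kitchen: 0.401, study: 0.005, degree: A",
        "study: 0.913, degree: B",
        "living: 0.515, degree: B",
        "others: 0.1, degree: C",
    ],
})

# Split each row on ", " into pairs, then each pair on ": " into key and value.
records = [dict(pair.split(": ") for pair in row.split(", ")) for row in df["scene"]]
dff = pd.DataFrame(records, index=df.index)

# Every column except the text column "degree" holds scores:
# convert them to floats and replace the missing entries with 0.
score_cols = [c for c in dff.columns if c != "degree"]
for col in score_cols:
    dff[col] = pd.to_numeric(dff[col]).fillna(0)

print(dff.dtypes)
```

This assumes every pair is separated by exactly ", " and every key from its value by exactly ": ", as in the sample data; irregular spacing would need a more tolerant split.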

Then join the result back onto the original dataframe:

In [766]: df.join(dff)
Out[766]:
   id  s                                     scene degree kitchen living  \
0   1  a  kitchen: 0.297, living: 0.515, degree: A      A   0.297  0.515
1   2  b   kitchen: 0.401, study: 0.005, degree: A      A   0.401      0
2   3  c                   study: 0.913, degree: B      B       0      0
3   4  d                  living: 0.515, degree: B      B       0  0.515
4   5  e                    others: 0.1, degree: C      C       0      0

  others  study
0      0      0
1      0  0.005
2      0  0.913
3      0      0
4    0.1      0
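The joined frame still carries the original scene column, and its column order differs from the layout asked for in the question. One possible way to get that exact layout is sketched below. `expand_scene` is a hypothetical helper name, not a pandas function, and the final column order is simply copied from the question.

```python
import pandas as pd

def expand_scene(df, column="scene", text_cols=("degree",)):
    """Hypothetical helper: expand 'key: value, ...' strings into columns."""
    parsed = pd.DataFrame(
        [dict(pair.split(": ") for pair in row.split(", ")) for row in df[column]],
        index=df.index,
    )
    # Score columns become floats, with 0 where a key was absent in a row.
    for col in [c for c in parsed.columns if c not in text_cols]:
        parsed[col] = pd.to_numeric(parsed[col]).fillna(0)
    # Drop the raw string column and attach the parsed columns by index.
    return df.drop(columns=column).join(parsed)

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "s": list("abcde"),
    "scene": [
        "kitchen: 0.297, living: 0.515, degree: A",
        "kitchen: 0.401, study: 0.005, degree: A",
        "study: 0.913, degree: B",
        "living: 0.515, degree: B",
        "others: 0.1, degree: C",
    ],
})

result = expand_scene(df)
# Reorder the columns to match the layout requested in the question.
result = result[["id", "s", "kitchen", "living", "study", "others", "degree"]]
print(result)
```

Selecting the columns explicitly at the end keeps the output stable regardless of the order in which keys first appear in the scene strings.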
