
Error while splitting a text value inside a dataframe column into individual columns in Python

I have a dataframe like the one below, and I want to split its text column into two separate columns.

My code:

import pandas as pd
df = pd.DataFrame({'data': ['cricket:sachin,football:messi,cricket:lara,tennis:nadal,tennis:serina']})
df[["L1", "L2"]] = df["data"].str.split(pat=",", expand=True)  # raises the ValueError shown below

Error:

ValueError: Columns must be same length as key

**Expected Output**

L1        L2
cricket   sachin
football  messi
cricket   lara
tennis    nadal
tennis    serina

How can this be achieved?

Try:

df['data'].str.split(',', expand=True).melt()['value']\
          .str.split(':', expand=True).rename(columns={0:'L1', 1:'L2'})

Output:

         L1      L2
0   cricket  sachin
1  football   messi
2   cricket    lara
3    tennis   nadal
4    tennis  serina

Details:

Split the string on ',' first, with expand=True, to get a dataframe with one column per item. Then melt those columns into rows, split the resulting value column on ':', and rename the column headers.
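The same chain can be broken into separate steps to make each stage visible. This is only a sketch built on the sample data from the question; the intermediate names `wide`, `long`, and `result` are made up for illustration. It also suggests why the original assignment failed: the first split produces five columns, not two.

```python
import pandas as pd

df = pd.DataFrame(
    {"data": ["cricket:sachin,football:messi,cricket:lara,tennis:nadal,tennis:serina"]}
)

# Step 1: split on ',' with expand=True. This gives one column per item
# (5 columns here), so it cannot be assigned to just ["L1", "L2"].
wide = df["data"].str.split(",", expand=True)
print(wide.shape)

# Step 2: melt turns the columns into rows; the items end up in 'value'.
long = wide.melt()["value"]

# Step 3: split each 'sport:player' item on ':' and name the two columns.
result = long.str.split(":", expand=True).rename(columns={0: "L1", 1: "L2"})
print(result)
```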

You can also do:

(df["data"].str.split(',')
           .explode()
           .str.split(':', expand=True)
           .rename(columns={0: 'L1', 1: 'L2'})
).reset_index(drop=True)

Result:

         L1      L2
0   cricket  sachin
1  football   messi
2   cricket    lara
3    tennis   nadal
4    tennis  serina
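For a closer look at the explode variant, here is a step-by-step sketch on a hypothetical two-row frame (the sample values and the names `pairs` and `out` are assumptions, not part of the question). It shows that explode repeats the original index label for every item that came from the same row, which is why the chain ends with reset_index(drop=True).

```python
import pandas as pd

# Hypothetical two-row sample, with a different number of items per row.
df = pd.DataFrame({"data": ["cricket:sachin,football:messi", "tennis:nadal"]})

# Split each string into a list, then explode so every list item gets its own row.
# Items from the same source row keep that row's index label.
pairs = df["data"].str.split(",").explode()
print(list(pairs.index))

# Split each 'sport:player' item on ':' and renumber the rows from 0.
out = (pairs.str.split(":", expand=True)
            .rename(columns={0: "L1", 1: "L2"})
            .reset_index(drop=True))
print(out)
```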
