
Split pandas column into multiple columns based on 'key=value' items

I have a dataframe where one column contains several pieces of information in a 'key=value' format. Almost a hundred different keys can appear in that column, but for simplicity's sake I'll use this example with only 4 (_browser, _status, _city, tag):

id  name   properties
0   A      {_browser=Chrome, _status=TRUE, _city=Paris}
1   B      {_browser=null, _status=TRUE, _city=London, tag=XYZ}
2   C      {_status=FALSE, tag=ABC}

How can I split the properties string column into multiple columns, one per key?

The expected output is:

id  name   _browser    _status    _city    tag
0   A      Chrome      TRUE       Paris       
1   B      null        TRUE       London   XYZ
2   C                  FALSE               ABC

Note: the values can also contain spaces (e.g. _city=Rio de Janeiro).

Let's use str.findall with regex capture groups to extract the key-value pairs from the properties column. Each row yields a list of (key, value) tuples, which dict() turns into one record per row:

df.join(pd.DataFrame(
    [dict(l) for l in df.pop('properties').str.findall(r'(\w+)=([^,\}]+)')]))

Result:

 id name _browser _status   _city  tag
  0    A   Chrome    TRUE   Paris  NaN
  1    B     null    TRUE  London  XYZ
  2    C      NaN   FALSE     NaN  ABC
