Split pandas column into multiple columns based on 'key=value' items
I have a dataframe where one column contains several pieces of information in 'key=value' format. Almost a hundred different keys can appear in that column, but for simplicity's sake I'll use this example with only 4 (_browser, _status, _city, tag):
id name properties
0 A {_browser=Chrome, _status=TRUE, _city=Paris}
1 B {_browser=null, _status=TRUE, _city=London, tag=XYZ}
2 C {_status=FALSE, tag=ABC}
How can I split the properties string column into multiple columns?
The expected output is:
id  name  _browser  _status  _city   tag
0   A     Chrome    TRUE     Paris
1   B     null      TRUE     London  XYZ
2   C               FALSE            ABC
Note: values can also contain spaces (e.g. _city=Rio de Janeiro).
Let's use str.findall with regex capture groups to extract key-value pairs from the properties column:
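# findall returns a list of (key, value) tuples per row; dict() turns each list into one record.
# Note that df.pop removes the 'properties' column from df in place.
df.join(pd.DataFrame(
    [dict(l) for l in df.pop('properties').str.findall(r'(\w+)=([^,\}]+)')]))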
Result:
id name _browser _status _city tag
0 A Chrome TRUE Paris NaN
1 B null TRUE London XYZ
2 C NaN FALSE NaN ABC
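For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same str.findall approach. The sample frame is rebuilt from the question, with one extra row (id 3, _city=Rio de Janeiro) added here to exercise values containing spaces. The helper name split_properties is hypothetical and not part of pandas. The sketch assumes the column has no missing values and that no value contains a comma or a closing brace.

```python
import pandas as pd

# Sample data mirroring the question; the last row is an added assumption
# used only to check that values with spaces survive the split.
df = pd.DataFrame({
    "id": [0, 1, 2, 3],
    "name": ["A", "B", "C", "D"],
    "properties": [
        "{_browser=Chrome, _status=TRUE, _city=Paris}",
        "{_browser=null, _status=TRUE, _city=London, tag=XYZ}",
        "{_status=FALSE, tag=ABC}",
        "{_city=Rio de Janeiro}",
    ],
})


def split_properties(frame, column="properties"):
    """Hypothetical helper: expand 'key=value' items of `column` into columns."""
    # Each row becomes a list of (key, value) tuples.
    pairs = frame[column].str.findall(r"(\w+)=([^,}]+)")
    # Reuse the original index so the join aligns even if it is not 0..n-1.
    expanded = pd.DataFrame([dict(p) for p in pairs], index=frame.index)
    # drop() returns a new frame, so the caller's dataframe is left untouched.
    return frame.drop(columns=column).join(expanded)


result = split_properties(df)
print(result)
```

Unlike the df.pop version, this leaves the original dataframe unchanged and passes the original index to the new frame, which matters if df has been filtered or re-indexed. Keys missing from a row come out as NaN, while the literal text null stays the string 'null'; something like result.replace('null', pd.NA) could convert it if that is what you need.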