How can I split a list of comma-separated words in a Pandas column?
I was querying Stack Overflow to get some data ( https://data.stackexchange.com/stackoverflow/query/new ), and I have a data frame with Tags as a column. The tags were originally of the form
<html><css>
I managed to get them into the form
html,css
I think an image of my Jupyter notebook can display it best:
How can I separate the tags so that they become categorical variables that I can transform using something like get_dummies? Everything I've seen refers to actual lists, like [html,css], rather than just comma-separated words.
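For this, we can use df['Tags'].str.get_dummies(','), which essentially splits each string on the separator and turns each distinct element into its own one-hot encoded column.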
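A minimal sketch of the str.get_dummies approach, for illustration only. The sample rows are made up; only the column name Tags and the comma separator come from the question.

```python
import pandas as pd

# Hypothetical sample data; the column name "Tags" follows the question.
df = pd.DataFrame({"Tags": ["html,css", "python,pandas", "html,python"]})

# Split each string on "," and build one 0/1 indicator column per distinct tag.
dummies = df["Tags"].str.get_dummies(sep=",")

# Attach the indicator columns to the original frame.
result = df.join(dummies)
print(result)
```

Each row gets a 1 in the column of every tag it contains and a 0 elsewhere, so no intermediate list column is needed.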