Separate column values by backslash pandas
I have a dataframe like this:
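import pandas as pd

# Raw strings (r'...') keep the backslash literal; in a plain string, '\b' is a backspace.
data = {'id': [1, 1, 1, 2, 2],
        'value': ['red', r'red\blue', 'yellow', 'oak', r'oak\wood']}
df = pd.DataFrame(data, columns=['id', 'value'])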
What I want is:
id value count
1 red 2
1 blue 1
1 yellow 1
2 oak 2
2 wood 1
If it's other delimiters like ; and / I can do:
df1 = (df.assign(value = df['value'].str.split(';|/'))
.explode('value')
.groupby(['id','value'], sort=False)
.size()
.reset_index(name='count'))
But when the delimiter is a backslash (\\), it doesn't work.
What should I do?
You can replace every non-word character (anything other than a letter, digit, or underscore) in your values with a space and then split on whitespace:
df1 = (df.assign(value = df['value'].replace({r'\W': ' '}, regex=True).str.split())
.explode('value')
.groupby(['id','value'], sort=False)
.size()
.reset_index(name='count'))
NOTE: This will fail if the values contain other symbols that should not act as delimiters, because every non-word character (including spaces and hyphens) will split the value.
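If only specific delimiters should split the value, one alternative is to match the backslash explicitly instead of every non-word character. The following is a minimal sketch, not part of the original answer. It assumes the column really holds literal backslashes (written here with raw strings, since '\b' in a plain string is a backspace) and relies on pandas treating a multi-character pattern passed to str.split as a regular expression.

```python
import pandas as pd

# Assumption: the values contain literal backslashes, so raw strings are used.
data = {'id': [1, 1, 1, 2, 2],
        'value': ['red', r'red\blue', 'yellow', 'oak', r'oak\wood']}
df = pd.DataFrame(data, columns=['id', 'value'])

# The character class [;/\\] matches ';', '/' or one literal backslash.
# The raw string keeps the doubled backslash intact for the regex engine.
df1 = (df.assign(value=df['value'].str.split(r'[;/\\]'))
         .explode('value')                        # one row per split piece
         .groupby(['id', 'value'], sort=False)    # keep first-seen order
         .size()
         .reset_index(name='count'))

print(df1)
```

On the sample data this should give the counts requested in the question, and values containing spaces or hyphens are left intact because only the listed delimiters split.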