
How do you split a string into words and special characters in Python?

I want to split a string into words ([a-zA-Z]) and any special characters it may contain, except for the @ and # symbols, which should stay attached to their word.

message = "I am to be @split, into #words, And any other thing that is not word, mostly special character(.,>)"

Expected result:

['I', 'am', 'to', 'be', '@split', ',', 'into', '#words', ',', 'And', 'any', 'other', 'thing', 'that', 'is', 'not', 'word', ',', 'mostly', 'special', 'character', '(', '.', ',', '>', ')']

How can I achieve this in Python?

How about:

re.findall(r"[A-Za-z@#]+|\S", message)

The pattern matches either a run of word characters (defined here as letters plus @ and #) or any single non-whitespace character.
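As a quick check, here is a minimal sketch that runs the pattern above against the message from the question. The helper name `tokenize` is only for illustration and is not part of the original answer.

```python
import re

message = ("I am to be @split, into #words, And any other thing "
           "that is not word, mostly special character(.,>)")

def tokenize(text):
    # Runs of letters, @ and # become one token; any other
    # non-whitespace character becomes a token of its own.
    return re.findall(r"[A-Za-z@#]+|\S", text)

tokens = tokenize(message)
print(tokens)
```

Note that digits are not in the character class, so each digit comes back as a separate token; add 0-9 to the class if numbers should stay together.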

You can use a negated character class to specify the characters to split on. [^\w@#] matches every character except letters, digits, underscore, @ and #.

Then you can keep the special characters in the result by wrapping that class in capturing parentheses in re.split.

list(filter(None, re.split(r'\s|([^\w@#])', message)))

The filter removes the empty strings that result from splitting between adjacent special characters, along with the None entries that re.split emits wherever the \s alternative matches. The \s| part is there so that whitespace is split on but not captured.
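To see why the filter is needed, the sketch below prints the raw re.split output for a short sample string (chosen here purely for illustration) before and after filtering. It assumes Python 3, where filter returns an iterator, hence the list() call.

```python
import re

sample = "a, b(.)"  # illustrative input, not from the question

raw = re.split(r"\s|([^\w@#])", sample)
# raw holds None where \s matched (group 1 did not participate)
# and '' where two delimiters were adjacent or at the end
print(raw)

tokens = list(filter(None, raw))  # drops both None and ''
print(tokens)
```

Because \w also covers digits and underscore, this variant keeps a token such as abc123 in one piece.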

