How to split a string in Python by certain characters?
I am trying to solve a problem with prefix notation, but I am stuck on the part where I want to split my string into an array. If I have the input
+22 2
I want the array to look like this: ['+', '22', '2']
I tried using the
import re
function, but I am not sure how it works. I tried the
word.split(' ')
method, but it only handles the spaces. Any ideas?
PS: In the prefix notation I will also have +, - and *. So I need to split the string so that the spaces are not in the array, but +, - and * are. I am thinking of
word = input()
array = word.split(' ')
Then, after that, I am thinking of splitting each string by these 3 characters.
Sample input: '+-12 23*67 1'
Output: ['+', '-', '12', '23', '*', '67', '1']
You can use re to find patterns in text. It seems you are looking for any one of +, - and *, or a group of digits. So compile a pattern that looks for that, find everything that matches it, and you will get a list:
import re
pattern = re.compile(r'([-+*]|\d+)')
string = '+-12 23*67 1'
array = pattern.findall(string)
print(array)
# Output:
# ['+', '-', '12', '23', '*', '67', '1']
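One thing to be aware of: findall simply skips anything the pattern does not match, so a stray letter or an unsupported operator disappears without an error. If that matters for your input, a stricter variant could look like the sketch below. The names TOKEN and tokenize_strict are made up for this illustration, and raising ValueError is just one possible way to report the problem.

```python
import re

TOKEN = re.compile(r'[-+*]|\d+')

# findall silently drops the 'a' and the '/' here.
print(TOKEN.findall('+2a /3'))
# ['+', '2', '3']


def tokenize_strict(text):
    """Hypothetical helper: like findall, but rejects unexpected characters."""
    # Remove every token, then all whitespace; anything left is unexpected.
    leftover = re.sub(r'\s+', '', TOKEN.sub('', text))
    if leftover:
        raise ValueError(f'unexpected characters: {leftover!r}')
    return TOKEN.findall(text)


print(tokenize_strict('+22 2'))
# ['+', '22', '2']
```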
Also a bit of testing (comparing your sample strings with the expected output):
test_cases = {
    '+-12 23*67 1': ['+', '-', '12', '23', '*', '67', '1'],
    '+22 2': ['+', '22', '2']
}
for string, correct in test_cases.items():
    assert pattern.findall(string) == correct
print('Tests completed successfully!')
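For comparison, the two-step idea from the question (split on spaces first, then on the three operator characters) can also be made to work with re.split. This is only a rough sketch of that alternative, and tokenize_by_split is a hypothetical name chosen for this example:

```python
import re


def tokenize_by_split(text):
    """Hypothetical helper: split on whitespace, then on the operators."""
    tokens = []
    for chunk in text.split():
        # The capturing group makes re.split keep the operators it splits on.
        parts = re.split(r'([-+*])', chunk)
        # re.split leaves empty strings next to the operators; drop them.
        tokens.extend(part for part in parts if part)
    return tokens


print(tokenize_by_split('+-12 23*67 1'))
# ['+', '-', '12', '23', '*', '67', '1']
```

It gives the same result on the sample inputs, but findall needs no filtering of empty strings, which is why it is probably the simpler choice here.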
Pattern explanation (you can read about this in the docs linked below):
r'([-+*]|\d+)'
r: in front of the string, makes it a raw string, so Python does not treat backslashes as the start of its own escape sequences. This helps with escape sequences in the regex pattern because you can write them with a single backslash (\d rather than \\d).
(...): parentheses indicate a group, which can be retrieved later if needed. They are not necessary in this case and do not change the result.
[...]: indicates that any single character from this set can be matched, so it will match if any of -, + and * is present. The - is placed first so that it is taken literally and not as a range.
|: logical or, meaning that either side can match (in this case, to differentiate between numbers and the special characters).
\d: special escape sequence for digits, meaning to match any digit. The + after it means one or more of the preceding item, so \d+ matches a run of one or more digits.
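A few of these points can be checked directly in the interpreter. The snippet below is only an illustration; the variable names are made up for it:

```python
import re

text = '+-12 23*67 1'

# With or without the outer parentheses, findall returns the same list here,
# because the single group covers the whole match.
with_group = re.findall(r'([-+*]|\d+)', text)
without_group = re.findall(r'[-+*]|\d+', text)
print(with_group == without_group)
# True

# Two separate groups change the shape of the result: one tuple per match,
# with an empty string for the group that did not take part in it.
two_groups = re.findall(r'([-+*])|(\d+)', '+22 2')
print(two_groups)
# [('+', ''), ('', '22'), ('', '2')]

# In a raw string the backslash stays an ordinary character.
print(r'\d' == '\\d')
# True
```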
Useful: