How to extract first group of integers from a string?
I need to extract the number beside # from the string ackCount. I am using indexing, but the number of digits beside the hash may grow to 5, 6, and so on.
Can I get only the number immediately after the # (not the 1 at the end of the string)? Below is my temporary code, which only works for 4 digits:
import re
ackCount = "Acknowledgement of #2352 on component \"lOrA-1\""
OAC = int(re.sub("\\D", "", ackCount)[0:4])
print(OAC)
re.search(r"#(\d+)", ackCount).group(1)
This searches the ackCount string for the first occurrence of an octothorpe ('#') followed by one or more (+) digits (\d). Only the digits are captured, and .group(1) retrieves that capture group from the Match object returned by re.search().
In the context of your question, this would become:
import re

ackCount = "Acknowledgement of #2352 on component \"lOrA-1\""
try:
    OAC = int(re.search(r"#(\d+)", ackCount).group(1))
    print(OAC)
# re.search() returns None when nothing matches, so .group(1) raises AttributeError;
# ValueError is kept as a guard on the int() conversion
except (ValueError, AttributeError):
    print("No match found.")
Output:
2352
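If you would rather not rely on exceptions, the same search can be wrapped in a small helper that checks the match before using it. This is only a sketch: the name first_hash_number is made up for illustration and is not part of the original answer.

```python
import re

def first_hash_number(text):
    """Return the first run of digits that follows '#', or None if there is none."""
    match = re.search(r"#(\d+)", text)
    if match is None:
        return None
    return int(match.group(1))

ack_count = 'Acknowledgement of #2352 on component "lOrA-1"'
print(first_hash_number(ack_count))          # 2352
print(first_hash_number("nothing to find"))  # None
```

Because the pattern requires the '#', the trailing 1 in "lOrA-1" is never picked up, however many digits follow the hash.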
If the string always has the same layout, you can use
ackCount = ackCount.split()
This returns a list in which each element is a word from your original string. By default, split() uses whitespace as the delimiter.
Then get all the digits with
ackCount[2][1:]
again assuming the general string stays the same and only the digits differ. That is index 2 of your list ('#2352'), and then all characters of that string starting at index 1 (because index 0 of the string is '#').
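Put together, the split approach might look like the sketch below. The second part is a slightly more tolerant variant that looks for the first word starting with '#' instead of trusting a fixed position; the helper name number_from_hash_word is hypothetical and only there for illustration.

```python
ack_count = 'Acknowledgement of #2352 on component "lOrA-1"'

# Fixed-position version: only valid while '#2352' stays the third word
words = ack_count.split()
print(int(words[2][1:]))  # 2352

def number_from_hash_word(text):
    """Return the number in the first word of the form '#<digits>', or None."""
    for word in text.split():
        # word[1:] is everything after the leading '#'
        if word.startswith("#") and word[1:].isdecimal():
            return int(word[1:])
    return None

print(number_from_hash_word(ack_count))  # 2352
```

Either way, this only works when the number is its own whitespace-separated word; something like "#2352," with trailing punctuation would need the regex approach instead.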
You can use a regex for this purpose. Make sure you write the right pattern. The following returns a list of all matches:
import re
string = "he hallo #9090 8080 fdsf sfd222 f222"
find = re.findall("(?<=#)[0-9]+\\b", string)
print(find)
Output:
['9090']
The string
string = "he hallo #9090 8080 fdsf sfd222 f222 #888"
will return ['9090', '888'], and so on.
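Since the question asks only for the first number, one way to adapt this is to take the first element of the list when it is not empty. The sketch below also adds a "#12ab" token (an assumption for illustration, not from the original answer) to show what the trailing \b does: digits that run straight into letters are not matched.

```python
import re

text = "he hallo #9090 8080 fdsf sfd222 f222 #888 #12ab"

# (?<=#) requires a '#' immediately before the digits without capturing it;
# \b requires the digit run to end at a word boundary, which rules out '#12ab'
matches = re.findall(r"(?<=#)[0-9]+\b", text)
print(matches)  # ['9090', '888']

# Keep only the first match, or None when nothing matched
first = int(matches[0]) if matches else None
print(first)  # 9090
```

If only the first match is ever needed, re.search() stops at the first hit, whereas re.findall() scans the whole string.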