How to extract first group of integers from a string?

I need to extract the number beside the # from the string ackCount. I am currently using indexing, but the number of digits after the hash may grow to 5, 6, and so on. Can I get only the number immediately after the # (not the 1 at the end of the string)? Below is my temporary code, which gets 4 digits:

import re

ackCount = "Acknowledgement of  #2352 on component \"lOrA-1\""
# Temporary approach: strip every non-digit, then keep only the first 4 digits
OAC = int(re.sub("\\D", "", ackCount)[0:4])
print(OAC)

re.search(r"#(\d+)", ackCount).group(1)

This searches the ackCount string for the first occurrence of an octothorpe ('#') followed by one or more ( + ) digits ( \d ), and returns only the digits through the capture group ( .group(1) ) of the Match object returned by re.search() .

In the context of your question, this would become:

import re

ackCount = "Acknowledgement of  #2352 on component \"lOrA-1\""
try:
    OAC = int(re.search(r"#(\d+)", ackCount).group(1))
    print(OAC)
# error handling if the cast to `int` fails (ValueError), or there is no match:
# re.search() then returns None, and calling .group() on it raises AttributeError
except (ValueError, AttributeError):
    print("No match found.")

Output: 2352
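
The same idea can also be wrapped in a small helper that checks the match explicitly rather than relying on exceptions. This is only a sketch: the function name `first_number_after_hash` is made up for illustration and is not part of the original answer.

```python
import re
from typing import Optional


def first_number_after_hash(text: str) -> Optional[int]:
    """Return the digits of the first '#<digits>' occurrence as an int, or None."""
    match = re.search(r"#(\d+)", text)
    if match is None:
        # No '#' followed by at least one digit was found
        return None
    return int(match.group(1))


print(first_number_after_hash('Acknowledgement of  #2352 on component "lOrA-1"'))
print(first_number_after_hash("Acknowledgement of  #123456 on component"))
print(first_number_after_hash("no hash here"))
```

Because `\d+` matches one or more digits, the helper works unchanged when the number grows to 5 or 6 digits, and the trailing 1 in "lOrA-1" is never picked up because it is not preceded by '#'.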

If the string always has the same layout, you can use ackCount = ackCount.split()

This returns a list in which each element is a word from your original string. By default split() uses whitespace as the delimiter, so the double space before the # does not produce an empty element.

Then get all the digits with ackCount[2][1:] , again assuming the string keeps the same general form and only the digits differ. That is index 2 of your list, and then all characters of that string starting at index 1 (because index 0 of the string is '#').
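
Put together, the split-based approach might look like the sketch below. It assumes the '#' token is always the third word of the string; the variable names `words` and `token` are introduced here only for illustration.

```python
ackCount = 'Acknowledgement of  #2352 on component "lOrA-1"'

# split() with no arguments splits on any run of whitespace
words = ackCount.split()
# words == ['Acknowledgement', 'of', '#2352', 'on', 'component', '"lOrA-1"']

token = words[2]      # '#2352'
OAC = int(token[1:])  # drop the leading '#' and convert the rest
print(OAC)
```

This handles any number of digits, but it breaks as soon as the word order changes, which is where the regex approach is more robust.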

You can use a regular expression for this; just make sure the pattern is right. The following returns a list of all matches:

import re
string = "he hallo #9090 8080 fdsf sfd222 f222"
find = re.findall("(?<=#)[0-9]+\\b", string)
print(find)

Output: ['9090']

With string = "he hallo #9090 8080 fdsf sfd222 f222 #888" the same call returns ['9090', '888'], and so on.
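
Since the question asks only for the first number after a '#', one way to adapt this is to take the first element of the list and guard against an empty result. A minimal sketch, where the names `matches` and `first` are illustrative:

```python
import re

string = "he hallo #9090 8080 fdsf sfd222 f222 #888"

# (?<=#) is a lookbehind: the digits must be directly preceded by '#'
matches = re.findall(r"(?<=#)[0-9]+\b", string)

# findall() returns an empty list when nothing matches
first = int(matches[0]) if matches else None

print(matches)
print(first)
```

Note that the trailing `\b` requires a word boundary after the digits, so a token such as `#12ab` is skipped entirely rather than yielding `12`.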
