
Python - Find an exact string in a file

I have to find and extract from a text a series of lines of this type:

  • featureSetCombination: 1
  • featureSetCombination: 2
  • ...
  • featureSetCombination: 10
  • featureSetCombination: 11
  • ...
  • featureSetCombination: 100

Since only the final number changes in the phrases to search for, my idea is to build the phrase progressively, incrementing the final value like this:

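num = 1
with open('temp.txt') as file:
    for line in file:
        # build the phrase for the current number (named target so the built-in str is not shadowed)
        target = 'featureSetCombination    : ' + str(num)
        if ...:  # placeholder: the exact-match test I am missing
            pass  # placeholder for the action on a matching line
            num += 1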

The problem is that I have to search for exactly the string with the number. For example, a search for "featureSetCombination: 1" would also return lines containing "featureSetCombination: 10" or "featureSetCombination: 11", which is not what I need. I also thought about adding a space after the number in my string, but that is not feasible. The only way is to search for my string exactly. Can you help me? Many thanks :)

You could use a regular expression for this: read the strings and define a rule for what may follow the number. In this particular case the number is followed either by a separator or by the end of the string, so the following code might solve your problem:

import re

# Sample string representing the text to search
string = "featureSetCombination: 1 \n featureSetCombination: 10"

# The lookahead requires whitespace, punctuation, or the end of the string after the digit
re.findall(r"featureSetCombination: [1-9](?=\s|[.,;]|$)", string)
# ['featureSetCombination: 1']

As you can see, it finds the first occurrence but not the second.
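To apply the same idea to any number from the question, here is a minimal sketch that builds the pattern per number and rejects a match when another digit follows. The helper name `matching_lines` and the sample lines are made up for illustration, and the sketch assumes the phrase is written with a single space after the colon.

```python
import re

def matching_lines(lines, num):
    """Return the lines that contain the phrase with exactly this number."""
    # (?!\d) is a negative lookahead: the match is rejected if a digit follows,
    # so a search for 1 does not accept 10 or 11.
    pattern = re.compile(rf"featureSetCombination: {num}(?!\d)")
    return [line for line in lines if pattern.search(line)]

# Hypothetical sample data standing in for the lines of the file
sample_lines = [
    "featureSetCombination: 1",
    "featureSetCombination: 10",
    "featureSetCombination: 11",
    "featureSetCombination: 1, done",
]

hits_for_1 = matching_lines(sample_lines, 1)
hits_for_10 = matching_lines(sample_lines, 10)
print(hits_for_1)
print(hits_for_10)
```

With a real file, passing the open file object as `lines` should work the same way, since iterating over it yields one line at a time.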

Have you looked into the string method "find"? Here is a tutorial from W3Schools with useful examples of its syntax: https://www.w3schools.com/python/ref_string_find.asp

If the sequence is as you listed in your question, the Python "find" method returns the position of the first occurrence of the search string, or -1 if there is none. Because it matches substrings, "featureSetCombination: 1" is also found inside "featureSetCombination: 10". If each entry ends with a known terminator such as a dot, you can include that terminator in the search string to get an exact match. I hope this helps!
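A small sketch of that idea follows. It assumes every entry in the file really ends with a dot, which the question does not state, and the helper name `find_exact` is made up for illustration.

```python
PREFIX = "featureSetCombination: "

def find_exact(line, num, terminator="."):
    """Return the index of the phrase plus terminator in line, or -1 if absent."""
    # Including the terminator in the search string prevents "1" from matching "10."
    return line.find(f"{PREFIX}{num}{terminator}")

# Hypothetical entries, each ending with a dot (an assumption)
lines = ["featureSetCombination: 1.", "featureSetCombination: 10."]

# Without the terminator, the phrase for 1 is found in both lines
plain = [line.find(PREFIX + "1") for line in lines]
# With the terminator, only the first line matches
exact = [find_exact(line, 1) for line in lines]
print(plain, exact)
```

If the entries have no fixed terminator, this approach does not apply directly and a regular expression is probably the simpler route.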

Alternatively, I would look into regular expressions for more flexible solutions.
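One such regex-based variation, sketched under assumptions: instead of building a phrase for each number, capture whatever number follows the label and compare it as an integer. The names `LINE_RE` and `extract_numbers` are hypothetical, and `io.StringIO` only stands in for the real `temp.txt` so the example is self-contained.

```python
import io
import re

# \s* tolerates optional spaces around the colon; (\d+) captures the whole number
LINE_RE = re.compile(r"featureSetCombination\s*:\s*(\d+)")

def extract_numbers(lines):
    """Collect the integer after the label from every line that has one."""
    numbers = []
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            numbers.append(int(match.group(1)))
    return numbers

# In-memory stand-in for the file (an assumption about its layout)
sample = io.StringIO(
    "featureSetCombination: 1\n"
    "some other text\n"
    "featureSetCombination    : 10\n"
    "featureSetCombination: 11\n"
)

found = extract_numbers(sample)
print(found)
```

Comparing integers sidesteps the prefix problem entirely, because 1 and 10 are different values even though one string is a prefix of the other.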

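Notice: technical posts on this site follow the CC BY-SA 4.0 license. If you repost, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.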

 