Include a semicolon as part of a string to search in Python

So I am slightly new to Python but familiar with other scripting languages. How do you correctly include a semicolon in a search string in Python? Whenever I do, I assume Python is interpreting it as a new code block and is therefore not returning the proper results. See the sample below:

Sample text file:

<value> I; want; this; line; </value>
<value> And; this; line; </value>
<value> I dont want this line </value>

Code:

import os
import re

find = "<value>*;*"
filename = "C:\\temp\\Sample.txt"

with open (filename, 'r') as infile:
    for line in infile:
        if re.match(find, line):
            print(line)

It is returning all lines rather than just the first and second lines. I have tried multiple different methods around this (including this method) but nothing seems to work. There has to be a simple way to do this, or is Python just really this annoying to work with?

It seems like you're confusing regex with another wildcard language (e.g. globbing). * means zero or more of the preceding expression, not zero or more of anything. You need to use . to represent any character, so .* means zero or more of anything.

find = "<value>.*;.*"
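As a quick check, here is a minimal sketch that runs both patterns against the three sample lines from the question. The variable names are made up for illustration, and the lines are hard-coded rather than read from C:\temp\Sample.txt so the snippet is self-contained.

```python
import re

# The three sample lines from the question, hard-coded for the demo.
lines = [
    "<value> I; want; this; line; </value>",
    "<value> And; this; line; </value>",
    "<value> I dont want this line </value>",
]

glob_style = "<value>*;*"     # original pattern: "*" repeats ">" and ";"
regex_style = "<value>.*;.*"  # corrected pattern: ".*" means "anything"

# re.match anchors at the start of each line.
matched_by_glob_style = [line for line in lines if re.match(glob_style, line)]
matched_by_regex_style = [line for line in lines if re.match(regex_style, line)]

print(len(matched_by_glob_style))   # all three lines match
print(len(matched_by_regex_style))  # only the two lines containing ";"
```

The original pattern keeps every line, while the corrected one drops the line without a semicolon.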

To be clear, the problem doesn't really have anything to do with Python.

Check out the Regular Expression HOWTO for more details about using regex.
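If glob-style wildcards are what you actually had in mind, one possible alternative (not something the question asked for, just a sketch) is the standard library's fnmatch module, where * really does mean "zero or more of anything". The example below assumes the lines are already in memory and uses fnmatchcase so the comparison stays case-sensitive on every platform.

```python
import fnmatch

# The three sample lines from the question, hard-coded for the demo.
lines = [
    "<value> I; want; this; line; </value>",
    "<value> And; this; line; </value>",
    "<value> I dont want this line </value>",
]

# Shell-style wildcard pattern: here "*" matches any run of characters.
glob_pattern = "<value>*;*"

# fnmatchcase must match the whole string, not just a prefix.
kept = [line for line in lines if fnmatch.fnmatchcase(line, glob_pattern)]

for line in kept:
    print(line)
```

Here the unchanged pattern from the question selects only the two lines that contain a semicolon.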

You're using a wildcard pattern rather than a regexp. The regexp <value>*;* matches <value followed by zero or more > followed by zero or more ;. Every line matches because they all begin with <value.
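To see this for yourself, here is a small sketch that prints exactly what the original pattern consumes. The second and third test strings are invented purely to show the "zero or more" behaviour.

```python
import re

pattern = "<value>*;*"

test_strings = [
    "<value> I dont want this line </value>",  # sample line without ";"
    "<value",                                  # made up: no ">" at all
    "<value>>>;;; rest",                       # made up: repeated ">" and ";"
]

# Collect the text that re.match actually consumed for each string.
consumed = [re.match(pattern, text).group() for text in test_strings]

for text in consumed:
    print(repr(text))
# '<value>'
# '<value'
# '<value>>>;;;'
```

Since the match succeeds even when it consumes no semicolon at all, the pattern can't filter out the third sample line.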

The correct regexp is:

find = "<value>.*;"

. matches any character, and * means to match any number of them, so .* matches any run of characters. Then it matches a literal ;.
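Putting it together, a rough sketch of the loop from the question with this pattern might look like the following. The function name lines_with_semicolon is hypothetical, and io.StringIO stands in for the real file so the example runs anywhere.

```python
import io
import re

# Compiled once; a raw string avoids surprises if backslashes are added later.
PATTERN = re.compile(r"<value>.*;")


def lines_with_semicolon(lines):
    """Yield lines that start with <value> and contain a ';' after it."""
    for line in lines:
        # match() only succeeds if the pattern matches at the start of the line.
        if PATTERN.match(line):
            yield line.rstrip("\n")


# Stand-in for the sample text file from the question.
sample = io.StringIO(
    "<value> I; want; this; line; </value>\n"
    "<value> And; this; line; </value>\n"
    "<value> I dont want this line </value>\n"
)

result = list(lines_with_semicolon(sample))
for line in result:
    print(line)
```

With a real file you would pass the handle from `with open(filename, 'r') as infile:` instead of the StringIO object.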

I suggest you read the tutorial at www.regular-expression.info.
