Python regular expression not matching file contents with re.match and re.MULTILINE flag

I'm reading in a file and storing its contents as a multiline string. Then I loop through some values I get from a Django query and run regexes built from the query result values. My regex seems like it should be working, and it does work if I copy the values returned by the query by hand, but for some reason it isn't matching when all the parts run together.

My code is:

with open("/path_to_my_file") as myfile:
    data=myfile.read()

#read saved settings then write/overwrite them into the config
items = MyModel.objects.filter(some_id="s100009")
for item in items:
    regexString = "^\s*"+item.feature_key+":"

    print regexString #to verify its what I want it to be, ie debug
    pq = re.compile(regexString, re.M)

    if pq.match(data):
        #do stuff

So basically my problem is that the regex isn't matching. When I copy the file contents into a big old string and copy the value(s) printed by the print regexString line, it does match, so I'm thinking there's some esoteric Python/Django thing going on (or maybe not so esoteric, as Python isn't my first language).

And for example's sake, the output of print regexString is:

^\s*productDetailOn:

File contents:

    productDetailOn:true,
    allOff:false,
    trendingWidgetOn:true,
    trendingWallOn:true,
    searchResultOn:false,
    bannersOn:true,
    homeWidgetOn:true,
}

Running Python 2.7. Also, I dumped the types of both item.feature and data, and both were unicode. Not sure if that matters? Anyway, I'm starting to hit my head off the desk after working on this for a couple of hours, so any help is appreciated. Cheers!

According to the documentation, re.match only ever matches at the beginning of the string, never at the beginning of a later line:

Note that even in MULTILINE mode, re.match() will only match at the beginning of the string and not at the beginning of each line.

You need to use re.search instead:

regexString = r"^\s*"+item.feature_key+":"
pq = re.compile(regexString, re.M)
if pq.search(data):
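
The difference can be checked with a short, self-contained sketch. The sample text below only imitates the file contents shown in the question, and find_key is a hypothetical helper that stands in for the body of the loop over the Django query results. The re.escape call is an extra precaution not present in the original code, added on the assumption that a stored key could contain regex metacharacters. The print() calls are written so the sketch runs on Python 3 as well as 2.7.

```python
import re

# Sample text imitating the file contents from the question (assumed layout).
data = (
    "    productDetailOn:true,\n"
    "    allOff:false,\n"
    "    trendingWidgetOn:true,\n"
    "}\n"
)


def find_key(feature_key, text):
    """Return a match object if 'feature_key:' starts any line of text."""
    # re.escape is an added precaution in case the key contains
    # characters that are special in a regex.
    pattern = re.compile(r"^\s*" + re.escape(feature_key) + ":", re.M)
    return pattern.search(text)


first = re.compile(r"^\s*productDetailOn:", re.M)
later = re.compile(r"^\s*allOff:", re.M)

# match() is anchored at the start of the string, so it only finds a key
# that sits on the very first line, even with re.M.
match_first = first.match(data)
match_later = later.match(data)

# search() tries every position, and with re.M '^' also matches right
# after each newline, so keys on later lines are found.
search_later = later.search(data)

print(match_first is not None)
print(match_later is not None)
print(search_later is not None)
print(find_key("trendingWidgetOn", data) is not None)
```

This would also explain why the pattern matched on a hand-copied string: if the copied text began with the productDetailOn line, match() succeeds, whereas in the real file that line is presumably preceded by other content.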

A small note on the raw string ( r"^\s*" ): in this case it is equivalent to "^\s*" because \s is not a string escape sequence (unlike \r or \n ), so Python leaves the backslash in place and the regex engine receives the same pattern either way. Still, it is safer to always declare regex patterns with raw string literals in Python (and with the corresponding notation in other languages, too).
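
To illustrate why raw strings are the safer habit, here is a small sketch using \b, which (unlike \s) is a recognized string escape. In a normal literal it becomes a backspace character, while the regex engine expects the two characters \b to mean a word boundary. The sample text and variable names are made up for illustration.

```python
import re

text = "a cat sat"

# In a normal literal, "\b" is a single backspace character (\x08).
plain = "\bcat\b"
# In a raw literal the backslash is kept, so the regex engine sees \b,
# which it interprets as a word boundary.
raw = r"\bcat\b"

# The two patterns differ in length because of the escape handling.
print(len(plain), len(raw))

# The non-raw pattern looks for literal backspace characters and fails.
print(re.search(plain, text) is None)

# The raw pattern matches the word "cat".
print(re.search(raw, text) is not None)
```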
