
RegEX doesn't seem to match, even though it should

I'm having trouble matching my string W0mMhUcRRnG8dcghE4qvk3JA9lGt8nDl against my RegEX ^([A-Za-z0-9]{32})$ .

According to various online RegEx tools it should match, but it does not match in my Python script:

pattern = re.compile("^([A-Za-z0-9]{32})$")
print(line)
if pattern.match(line):
    return line
else:
    return None

I've tried using strip() to check for any unseen whitespace, but can't find anything.

Here is the entire script:

import requests, binascii, base64, re
from requests.auth import HTTPBasicAuth

def pattern_lookup(line):
    """
    Will iterate through lines
    to find a matching string that
    is 32 characters long and only
    holds alphanumerical characters.
    -----
    :param lines: The lines to be iterated.
    :return: The line holding the matched string,
             or None if not found
    """
    pattern = re.compile("^([A-Za-z0-9]{32})$")
    print(line)
    if pattern.match(line):
        return line
    else:
        return None

def get_secret(host, credentials):
    """
    Grabs the hint(flag) from the 
    host by splitting the response on
    semicolon (;) then performing
    pattern matching using regex.
    ----
    :param host: The host we are sending 
                 requests to.
    :param credentials: The credentials required
                        to sign into the host.
    :return: The hex encoded secret.
    """
    try:
        response = requests.get(host, auth=HTTPBasicAuth(*credentials))
        response_lines = response.content.decode('ascii').replace('"', '').split(';')
        return next((line
                 for line in response_lines
                 if pattern_lookup(line)),
                None)
    except requests.RequestException as e:
        print(e)

def prepare_payload(secret):
    decoded_secret = base64.b64decode(binascii.unhexlify(secret)[::-1])
    payload = {'secret': decoded_secret, 'submit': 'placeholder'}
    return payload

def get_password(host, credentials, secret):
    """
    Uses a post-request injected with the 
    reverse engineered secret to get access
    to the password to natas9.
    :param host: The host that holds the 
                 password.
    :param credentials: 
    :param decoded_hint: 
    :return: The password to Natas9
    """
    payload = prepare_payload(secret)
    try:
        response = requests.post(host, auth=HTTPBasicAuth(*credentials), data=payload)
        response_lines = response.content.decode('utf-8').split(' ')
        return next((line
                     for line in response_lines
                     if pattern_lookup(line.strip())),
                    None)
    except requests.RequestException as e:
        print(e)


def main():
    host = 'http://natas8.natas.labs.overthewire.org/index-source.html'
    credentials = ['natas8', 'DBfUBfqQG69KvJvJ1iAbMoIpwSNQ9bWe']
    secret = get_secret(host, credentials)
    print(get_password(host.split('index')[0], credentials, secret))

if __name__ == '__main__':
    main()

EDIT:

I should mention that the initial test in get_secret works flawlessly, and all my previous modules that use this function work fine...

EDIT2:

Output:

<link
rel="stylesheet"
type="text/css"
href="http://natas.labs.overthewire.org/css/level.css">
<link
rel="stylesheet"
href="http://natas.labs.overthewire.org/css/jquery-ui.css"
/>
<link
rel="stylesheet"
href="http://natas.labs.overthewire.org/css/wechall.css"
/>
<script
src="http://natas.labs.overthewire.org/js/jquery-1.9.1.js"></script>
<script
src="http://natas.labs.overthewire.org/js/jquery-ui.js"></script>
<script
src=http://natas.labs.overthewire.org/js/wechall-data.js></script><script
src="http://natas.labs.overthewire.org/js/wechall.js"></script>
<script>var
wechallinfo
=
{
"level":
"natas8",
"pass":
"DBfUBfqQG69KvJvJ1iAbMoIpwSNQ9bWe"
};</script></head>
<body>
<h1>natas8</h1>
<div
id="content">

Access
granted.
The
password
for
natas9
is
W0mMhUcRRnG8dcghE4qvk3JA9lGt8nDl <-- here it is
<form
method=post>
Input
secret:
<input
name=secret><br>
<input
type=submit
name=submit>
</form>

<div
id="viewsource"><a
href="index-source.html">View
sourcecode</a></div>
</div>
</body>
</html>
None

I made some demo code based on your regex, and it works fine.

import re
line = 'W0mMhUcRRnG8dcghE4qvk3JA9lGt8nDl'
pattern = re.compile("^([A-Za-z0-9]{32})$")
print(line)
if pattern.match(line):
    print ("matched")
else:
    print ("No")


That means the line you are reading from response_lines is not in the format the regex expects. Try printing the line to see what's missing.
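One way to see what is missing is to print repr(line) instead of line, because repr shows newlines and other invisible characters as escape sequences. The following is a minimal sketch; the sample token is an assumption reconstructed from the output posted in the question, not data captured from the server.

```python
import re

pattern = re.compile("^([A-Za-z0-9]{32})$")

# Hypothetical token, as it might look after splitting the response on ' ':
# the password is followed by a newline and the start of the next tag.
line = "W0mMhUcRRnG8dcghE4qvk3JA9lGt8nDl\n<form"

print(line)        # on screen this looks like two separate lines
print(repr(line))  # the embedded '\n' is shown explicitly

# strip() only trims the ends of the string, so the interior newline
# survives and the anchored pattern cannot match the whole token.
matched = pattern.match(line.strip()) is not None
print(matched)
```

If the repr output shows a newline inside the token, that would explain why strip() appeared to change nothing.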


Edit: After your edit, I can see you have multiline data. Use the below:

pattern = re.compile("^([A-Za-z0-9]{32})$", re.MULTILINE)
# search() returns None when nothing matches; finditer() always returns a truthy iterator
if pattern.search(line):
    print("matched")
else:
    print("No")


Your text is multiline. Have you tried:

 re.compile("^([A-Za-z0-9]{32})$", re.MULTILINE)
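As an alternative that is not taken from the answers above, the response could be split on any whitespace with str.split() (no argument) instead of split(' '), so that each token is a single word and the original pattern works without re.MULTILINE. This is only a sketch; the body string is a hypothetical excerpt based on the output in the question.

```python
import re

pattern = re.compile("^([A-Za-z0-9]{32})$")

# Hypothetical excerpt of the response body (assumed, not captured from the server).
body = "The password for natas9 is W0mMhUcRRnG8dcghE4qvk3JA9lGt8nDl\n<form method=post>"

by_space = body.split(' ')     # the newline stays inside one token
by_whitespace = body.split()   # splits on spaces, tabs and newlines

# Same lookup as in the question: first token that matches, else None.
first_by_space = next((t for t in by_space if pattern.match(t)), None)
first_by_whitespace = next((t for t in by_whitespace if pattern.match(t)), None)

print(first_by_space)
print(first_by_whitespace)
```

Splitting on all whitespace keeps the strict 32-character check intact, at the cost of assuming the password never contains whitespace itself.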
