
Regex include the negative lookbehind

I'm trying to filter a string before passing it through eval in Python. I want to limit it to math functions, but I'm not sure how to strip it with regex. Consider the following:

s = 'math.pi * 8'

I want that to basically translate to 'math.pi*8', stripped of spaces. I also want to strip any letters [A-Za-z] that are not preceded by math\.

So if s = 'while(1): print "hello"', I want any executable part of it to be stripped:

s would ideally equal something like ():"" in that scenario (all letters gone, because they were not preceded by math\.).

Here's the regex I've tried:

(?<!math\.)[A-Za-z\s]+

and the Python:

re.sub(r'(?<!math\.)[A-Za-z\s]+', r'', 'math.pi * 8')

But the result is '.p*8', because math itself is not preceded by math. (so it is stripped), and i is not immediately preceded by math. either (only p is).
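To make that behaviour concrete, here is a small sketch that lists exactly which runs the pattern from the question matches. Only the pattern and the sample string come from the question; the variable names are illustrative.

```python
import re

# The pattern from the question: runs of letters/whitespace whose first
# character is NOT immediately preceded by the literal text "math."
pattern = re.compile(r'(?<!math\.)[A-Za-z\s]+')
s = 'math.pi * 8'

# Collect (start index, matched text) for every run that re.sub would delete.
matches = [(m.start(), m.group()) for m in pattern.finditer(s)]
print(matches)             # [(0, 'math'), (6, 'i '), (9, ' ')]
print(pattern.sub('', s))  # .p*8
```

The lookbehind is only checked at the start of each run, so it protects the single letter p and nothing after it.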

How can I strip letters that are not in math and are not preceded by math.?
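One possible way to get the requested string transformation, sketched here under assumptions, is to stop relying on a lookbehind and instead match whole math.<name> tokens first, keeping them, while dropping every other run of letters or whitespace. The helper name strip_non_math is hypothetical and this is not necessarily what any answer to the question proposed.

```python
import re

# Hypothetical helper, not from the original post.
# Branch 1 captures a whole "math.<name>" token so it can be kept;
# branch 2 matches any other run of letters or whitespace, which is dropped.
_TOKEN = re.compile(r'(math\.[A-Za-z_]\w*)|[A-Za-z\s]+')

def strip_non_math(s):
    # group(1) is None when branch 2 matched, so the run is replaced by ''.
    return _TOKEN.sub(lambda m: m.group(1) or '', s)

print(strip_non_math('math.pi * 8'))              # math.pi*8
print(strip_non_math('while(1): print "hello"'))  # (1):""
```

Digits and punctuation are left alone, which is why (1) survives in the second example. This only filters text; it does not by itself make the result safe to pass to eval.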

What I ended up doing

I followed @Thomas's answer, but also stripped square brackets, spaces, and underscores from the string, in hopes that no Python function can be executed other than through the math module:

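import math
import re

# Strip bracketed subscripts, whitespace, and underscores before evaluating.
s = re.sub(r'(\[.*?\]|\s+|_)', '', s)
# Evaluate with builtins disabled and only the math module exposed.
s = eval(s, {
    '__builtins__': None,
    'math': math
    })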
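As a rough illustration of what those two statements do to an input, the sketch below wraps them in two functions. The names clean and evaluate are hypothetical and only there to make the example runnable on its own.

```python
import math
import re

def clean(expr):
    # Same substitution as above: drop [...] groups, whitespace, underscores.
    return re.sub(r'(\[.*?\]|\s+|_)', '', expr)

def evaluate(expr):
    # Same eval call as above: no builtins, only the math module visible.
    return eval(clean(expr), {'__builtins__': None, 'math': math})

print(clean('math.pi * 8'))    # math.pi*8
print(clean('a[0] __b'))       # ab
print(evaluate('math.pi * 8') == math.pi * 8)  # True
```

This shows the intended happy path only and should not be read as evidence that the filter is a safe sandbox.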

As @Carl says in a comment, look at what lybniz does for something better. But even this is not enough!

The technique described at the link is the following:

print eval(raw_input(), {"__builtins__":None}, {'pi':math.pi})

But this doesn't prevent something like

([x for x in 1.0.__class__.__base__.__subclasses__()
   if x.__name__ == 'catch_warnings'][0]()
   )._module.__builtins__['__import__']('os').system('echo hi!')

Source: Several of Ned Batchelder's posts on sandboxing, see http://nedbatchelder.com/blog/201302/looking_for_python_3_builtins.html

Edit: it was pointed out that we don't get square brackets or spaces, so:

1.0.__class__.__base__.__subclasses__().__getitem__(i)()._module.__builtins__.get('__import__')('os').system('echo hi')

where you just try a lot of values for i.
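A harmless way to check the first step of that chain is sketched below. It evaluates only the class-enumeration part with builtins removed and imports nothing. It assumes Python 3, and the variable names are illustrative.

```python
# Evaluate with builtins removed, as in the snippets above.
expr = "1.0.__class__.__base__.__subclasses__()"
classes = eval(expr, {"__builtins__": None}, {})

# Attribute access on a literal needs no builtins, so every direct
# subclass of `object` is still reachable from inside the "sandbox".
print(len(classes) > 0)  # True
print(type in classes)   # True: `type` is a direct subclass of `object`
```

Everything after this step in the chain is a search through those classes for one that still holds a reference to the real builtins.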
