Regex for a JS variable assignment ending with a semicolon

I am trying to find and extract the assignment to a property of the product_images object from JavaScript code that I extracted with BeautifulSoup. I have tried the following:

re.findall(r"product_images\['top_lg'] = .*;", txt)  

Unfortunately, it does not extract anything from the text below.

 product_images['top_lg'] = {
                "tn": '//image.test.com/media/cache/04/0a/040a1e61f5edc387d8c8e40d3ea0e0ca.jpg',
                "md": '//image.test.com/media/cache/b7/f3/b7f3cb1da267d7e8ac0412bdc522c862.jpg',
                "lg": '//image.test.com/media/shape_images/011f7f24ae4cbbef191cff1a711df9e1_a3c9ca71b7d85d87085955f8d1c4bfc3_0_.jpg',
                "alt": 'test ',
                "data-zoomable": 'True',
                "text_line": 'teest'
            };

The scripts that I am parsing are taken from https://www.brilliantearth.com/Petite-Twisted-Vine-Diamond-Ring-White-Gold-BE1D54-3821855/

By default the dot does not match a newline, so .*; can never reach a semicolon that sits on a later line. If, like me, you find regex flags confusing and hard to remember, use a "not semicolon" expression instead of the dot:

re.findall(r"product_images\['top_lg'] = [^;]*;", txt)
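
A minimal runnable sketch of the "not semicolon" approach follows. The txt value here is a shortened stand-in for the real script text, and the 'side_lg' entry is a hypothetical second assignment added only to show where the match stops.

```python
import re

# Shortened stand-in for the script text extracted with BeautifulSoup (assumption).
# The 'side_lg' block is hypothetical and only illustrates where matching ends.
txt = """
product_images['top_lg'] = {
    "tn": '//image.test.com/media/cache/tn.jpg',
    "alt": 'test ',
    "text_line": 'teest'
};
product_images['side_lg'] = {
    "tn": '//image.test.com/media/cache/side.jpg'
};
"""

# [^;] matches any character except a semicolon, newlines included,
# so the match spans several lines and ends at the first ';'.
pattern = r"product_images\['top_lg'] = [^;]*;"
matches = re.findall(pattern, txt)

print(matches[0])
```

One caveat with this pattern: if any value inside the object literal contains a semicolon, the match is cut short at that character.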

Note: otherwise you can add a flag that lets the dot match newlines (re.DOTALL in Python), as Thierry suggests, though you would also need to add the non-greedy modifier ? after * to indicate that you are interested in the first semicolon rather than the last.
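
The difference between the greedy and non-greedy forms can be sketched as below, assuming the flag in question is re.DOTALL. The txt value is again a shortened stand-in, and the 'side_lg' assignment is hypothetical.

```python
import re

# Shortened stand-in for the script text (assumption); 'side_lg' is hypothetical.
txt = """
product_images['top_lg'] = {
    "tn": '//image.test.com/media/cache/tn.jpg'
};
product_images['side_lg'] = {
    "tn": '//image.test.com/media/cache/side.jpg'
};
"""

# Without a flag, '.' stops at the newline after '{', so nothing matches.
no_flag = re.findall(r"product_images\['top_lg'] = .*;", txt)

# With re.DOTALL, '.' also matches newlines; greedy '.*' runs to the LAST ';'.
greedy = re.findall(r"product_images\['top_lg'] = .*;", txt, re.DOTALL)

# Non-greedy '.*?' stops at the FIRST ';' instead.
lazy = re.findall(r"product_images\['top_lg'] = .*?;", txt, re.DOTALL)

print(len(no_flag))
print("side_lg" in greedy[0])
print("side_lg" in lazy[0])
```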
