
Converting regex python to javascript

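I'm not new to regex, but I have spent a long time searching for a JavaScript equivalent of the pattern below. I would be glad if someone could explain in detail how this Python regex converts to JavaScript.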

import re

regex = r"""
    ^(
      (?P<ShowNameA>.*[^ (_.]) # Show name
        [ (_.]+
        ( # Year with possible Season and Episode
          (?P<ShowYearA>\d{4})
          ([ (_.]+S(?P<SeasonA>\d{1,2})E(?P<EpisodeA>\d{1,2}))?
        | # Season and Episode only
          (?<!\d{4}[ (_.])
          S(?P<SeasonB>\d{1,2})E(?P<EpisodeB>\d{1,2})
        | # Alternate format for episode
          (?P<EpisodeC>\d{3})
        )
    |
      # Show name with no other information
      (?P<ShowNameB>.+)
    )
    """

test_str = ("archer.2009.S04E13\n"
    "space 1999 1975\n"
    "Space: 1999 (1975)\n"
    "Space.1999.1975.S01E01\n"
    "space 1999.(1975)\n"
    "The.4400.204.mkv\n"
    "space 1999 (1975)\n"
    "v.2009.S01E13.the.title.avi\n"
    "Teen.wolf.S04E12.HDTV.x264\n"
    "Se7en\n"
    "Se7en.(1995).avi\n"
    "How to train your dragon 2\n"
    "10,000BC (2010)")

matches = re.finditer(regex, test_str, re.MULTILINE | re.VERBOSE)

for match_num, match in enumerate(matches, start=1):
    print(f"Match {match_num} was found at {match.start()}-{match.end()}: {match.group()}")

    # Groups are numbered from 1; group 0 is the whole match.
    for group_num in range(1, len(match.groups()) + 1):
        print(f"Group {group_num} found at {match.start(group_num)}-{match.end(group_num)}: {match.group(group_num)}")

Regex101

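Unfortunately, there is no simple way to convert a Python regex to a JavaScript regex, because Python's regex flavor is more feature-rich than JavaScript's.

JavaScript lacks some functional features such as recursion, and older engines also lack negative lookbehind. It is missing syntactic conveniences as well: there is no verbose (free-spacing) mode, and named capture groups exist only since ES2018, written (?<name>...) rather than Python's (?P<name>...).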

regular capture group = ()
named capture group = (?P<ThisIsAName>)

verbose regex = 'find me #this regex ignores comments and whitespace'
non-verbose regex = 'this treats whitespace literally'
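
Since JavaScript has no verbose flag, one common workaround is to assemble the pattern from small commented pieces. The following is only a minimal sketch: buildRegex is a hypothetical helper written for illustration, not a built-in, and the two fragments cover just the season/episode part of the pattern.

```javascript
// Hypothetical helper: joins commented pattern fragments into one RegExp.
// Each entry is [patternSource, comment]; the comment is documentation only.
function buildRegex(parts, flags) {
  const source = parts.map(([pattern]) => pattern).join("");
  return new RegExp(source, flags);
}

// String.raw keeps backslashes intact, so \d does not need to be doubled.
const episodeRegex = buildRegex(
  [
    [String.raw`S(\d{1,2})`, "season number"],
    [String.raw`E(\d{1,2})`, "episode number"],
  ],
  ""
);

const episodeMatch = episodeRegex.exec("archer.2009.S04E13");
console.log(episodeMatch[1], episodeMatch[2]); // season and episode captures
```

Unlike Python's re.VERBOSE, nothing is stripped from the fragments here, so a literal space inside a character class such as [ (_.] stays significant.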

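So, if we convert your named capture groups into regular (numbered) capture groups and convert the verbose syntax into regular syntax, the regex becomes a valid JavaScript regex. The g and m flags take the place of finditer and re.MULTILINE, and the negative lookbehind requires an engine that supports it. In JavaScript it looks like this:
regex = /^((.*[^ (_.])[ (_.]+((\d{4})([ (_.]+S(\d{1,2})E(\d{1,2}))?|(?<!\d{4}[ (_.])S(\d{1,2})E(\d{1,2})|(\d{3}))|(.+))/gm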

// group 2 = ShowNameA
// group 4 = ShowYearA
// group 6 = SeasonA
// group 7 = EpisodeA
// group 8 = SeasonB
// group 9 = EpisodeB
// group 10 = EpisodeC
// group 11 = ShowNameB
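
To make the numbered groups easier to work with, the group numbers can be mapped back to the original Python names. This is a sketch under a few assumptions: GROUP_NAMES and parseTitle are hypothetical names chosen for illustration, the group numbers were obtained by counting opening parentheses in the pattern, and the engine must support lookbehind.

```javascript
// Single-line variant of the converted pattern (no g flag, so exec is stateless).
const titleRegex = /^((.*[^ (_.])[ (_.]+((\d{4})([ (_.]+S(\d{1,2})E(\d{1,2}))?|(?<!\d{4}[ (_.])S(\d{1,2})E(\d{1,2})|(\d{3}))|(.+))/;

// Group number -> name used in the Python pattern.
const GROUP_NAMES = {
  2: "ShowNameA",
  4: "ShowYearA",
  6: "SeasonA",
  7: "EpisodeA",
  8: "SeasonB",
  9: "EpisodeB",
  10: "EpisodeC",
  11: "ShowNameB",
};

// Hypothetical helper: returns only the groups that took part in the match.
function parseTitle(title) {
  const match = titleRegex.exec(title);
  if (match === null) return null;
  const result = {};
  for (const [index, name] of Object.entries(GROUP_NAMES)) {
    if (match[index] !== undefined) result[name] = match[index];
  }
  return result;
}

console.log(parseTitle("archer.2009.S04E13"));
console.log(parseTitle("Teen.wolf.S04E12.HDTV.x264"));
console.log(parseTitle("Se7en"));
```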

As you can see, the JavaScript version is quite ugly without verbose syntax or named capture groups. In this case, however, it is functionally equivalent.
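
If the target engine supports ES2018 regex features (an assumption to verify for your environment), the names can be kept by writing (?<name>...) in place of Python's (?P<name>...). A minimal sketch, with namedRegex and namedMatch as illustrative variable names:

```javascript
// Same pattern as above, with ES2018 named groups instead of numbered ones.
const namedRegex = /^((?<ShowNameA>.*[^ (_.])[ (_.]+((?<ShowYearA>\d{4})([ (_.]+S(?<SeasonA>\d{1,2})E(?<EpisodeA>\d{1,2}))?|(?<!\d{4}[ (_.])S(?<SeasonB>\d{1,2})E(?<EpisodeB>\d{1,2})|(?<EpisodeC>\d{3}))|(?<ShowNameB>.+))/;

const namedMatch = namedRegex.exec("Teen.wolf.S04E12.HDTV.x264");

// Groups that did not take part in the match are undefined.
const { ShowNameA, SeasonB, EpisodeB, ShowYearA } = namedMatch.groups;
console.log(ShowNameA, SeasonB, EpisodeB, ShowYearA);
```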

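JavaScript has historically had no direct equivalent of findall, so you have to build a workalike yourself, typically by calling exec in a loop on a regex with the g flag (newer engines also provide String.prototype.matchAll). Here is an article explaining several ways to do this: https://www.activestate.com/blog/2008/04/javascript-refindall-workalike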
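
One such workalike can be sketched with an exec loop. The function name findAll is hypothetical, and this is not necessarily the approach taken in the linked article; the sample input reuses three titles from the question.

```javascript
// Hypothetical findall workalike: collects every match of regex in text.
function findAll(regex, text) {
  // Copy the regex so the g flag is present and the caller's lastIndex is untouched.
  const flags = regex.flags.includes("g") ? regex.flags : regex.flags + "g";
  const globalRegex = new RegExp(regex.source, flags);
  const results = [];
  let match;
  while ((match = globalRegex.exec(text)) !== null) {
    results.push(match);
    // Step forward manually on zero-length matches to avoid an infinite loop.
    if (match[0] === "") globalRegex.lastIndex++;
  }
  return results;
}

const titles = "archer.2009.S04E13\nSe7en\nThe.4400.204.mkv";
const foundLines = findAll(/^.+$/m, titles).map((m) => m[0]);
console.log(foundLines);
```

Each element returned by findAll is a full exec result, so capture groups remain available by index.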

For the future, I also strongly recommend regexr.com for learning regex, especially JavaScript regex.
