JS Regex - Find a substring that contains special characters

Question

Can you please help me understand how to do the following?

I have a string that can come in 3 formats:

"Section_1: hello & goodbye | section_2: kuku"
"Section_1: hello & goodbye & hola | section_2: kuku"
"Section_1: hello | section_2: kuku"

I want to get the result:

Group section_1: "hello & goodbye", Group section_2: "kuku"
Group section_1: "hello & goodbye & hola", Group section_2: "kuku"
Group section_1: "hello", Group section_2: "kuku"

This is the regex I have now (it does not work for me because of the '&'):

Section_1:\s*(?<section_1>\w+)(\s*\|\s*(Section_2:(\s*(?<section_2>.*))?)?)?

Note: the regex captures 2 groups, "section_1" and "section_2".

The question is: how can I match a substring that can contain zero or more occurrences of " & {word}"?

Thanks in advance

Answer 1

As per the comments, we established that the ' & ' combination acts as a delimiter between words. There are probably a ton of ways to write a pattern to capture these substrings, but to me they fall into two groups: extensive or simple. If you need to validate the input more thoroughly, you could use:

^section_1:\s*(?<section_1>[a-z]+(?:\s&\s[a-z]+)*)\s*\|\s*section_2:\s*(?<section_2>[a-z]+(?:\s&\s[a-z]+)*)$

See an online demo. The pattern means:

^ - Start-of-line anchor;
section_1:\s* - Match 'Section_1:' literally (with the case-insensitive flag), followed by 0+ whitespace characters;
(?<section_1>[a-z]+(?:\s&\s[a-z]+)*) - A named capture group that matches [a-z]+, i.e. 1+ letters (upper or lower case, given the case-insensitive flag), followed by a nested non-capture group (?:\s&\s[a-z]+)* that matches, 0+ times, the delimiter described above followed by another word;
\s*\|\s*section_2:\s* - Match 0+ whitespace characters, a literal pipe symbol, 0+ whitespace characters, 'section_2:' literally, and any whitespace up to the next word;
(?<section_2>[a-z]+(?:\s&\s[a-z]+)*) - A 2nd named capture group that matches the same pattern as the first named capture group;
$ - End-of-line anchor.
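
As a rough illustration of the breakdown above, here is a minimal sketch that runs the pattern against the three sample strings from the question. It assumes the `i` (case-insensitive) flag is set, since the pattern spells `section_1` in lower case while the input uses `Section_1`. The helper name `parseSections` is hypothetical and only used for this example.

```javascript
// Strict pattern from the answer. The `i` flag is assumed so that
// `section_1` also matches `Section_1` and [a-z] covers upper-case letters.
const strictPattern =
  /^section_1:\s*(?<section_1>[a-z]+(?:\s&\s[a-z]+)*)\s*\|\s*section_2:\s*(?<section_2>[a-z]+(?:\s&\s[a-z]+)*)$/i;

// Hypothetical helper: returns both named groups, or null if the input
// does not have the expected shape.
function parseSections(input) {
  const match = strictPattern.exec(input);
  return match ? { ...match.groups } : null;
}

const samples = [
  "Section_1: hello & goodbye | section_2: kuku",
  "Section_1: hello & goodbye & hola | section_2: kuku",
  "Section_1: hello | section_2: kuku",
];

const results = samples.map(parseSections);
for (const result of results) {
  console.log(result);
}
// { section_1: 'hello & goodbye', section_2: 'kuku' }
// { section_1: 'hello & goodbye & hola', section_2: 'kuku' }
// { section_1: 'hello', section_2: 'kuku' }
```

Because the pattern is anchored with ^ and $, an input with anything unexpected around the two sections returns null here instead of a partial match.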

Note: As mentioned, there are a ton of different patterns one could use, depending on how strictly you need to validate the input. For example, \s*(?<section_1>[^:|]+?)\s*\|\s*[^:]*:\s*(?<section_2>.+) may also work.
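
To show the trade-off with the looser pattern from the note, here is a small sketch. The second input string is made up for this example: it does not follow the section naming at all, yet the loose pattern still captures two groups from it.

```javascript
// Looser pattern from the note: no anchors and no check on the section names.
const loosePattern =
  /\s*(?<section_1>[^:|]+?)\s*\|\s*[^:]*:\s*(?<section_2>.+)/;

// Returns a plain object with both named groups, or null when nothing matches.
function looseParse(input) {
  const match = loosePattern.exec(input);
  return match ? { ...match.groups } : null;
}

const fromSample = looseParse("Section_1: hello & goodbye & hola | section_2: kuku");
console.log(fromSample);
// { section_1: 'hello & goodbye & hola', section_2: 'kuku' }

// Made-up input with different labels and digits in the values.
const fromOther = looseParse("foo: a1 b2 | bar: 42");
console.log(fromOther);
// { section_1: 'a1 b2', section_2: '42' }
```

The loose version is shorter and tolerant of other labels and characters, but it no longer tells you whether the input really had the `Section_1: ... | section_2: ...` shape.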
