Question about splitting a string with a regular expression in JS

I am learning regular expressions.

I am now trying to extract the street and the house number from a string.

This is my string: "Driesstraat(Ide) 20". Driesstraat is the street name, (Ide) is an acronym for the municipality, and 20 is the house number.

let re = /^(\d*[\p{L} \d'\/\\\-\.]+)[,\s]+(\d+)\s*([\p{L} \d\-\/'"\(\)]*)$/iu;

let match = adresInput.value.match(re);
if (match) {
  match.shift();

  console.log(match.join('|'));
}

The above code works when there is no (Ide). I get this string from a Belgian eID reader.

Thank you in advance.

First things first, (...) is not matched by your (\d*[\p{L} \d'\/\\\-\.]+) pattern, so you need to add (?:\(\p{L}*\))? or (?:\([^()]*\))? right after that pattern to optionally match a part of the string between parentheses. (?:\(\p{L}*\))? will match only letters between round brackets, and (?:\([^()]*\))? will match any chars other than ( and ) between round brackets.
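To make the difference between the two optional groups concrete, here is a minimal sketch. The constant names and the "Driesstraat(St. Ide)" sample are made up for illustration and do not come from the question; each pattern is reduced to a street name followed by the optional group.

```javascript
// Reduced patterns: letters, then an optional parenthesized part.
// lettersOnly allows only letters between the brackets,
// anyInside allows anything except ( and ).
const lettersOnly = /^\p{L}+(?:\(\p{L}*\))?$/u;
const anyInside = /^\p{L}+(?:\([^()]*\))?$/u;

// The third sample is hypothetical: it has a dot and a space inside the brackets.
const samples = ["Driesstraat", "Driesstraat(Ide)", "Driesstraat(St. Ide)"];

for (const s of samples) {
  console.log(s, lettersOnly.test(s), anyInside.test(s));
}
// Driesstraat true true
// Driesstraat(Ide) true true
// Driesstraat(St. Ide) false true
```

If the reader can emit abbreviations containing dots, spaces or digits, the [^()]* variant is probably the safer choice.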

Besides, when using regular expressions with the /u flag, you should escape only what must be escaped inside character classes. Here, you overescaped several chars inside [...]. You need not escape (, ) or ., and you can even avoid escaping - if it is put at the end of the character class. The chars that must be escaped inside character classes in the ECMAScript regex flavor are \ and ], and ^ must be escaped if it is at the start of the character class. Although - can also be escaped, it is best practice to put it unescaped at the end of the character class.
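The reason the /u flag makes this matter is that Unicode mode rejects escapes of characters that have no special meaning, while non-Unicode mode silently accepts them. A small sketch of that behaviour follows; the compiles helper is a hypothetical name introduced only for this example.

```javascript
// Unescaped ., ( and ), plus a trailing -, are literal inside a character class.
const punctuation = /^[.()-]+$/u;
console.log(punctuation.test(".()-")); // true

// Hypothetical helper: reports whether a pattern source compiles with the given flags.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Escaping ' is tolerated without /u but is a syntax error with it.
console.log(compiles("[\\']", ""));  // true
console.log(compiles("[\\']", "u")); // false

// The escapes used in the question (\- \. \/) are redundant but still valid with /u.
console.log(compiles("[\\-\\.\\/]", "u")); // true
```

So the pattern in the question does compile; trimming the escapes is about readability and about not tripping over the stricter rule later.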

So, you can use

/^(\d*[\p{L} \d'/\\.-]+)(?:\([^()]*\))?[,\s]+(\d+)\s*([\p{L} \d/'"()-]*)$/u

See the regex demo.

Note that you do not even have to escape / inside character classes in the ECMAScript regex flavor used by JavaScript.

See a JavaScript test:

let re = /^(\d*[\p{L} \d'/\\.-]+)(?:\([^()]*\))?[,\s]+(\d+)\s*([\p{L} \d/'"()-]*)$/u;
console.log(re.test("Driesstraat(Ide) 20")); // true
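Since the question wanted the street and the house number rather than a true/false result, here is a sketch that combines the pattern above with the match/join logic from the question. The splitAddress helper is a hypothetical name, and the input is passed as a plain string instead of being read from adresInput.value.

```javascript
const re = /^(\d*[\p{L} \d'/\\.-]+)(?:\([^()]*\))?[,\s]+(\d+)\s*([\p{L} \d/'"()-]*)$/u;

// Hypothetical helper: returns [street, houseNumber, suffix], or null when nothing matches.
function splitAddress(input) {
  const match = input.match(re);
  // slice(1) drops the full match and keeps only the capturing groups.
  return match ? match.slice(1) : null;
}

console.log(splitAddress("Driesstraat(Ide) 20").join("|")); // Driesstraat|20|
console.log(splitAddress("Driesstraat 20").join("|"));      // Driesstraat|20|
```

The municipality part sits in a non-capturing group, so it is dropped from the result. Writing it as (?:\(([^()]*)\))? instead would expose it as an extra array element, which is undefined when the brackets are absent.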
