
How to split string by punctuation that ignores the period in numbers

I'm using the following code in JavaScript to split a string into phrases.

var result = str.match( /[^\n\.!\?\;:]+[\n\.!\?\;:]+/g );
let elements = result.map(element => element.trim());
elements = elements.filter(function (el) {return el != null && el != "";});

It works OK. My problem is when the string contains numbers in the thousands written with a dot as the separator, as some people do, like 1.500. How can I alter this so that it only separates the phrases if the punctuation is followed by a space?

You can use

/(?:[^\n.!?;:]|[\n.!?;:](?!\s))+[\n.!?;:]+/g

See the regex demo. The point is that you either match any char other than the punctuation of your choice, or a punctuation char not followed with a whitespace, one or more times, and then one or more punctuation symbols of your choice.

Details:

  • (?: - start of a non-capturing group:
    • [^\n.!?;:] - any char but a newline, ., !, ?, ; or :
  • | - or
    • [\n.!?;:](?!\s) - a newline, ., !, ?, ; or : not followed with a whitespace
  • )+ - end of the group, repeated one or more times
  • [\n.!?;:]+ - one or more newline, ., !, ?, ; or : chars.
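To see what the (?!\s) lookahead does on its own, here is a small sketch that runs only the second alternative of the group against a sample string. The names innerPunct, text and positions are made up for illustration, and matchAll assumes an ES2020-capable runtime.

```javascript
// Only the second alternative of the group: punctuation NOT followed by whitespace.
const innerPunct = /[\n.!?;:](?!\s)/g;

// Sample input (an assumption, not from the question).
const text = 'Costs 1.500. Done.';

// Collect the index of every punctuation char that the lookahead lets through.
const positions = [...text.matchAll(innerPunct)].map(m => m.index);

console.log(positions); // [7, 17]
```

Index 7 is the dot inside 1.500, so it stays part of the phrase. The dot at index 11 is followed by a space and is skipped. Index 17 is the final dot: nothing follows it, so the lookahead also lets it through, but in the full pattern the trailing [\n.!?;:]+ forces the group to give it back, and it still ends the phrase.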

See a JavaScript demo:

var s = 'It works ok. My problem is when the string has numbers in the thousands marked with a dot that some people use like 1.500. How can alter this so that it only separates the phrases if the punctuation is followed by a space.';
var rx = /(?:[^\n.!?;:]|[\n.!?;:](?!\s))+[\n.!?;:]+/g;
console.log( s.match(rx) );
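If you want to keep the trim and filter steps from the question, one possible way to combine them with this regex is sketched below. The names PHRASE_RX and splitPhrases are hypothetical, and the fallback to an empty array is my assumption about what you want when nothing matches.

```javascript
// The regex from the answer above; PHRASE_RX is just an illustrative name.
const PHRASE_RX = /(?:[^\n.!?;:]|[\n.!?;:](?!\s))+[\n.!?;:]+/g;

// Hypothetical helper: split a string into trimmed, non-empty phrases.
function splitPhrases(str) {
  // String.prototype.match returns null when nothing matches, so fall back to [].
  const matches = str.match(PHRASE_RX) || [];
  return matches
    .map(phrase => phrase.trim())
    .filter(phrase => phrase !== '');
}

console.log(splitPhrases('Price is 1.500. Call now! Really?'));
// [ 'Price is 1.500.', 'Call now!', 'Really?' ]
```

Note that the pattern requires each phrase to end with punctuation, so trailing text with no closing punctuation is not returned.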


 