Separate sentence with special characters into words including spaces
I want to split a sentence containing special characters into words while keeping the spaces, like so:
"la sílaba tónica es la penúltima".split(...regex...)
to:
["la ", "sílaba ", "tónica ", "es ", "la ", "penúltima"]
↑ ↑ ↑ ↑
space space space space
I've tried a modified version of this answer: https://stackoverflow.com/a/26184632/2083117
With the code from that answer:
"la sílaba tónica es la penúltima".split(/\b(?![\s.])/)
Result:
["la ", "s", "í", "laba ", "t", "ó", "nica ", "es ", "la ", "pen", "ú", "ltima"]
↑ ↑ ↑
Those special characters shouldn't split the word.
My version simply adds the special characters I want to keep (.áéíóúñ,:;?):
"la sílaba tónica es la penúltima".split(/\b(?![\s.áéíóúñ,:;?])/)
Result:
["la ", "sí", "laba ", "tó", "nica ", "es ", "la ", "penú", "ltima"]
↑ ↑ ↑
Now the characters are included, but the word is breaking right after them.
What would be the right regular expression for this?
Try to match \S+\s* instead of splitting.
var result = "la sílaba tónica es la penúltima".match(/\S+\s*/g);
console.log(result);
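A rough sketch of why matching works where the `\b` attempts did not: in JavaScript, `\b` is defined in terms of `\w`, which covers essentially only ASCII letters, digits, and underscore, so every accented letter looks like a word boundary. `\S+\s*` never consults word characters at all. The helper name `splitKeepingSpaces` is made up for illustration, and the `?? []` fallback is an addition, since `match` with the `g` flag returns `null` when nothing matches.

```javascript
// Hypothetical helper: split a sentence into words,
// each word keeping the whitespace that follows it.
function splitKeepingSpaces(sentence) {
  // \S+ takes a run of non-whitespace (accents and punctuation included),
  // \s* takes the whitespace after it, if any.
  // match() with the g flag returns null on no match, hence the fallback.
  return sentence.match(/\S+\s*/g) ?? [];
}

console.log(splitKeepingSpaces("la sílaba tónica es la penúltima"));
// ["la ", "sílaba ", "tónica ", "es ", "la ", "penúltima"]

console.log(splitKeepingSpaces("   "));
// []
```

Note that whitespace at the very start of the input is dropped, because each match has to begin with a non-whitespace character.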
let splitArray = "la sílaba tónica es la penúltima".split(" ");
// Re-append the space to every word except the last one.
let splitArrayWithSpaces = splitArray.map((item, index) => {
  if (index !== splitArray.length - 1) return item + " ";
  else return item;
});
console.log(splitArrayWithSpaces);
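If you would rather stay with `split` in a single step, one possible variant is to split at the zero-width position between whitespace and the next non-whitespace character. This is only a sketch: it assumes an engine with lookbehind support (for example a recent Node.js or Chromium), and the function name `splitAtWordStarts` is hypothetical.

```javascript
// Hypothetical helper: split just before each word, so the
// whitespace stays attached to the word that precedes it.
function splitAtWordStarts(sentence) {
  // (?<=\s) : the previous character is whitespace (lookbehind)
  // (?=\S)  : the next character is not whitespace (lookahead)
  // Neither assertion consumes characters, so nothing is removed.
  return sentence.split(/(?<=\s)(?=\S)/);
}

console.log(splitAtWordStarts("la sílaba tónica es la penúltima"));
// ["la ", "sílaba ", "tónica ", "es ", "la ", "penúltima"]
```

Unlike `split(" ")`, this keeps runs of several spaces attached to the preceding word instead of producing empty strings.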
The range a-z\xC0-\xff selects letters, including those with diacritics. I split on /[^a-z\xC0-\xff]/ and then add the space back. Alternatively, you can split on /[\s]/.
let test = "la sílaba tónica es la penúltima".split(/[^a-z\xC0-\xff]/);
// Add the space back to every word except the last one.
for (let i = 0; i < test.length - 1; i++) {
  test[i] += " ";
}
console.log(test);
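The `\xC0-\xff` range only covers Latin-1 letters and is case-sensitive as written. A possible generalization, assuming an engine that supports Unicode property escapes with the `u` flag, is to split on anything that is not a letter (`\p{L}`) or a combining mark (`\p{M}`). The function name `splitOnNonLetters` is invented for this sketch, and like the code above it discards punctuation.

```javascript
// Hypothetical helper: split on runs of non-letter characters,
// then re-append a single space to every word except the last.
function splitOnNonLetters(sentence) {
  const words = sentence
    .split(/[^\p{L}\p{M}]+/u) // \p{L}: any letter, \p{M}: combining marks
    .filter(Boolean);         // drop empty strings from leading/trailing separators
  return words.map((word, i) => (i < words.length - 1 ? word + " " : word));
}

console.log(splitOnNonLetters("la sílaba tónica es la penúltima"));
// ["la ", "sílaba ", "tónica ", "es ", "la ", "penúltima"]
```

Including `\p{M}` also keeps words intact when the accents are stored as separate combining characters (decomposed NFD text), which a fixed code-point range would split.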