
Trouble splitting a string in JavaScript using regular expressions

I am VERY new to regex in JS and am having a very hard time getting it to do what I want.

I have a series of strings that I am trying to strip of unusual characters, spaces, newlines, etc. and put into arrays where each entry is a word consisting only of alphanumeric characters.

For example:

    let testString = "this*is " + "a\n" + " test string "
    let test = testString.split(/\W/)
    console.log(test)

Yields [ 'this', 'is', 'a', '', 'test', 'string', '' ]

But ideally I would like it to yield [ 'this', 'is', 'a', 'test', 'string' ]

I can achieve the desired result by adding .filter(word => word !== '') to the end, but I am wondering whether there is a way to do this using only regular expressions. Also, would it be necessary to add a global flag to the regex?

Thanks in advance for any input!

Just a simple one-liner:

const getWords = s => ( s ?? "" ).match( /\w+/g );

Then...

const words = getWords( 'this, that & the other;etc.' );

yields

[ 'this', 'that', 'the', 'other', 'etc' ]
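
Applied to the string from the question, the match-based approach might look like the sketch below. The `?? []` fallback and the name `getWordsSafe` are assumptions added for illustration and are not part of the answer above: `match` with the `g` flag returns `null` when nothing matches, which callers may not expect. The `g` flag itself does matter here, because without it `match` returns only the first match.

```javascript
// Hypothetical variant of getWords: falls back to [] because
// String.prototype.match returns null when there are no matches.
const getWordsSafe = s => (s ?? "").match(/\w+/g) ?? [];

const testString = "this*is " + "a\n" + " test string ";

const words = getWordsSafe(testString);
console.log(words); // [ 'this', 'is', 'a', 'test', 'string' ]

// A string with no word characters gives an empty array, not null.
const none = getWordsSafe(" *\n ");
console.log(none); // []

// Without the g flag, match stops at the first match.
const first = testString.match(/\w+/)[0];
console.log(first); // this
```

Matching the words directly avoids empty entries altogether, since there is nothing between separators to collect.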

Use the trim function to remove whitespace around the string before splitting, and split on \W+ rather than \W so that a run of consecutive separators (such as the newline followed by a space) does not produce an empty entry. More information: https://www.w3schools.com/jsref/jsref_trim_string.asp

    test = testString.trim().split(/\W+/);
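
To see how the split-based variants behave on the question's string, here is a small comparison. It is a sketch rather than a definitive answer, and `splitWords` is a hypothetical helper name introduced only for this example.

```javascript
const testString = "this*is " + "a\n" + " test string ";

// trim() only removes leading/trailing whitespace; the "\n " run in the
// middle still produces an empty entry when splitting on a single \W.
const trimmed = testString.trim().split(/\W/);
console.log(trimmed); // [ 'this', 'is', 'a', '', 'test', 'string' ]

// \W+ treats a run of separators as one split point.
const collapsed = testString.trim().split(/\W+/);
console.log(collapsed); // [ 'this', 'is', 'a', 'test', 'string' ]

// Leading or trailing punctuation is not whitespace, so trim() leaves it
// in place and split yields '' at the edges. filter(Boolean) drops them.
const edges = "*odd  input!".split(/\W+/);
console.log(edges); // [ '', 'odd', 'input', '' ]

const splitWords = s => s.split(/\W+/).filter(Boolean);
console.log(splitWords("*odd  input!")); // [ 'odd', 'input' ]
```

On the global flag: `split` already splits at every match, so adding `g` to the regex passed to `split` makes no difference.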
