
How can I split this string in JavaScript?

I have strings like this:

ab
rx'
wq''
pok'''
oyu,
mi,,,,

Basically, I want to split each string into two parts. The first part should contain the alphabetical characters intact; the second part should contain the non-alphabetical characters. The alphabetical part is guaranteed to be 2-3 lowercase characters between a and z; the non-alphabetical part can be any length (including zero, as in ab), and is guaranteed to consist only of the character , or the character ' , but never both in the same string (e.g. eex,', will never occur).

So the result should be:

[ab][]
[rx][']
[wq]['']
[pok][''']
[oyu][,]
[mi][,,,,]

How can I do this? I'm guessing a regular expression, but I'm not particularly adept at coming up with them.

If you can 100% guarantee that:

  1. Letter-strings are 2 or 3 characters
  2. There are always one or more primes/commas
  3. There is never any whitespace before, after or in between the letters and the marks
    (aside from line breaks)

You can use:

/^([a-zA-Z]{2,3})('+|,+)$/gm

var arr = /^([a-zA-Z]{2,3})('+|,+)$/gm.exec("pok'''");
// arr contains: ["pok'''", "pok", "'''"]
// (comparing arrays with === would always be false, hence the comment)

var arr = /^([a-zA-Z]{2,3})('+|,+)$/gm.exec("baf,,,");
// arr contains: ["baf,,,", "baf", ",,,"]

Of course, save yourself some sanity, and save that RegEx as a var.
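
One caveat when storing the pattern: with the g flag, a regex object remembers where its last match ended (its lastIndex), so reusing it on a different string can fail unexpectedly. The sketch below uses made-up variable names and a made-up multi-line sample. It shows the pitfall, and also the case where g and m are useful: looping over several lines held in one string.

```javascript
// Variable names and sample text are illustrative, not from the answer.
var stored = /^([a-zA-Z]{2,3})('+|,+)$/gm;

// Pitfall: a successful exec leaves lastIndex at the end of the match,
// so the next exec on a *different* string resumes searching from there.
var first = stored.exec("pok'''");   // matches; lastIndex is now 6
var second = stored.exec("baf,,,");  // null: the search started at index 6

// Where g and m help: walking every line of one multi-line string.
var text = "rx'\nwq''\npok'''\noyu,\nmi,,,,";
var perLine = /^([a-zA-Z]{2,3})('+|,+)$/gm;
var pairs = [];
var m;
while ((m = perLine.exec(text)) !== null) {
  pairs.push([m[1], m[2]]); // [letters, marks]
}
console.log(pairs);
```

If you only ever test one string at a time, dropping the g flag sidesteps the problem.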

And as a warning, in case you haven't dealt with RegEx like this before: if a match isn't found (you try to match foo','' with mixed marks, or you have 0-1 or 4+ letters, or 0 marks), then instead of getting an array back, you'll get null.

So you can do this:

// g flag dropped: a stored global regex resumes from lastIndex on the next exec
var reg = /^([a-zA-Z]{2,3})('+|,+)$/m,
    string = "foobar'',,''",

    result_array = reg.exec(string) || [string];

In this case, the result of the exec is null; by putting the || (or) there, we can return an array that has the original string in it, at index 0.

Why?

Because the result of a successful exec will have 3 slots: [*string*, *letters*, *marks*]. You might be tempted to just read the letters with result_array[1]. But if the match failed and result_array === null, then JavaScript will scream at you (with a TypeError) for trying null[1].

So falling back to that array after a failed exec lets you get result_array[1] === undefined (i.e. there was no match for the pattern, so there are no letters at index 1), rather than a JS error.
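
As a rough sketch of how that fallback might be packaged (the wrapper name splitMarks and the returned object shape are hypothetical, not part of the answer):

```javascript
// Hypothetical wrapper; the name splitMarks is illustrative only.
var reg = /^([a-zA-Z]{2,3})('+|,+)$/; // no g flag, so exec keeps no state

function splitMarks(string) {
  // Fallback from the answer: a failed exec becomes [string]
  var result = reg.exec(string) || [string];
  return { input: result[0], letters: result[1], marks: result[2] };
}

console.log(splitMarks("pok'''"));       // letters and marks are filled in
console.log(splitMarks("foobar'',,''")); // letters and marks are undefined
```

Callers can then test letters === undefined instead of guarding against null.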

Regular expressions have a nice special token called "word boundary" (\b). You can use it, well, to detect the boundary of a word, where a word is a sequence of alphanumeric characters (plus underscore).

So all you have to do is

foo.split(/\b/)

For example,

"pok'''".split(/\b/) // ["pok", "'''"]

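To see how the word-boundary split behaves on the question's samples, here is a small sketch (the helper name splitAtBoundary is made up). A string with no marks, such as ab, has no interior word boundary, so split returns a single-element array and the second part has to be defaulted. This approach leans on the question's guarantee that the first part is only lowercase letters, since \b also counts digits and underscores as word characters.

```javascript
// Hypothetical helper built on the split(/\b/) idea above.
function splitAtBoundary(str) {
  var parts = str.split(/\b/);
  // With no marks there is only one piece, so default the second to ""
  return [parts[0], parts[1] || ""];
}

var samples = ["ab", "rx'", "wq''", "pok'''", "oyu,", "mi,,,,"];
var results = samples.map(splitAtBoundary);
console.log(results);
```
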
You could try something like this:

function splitString(string){
   // indexOf returns -1 (not 0) when the character is not found
   var match1 = string.indexOf(',');
   var match2 = string.indexOf("'");
   var stringArray;
   if(match1 != -1){
      // slice excludes its end index, so no -1 offsets are needed
      stringArray = [string.slice(0,match1),string.slice(match1)];
   }
   else if(match2 != -1){
      stringArray = [string.slice(0,match2),string.slice(match2)];
   }
   else{
      stringArray = [string];
   }
   return stringArray;
}

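var str = "mi,,,,";
var idx = str.search(/\W/); // index of the first non-word character, or -1 if none
var list = [str, ""];       // fallback when the string has no marks
if(idx !== -1) {
    list = [str.slice(0, idx), str.slice(idx)];
}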

You'll have the parts in list[0] and list[1].

P.S. There might be better ways than this.

yourStr.match(/(\w{2,3})([,']*)/)

if (match = string.match(/^([a-z]{2,3})(,+?$|'+?$)/)) {
    match = match.slice(1);
}
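
Note that the + quantifiers here require at least one mark, so the question's first sample, ab, would not match. A minimal sketch of a variant that accepts zero marks, assuming the input guarantees stated in the question hold (the function name splitLettersAndMarks is hypothetical):

```javascript
// Hypothetical helper, not taken from any of the answers above.
function splitLettersAndMarks(str) {
  // ('*|,*) allows zero or more of ONE kind of mark, never a mix
  var m = /^([a-z]{2,3})('*|,*)$/.exec(str);
  return m ? [m[1], m[2]] : null; // null when the string breaks the rules
}

console.log(splitLettersAndMarks("ab"));     // letters plus an empty second part
console.log(splitLettersAndMarks("mi,,,,"));
console.log(splitLettersAndMarks("eex,',")); // mixed marks are rejected
```

Using * in both alternatives keeps the "one kind of mark only" rule while letting the second part be empty.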
