
Unexpected behaviour when trying to use String.split("\\?")

I have a string like this:

"Some text here?Some number here"

and I need to split it at the question mark. I am using String.split("\\?"), but if I have a string like this:

"This is a string with, comma?1234567"

it is also split at the comma (,). And if I have this string:

"That´s a problem here?123456"

it also splits on ´. How can I fix this?

I am not seeing this behaviour (nor would I expect to):

String s ="hello?1000";

String[] fields = s.split("\\?");

for (String field : fields) {
   System.out.println(field);
}

yields:

hello

1000

Introducing a comma "," or an apostrophe "'" doesn't make any difference to the split:

String s ="he,llo?1000";

yields:

he,llo

1000

String s ="he'llo?1000";

yields:

he'llo

1000

The split also works fine if you have any spaces in your input string. I can only suggest that your regex is not what you think it is!
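One way to rule out an escaping mistake is to let the JDK build the literal pattern. The sketch below is only an illustration, not the asker's code: the class and method names are made up for this example, and it uses Pattern.quote so that the question mark is matched literally without any hand-written backslashes.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is an assumption for illustration only.
final class QuestionMarkSplitDemo {

    // Pattern.quote makes the regex engine treat "?" as a literal character,
    // so there is no doubt about how many backslashes are needed.
    static String[] splitOnQuestionMark(String input) {
        return input.split(Pattern.quote("?"));
    }

    public static void main(String[] args) {
        String[] samples = {
            "Some text here?Some number here",
            "This is a string with, comma?1234567",
            "That´s a problem here?123456"
        };
        for (String sample : samples) {
            // Printing the pieces makes any stray split visible immediately.
            System.out.println(Arrays.toString(splitOnQuestionMark(sample)));
        }
    }
}
```

If the pieces printed here look right but the real program still splits elsewhere, the pattern actually reaching split() is probably different from the one shown in the question.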

This is the solution (EDIT: it's even simpler):

public static Pair<String,String> getSplittedByQuestionMark(String term){
    // "[?]" is a character class, so the question mark needs no backslash escaping
    String[] list=term.split("[?]");
    return new Pair<String,String>(list[0],list[1]);
}

I tested it:

@Test
public void testGetSplittedByQuestionMark(){
    ArrayList<String> terms=new ArrayList<String>();
    ArrayList<Pair<String,String>> expected=new ArrayList<Pair<String,String>>();
    terms.add("test?a");
    terms.add("test?20");
    terms.add("test, with comma?ab10");
    expected.add(new Pair<String,String>("test","a"));
    expected.add(new Pair<String,String>("test","20"));
    expected.add(new Pair<String,String>("test, with comma","ab10"));
    for(int i=0;i<terms.size();i++){
        Pair<String,String> answer = StringStandardRegex.getSplittedByQuestionMark(terms.get(i));
        assertTrue("answer="+answer.getFirst(),answer.getFirst().equals(expected.get(i).getFirst()));
        assertTrue("answer="+answer.getSecond(),answer.getSecond().equals(expected.get(i).getSecond()));
    }

}

[EDIT after remark below] I have added a test. Now I don't see what the problem is; this works as well (and is even simpler):

@Test
public void testGetSplittedByQuestionMarkNotUsingRegex(){
    ArrayList<String> terms=new ArrayList<String>();
    ArrayList<Pair<String,String>> expected=new ArrayList<Pair<String,String>>();
    terms.add("test?a");
    terms.add("test?20");
    terms.add("test, with comma?ab10");
    expected.add(new Pair<String,String>("test","a"));
    expected.add(new Pair<String,String>("test","20"));
    expected.add(new Pair<String,String>("test, with comma","ab10"));
    for(int i=0;i<terms.size();i++){
        String[] answer=terms.get(i).split("\\?");
        assertTrue("answer="+answer[0],answer[0].equals(expected.get(i).getFirst()));
        assertTrue("answer="+answer[1],answer[1].equals(expected.get(i).getSecond()));
    }

}
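Both versions above index the second array element directly, which throws an ArrayIndexOutOfBoundsException when the input contains no question mark. A slightly more defensive variant could look like the sketch below. It is an assumption-laden illustration: the Pair class is not shown in the answer, so Map.Entry from the JDK stands in for it, and the class and method names are invented here.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical helper; Map.Entry replaces the answer's Pair class, which is not shown.
final class QuestionMarkPair {

    // Splits at the first '?' only; any later '?' stays in the second part.
    static Map.Entry<String, String> splitAtFirstQuestionMark(String term) {
        String[] parts = term.split("\\?", 2); // limit 2 = at most two pieces
        if (parts.length < 2) {
            throw new IllegalArgumentException("no '?' found in: " + term);
        }
        return new SimpleImmutableEntry<>(parts[0], parts[1]);
    }

    public static void main(String[] args) {
        Map.Entry<String, String> pair = splitAtFirstQuestionMark("test, with comma?ab10");
        System.out.println(pair.getKey() + " | " + pair.getValue());
    }
}
```

Passing a limit of 2 also keeps an empty second part (as in "test?"), which the one-argument split would silently drop.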

Looks like a typical regex problem. For example, I am using this to split

name (code)

into a pair with the name and the code separate:

RE regex = new RE("(.*) \\W(.*)\\W");
if(!regex.match(term)){
    throw new InvalidArgumentException("the given term does not match the regular expression:'NAME (ID)'");
}
Pair<String,String> pair=new Pair<String,String>(regex.getParen(1),regex.getParen(2));
return pair;
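The RE class used above is not part of the JDK, and the answer does not say which library it comes from. As a rough equivalent, the same idea can be sketched with java.util.regex. Note the deliberate differences: this version matches literal parentheses instead of \W (any non-word character), requires the whole input to match, and returns a plain String array instead of the answer's Pair. The class and method names are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser using java.util.regex in place of the answer's RE class.
final class NameCodeParser {

    // Group 1 is the name, group 2 is the text inside the parentheses.
    private static final Pattern NAME_CODE = Pattern.compile("(.*) \\((.*)\\)");

    static String[] parse(String term) {
        Matcher matcher = NAME_CODE.matcher(term);
        if (!matcher.matches()) {
            throw new IllegalArgumentException(
                "the given term does not match the pattern 'NAME (ID)': " + term);
        }
        return new String[] { matcher.group(1), matcher.group(2) };
    }

    public static void main(String[] args) {
        String[] pair = parse("name (code)");
        System.out.println(pair[0] + " | " + pair[1]);
    }
}
```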
