
Java: Parsing a string based on delimiter

I have to design an interface that fetches data from a machine and then plots it. I have already written the fetch part, and it returns a string of the format A&B@.13409$13400$13400$13386$13418$13427$13406$13383$13406$13412$13419$00000$00000$

The first five characters, A&B@., are the identifier. Please note that the fifth character (shown here as .) is a line feed, i.e. ASCII 0xA.

The function I have written:

   public static boolean checkStart(String str, String startStr) {
       // First five characters: the identifier, including the line feed
       String initials = str.substring(0, 5);
       System.out.println("Here is start: " + initials);
       return startStr.equals(initials);
   }

prints Here is start: A&B@. which is correct.

Question 1: Why do I need str.substring(0,5)? When I use str.substring(0,4) it prints only Here is start: A&B@, i.e. the line feed is missing. Why does the line feed make this difference?

Further, to extract the remaining string I have to use s.substring(5,s.length()) instead of s.substring(6,s.length()).

That is, s.substring(6,s.length()) produces 3409$13400$13400$13386$13418$13427$13406$13383$13406$13412$13419$00000$00000$ which is missing the first character after the identifier A&B@.

Question 2:

My parsing function is:

public static String[] StringParser(String str, String del) {
    // del is passed to String.split, which treats it as a regular expression
    String[] sParsed = str.split(del);
    for (int i = 0; i < sParsed.length; i++) {
        System.out.println(sParsed[i]);
    }
    return sParsed;
}

It parses correctly for a string such as String s = "A&B@.13409/13400/13400/13386/13418/13427/13406/13383/13406/13412/13419/00000/00000/"; when the function is called as String[] tokens = StringParser(rightChannelString,"/");

But for a string such as String s = "A&B@.13409$13400$13400$13386$13418$13427$13406$13383$13406$13412$13419$00000$00000$" , the call String[] tokens = StringParser(rightChannelString,"$"); does not split the string at all.

I am not able to figure out why it behaves this way. Can anyone please let me know the solution?

Thanks

Regarding question 1, the Java API says that the substring method takes two parameters:

  • beginIndex: the begin index, inclusive.
  • endIndex: the end index, exclusive.

So in your example:

String: A&B@.134
Index:  01234567

substring(0,4) covers indexes 0 to 3, i.e. A&B@. That is why you have to pass 5 as the second parameter to include the line feed.
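The index arithmetic above can be checked with a small sketch. The class name SubstringDemo and the helpers identifier and payload are hypothetical names chosen for illustration, and the sample string is shortened from the one in the question, with the line feed written as \n:

```java
class SubstringDemo {
    // Hypothetical helper: the first five characters (indexes 0 to 4),
    // i.e. "A&B@" plus the line feed at index 4.
    static String identifier(String s) {
        return s.substring(0, 5);
    }

    // Hypothetical helper: everything from index 5 onward.
    // substring(5) is equivalent to substring(5, s.length()).
    static String payload(String s) {
        return s.substring(5);
    }

    public static void main(String[] args) {
        String s = "A&B@\n13409$13400$"; // shortened sample, assumed format
        System.out.println(identifier(s).length()); // 5
        System.out.println(s.substring(0, 4));      // A&B@ (line feed excluded)
        System.out.println(payload(s));             // 13409$13400$
    }
}
```

Because endIndex is exclusive, the end index of the identifier (5) is also the begin index of the payload, which is why substring(6, ...) drops one character.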

Regarding question 2, the split method takes a regular expression as its parameter, and $ is a special character in regular expressions. To match a literal dollar sign you have to escape it with a backslash; since the backslash is itself a special character in Java string literals, it must be escaped too, giving \\$:

String[] tokens = StringParser(rightChannelString,"\\$");
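As a rough sketch of the two usual ways to split on a literal dollar sign: escaping it by hand, or letting java.util.regex.Pattern.quote do the escaping so the caller can pass any delimiter. The class name SplitDemo and the helper splitLiteral are hypothetical, and the data string is shortened from the question:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // Hypothetical helper: treats the delimiter as plain text, not as a regex.
    static String[] splitLiteral(String str, String delimiter) {
        return str.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String data = "13409$13400$00000$"; // shortened sample

        // Option 1: escape the dollar sign by hand.
        String[] escaped = data.split("\\$");
        System.out.println(Arrays.toString(escaped)); // [13409, 13400, 00000]

        // Option 2: quote the delimiter so regex metacharacters are literal.
        String[] quoted = splitLiteral(data, "$");
        System.out.println(Arrays.toString(quoted));  // [13409, 13400, 00000]
    }
}
```

Note that split discards trailing empty strings, so the final $ in the data does not produce an extra empty token.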

Q1: Review the description of substring in the documentation:

Returns a new string that is a substring of this string.
The substring begins at the specified beginIndex and extends to the
character at index endIndex - 1. Thus the length of the substring
is endIndex-beginIndex. 

Q2: The split method takes a regular expression for the separator. $ is a special character in regular expressions: it is an anchor that matches at the end of the input (or at the end of each line in multiline mode), not a literal dollar sign.
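A small sketch of why the unescaped call appears to do nothing: as a regex, $ only matches the empty position at the end of the input, so no dollar character is ever consumed and the whole string comes back as a single token. The class name DollarAnchorDemo and the helper tokenCount are hypothetical, and the sample string is shortened:

```java
class DollarAnchorDemo {
    // Hypothetical helper: number of tokens produced by splitting on the given regex.
    static int tokenCount(String str, String regex) {
        return str.split(regex).length;
    }

    public static void main(String[] args) {
        String data = "13409$13400$"; // shortened sample

        // "$" is the end-of-input anchor: the string is returned unsplit.
        System.out.println(tokenCount(data, "$"));   // 1

        // "\\$" is a literal dollar sign: the string is split into two tokens.
        System.out.println(tokenCount(data, "\\$")); // 2
    }
}
```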
