Don't trim tab (\t) from start/end of a String in Java
I have an input stream whose fields are separated by tabs (\t). It looks like this:
String str = " acc123\tdpId123\t2011-01-01\t2022-01-01\thello@xyz.com\tIN\t1233\tSOMETHING ";
This works fine when I do
str = str.trim();
and
strArray = str.split("\t", -1);
strArray = ["acc123","dpId123","2011-01-01","2022-01-01","hello@xyz.com","IN","1233","SOMETHING"], which has size 8.
But the last field in the input record is not mandatory and can be skipped.
So the input can also look like this:
String str1 = "acc123\tdpId123\t2011-01-01\t2022-01-01\thello@xyz.com\tIN\t1233\t";
In this case the last field should be empty, but when I trim and split this string the size is 7:
str1 = str1.trim();
strArray = str1.split("\t", -1);
strArray = ["acc123","dpId123","2011-01-01","2022-01-01","hello@xyz.com","IN","1233"], which has size 7.
But I want
strArray=["acc123","dpId123","2011-01-01","2022-01-01","hello@xyz.com","IN","1233",""]
How can I avoid this situation?
There you go:
String str1 = " acc123\tdpId 123\t201 1-01-01\t2022-01-01\thello@xyz.com\tIN\t1233\t";
str1 = str1.replaceAll("^[ ]+", ""); // removing leading spaces
str1 = str1.replaceAll("[ ]+$", ""); // removing trailing spaces
String[] split = str1.split("\t", -1);
System.out.println(Arrays.toString(split));
System.out.println(split.length);
The String#trim method also removes \t. To handle that, I removed only the leading and trailing spaces using a regex.
Output:
[acc123, dpId 123, 201 1-01-01, 2022-01-01, hello@xyz.com, IN, 1233, ]
8
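To make the difference concrete, here is a minimal sketch that contrasts trim() with stripping only space characters. The class name and the stripSpacesOnly helper are made up for illustration, and the sample record is shortened to two fields plus a trailing tab.

```java
import java.util.Arrays;

class TrimVersusSpaces {

    // Hypothetical helper: strips only leading and trailing U+0020 spaces,
    // leaving tabs untouched so that a trailing empty field survives.
    static String stripSpacesOnly(String s) {
        return s.replaceAll("^ +| +$", "");
    }

    public static void main(String[] args) {
        String record = " a\tb\t";

        // trim() removes every leading/trailing char <= U+0020, tab included.
        String[] afterTrim = record.trim().split("\t", -1);

        // Only spaces are stripped, so the trailing tab still delimits an empty field.
        String[] afterStrip = stripSpacesOnly(record).split("\t", -1);

        System.out.println(afterTrim.length + " " + Arrays.toString(afterTrim));   // 2 [a, b]
        System.out.println(afterStrip.length + " " + Arrays.toString(afterStrip)); // 3 [a, b, ]
    }
}
```

The single alternation regex does the same job as the two replaceAll calls above; which form you prefer is a matter of taste.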
You can use split like so:
String[] split = str.split("\t", -1); // note the -1
To get rid of the surrounding spaces you can use
Arrays.stream(split).map(String::trim).toArray(String[]::new);
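Put together, the split-then-trim idea might look like the sketch below. The class and method names are assumptions for illustration. Note that this trims every field, not just the first and last, which may or may not be what you want.

```java
import java.util.Arrays;

class SplitThenTrim {

    // Hypothetical helper: split on tabs first (keeping trailing empty fields),
    // then trim each resulting field individually.
    static String[] splitAndTrim(String line) {
        return Arrays.stream(line.split("\t", -1))
                .map(String::trim)
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] fields = splitAndTrim(" a\t b \t");
        System.out.println(fields.length);           // 3
        System.out.println(Arrays.toString(fields)); // [a, b, ]
    }
}
```

Because the split happens before any trimming, the trailing tab is still there to delimit the empty last field.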
You can use the limit parameter to solve this: str.split("\\t", -1).
The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. With a negative limit the pattern is applied as many times as possible and trailing empty strings are kept.
Read more about the split limit in the docs.
Example:
public class GFG {
public static void main(String args[])
{
String str = "a\tb\tc\t";
String[] arrOfStr = str.split("\t",-1);
for (String a : arrOfStr)
System.out.println(a);
System.out.println(arrOfStr.length);
}
}
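For comparison, here is a small sketch of how the three kinds of limit (zero, negative, positive) behave on the same input. The class name and the fieldCount helper are made up for this illustration.

```java
import java.util.Arrays;

class SplitLimitDemo {

    // Hypothetical helper: number of fields produced for a given limit.
    static int fieldCount(String input, int limit) {
        return input.split("\t", limit).length;
    }

    public static void main(String[] args) {
        String input = "a\tb\tc\t\t";

        // limit 0 (same as split("\t")): trailing empty strings are discarded.
        System.out.println(fieldCount(input, 0));   // 3

        // Negative limit: trailing empty strings are kept.
        System.out.println(fieldCount(input, -1));  // 5

        // Positive limit: at most that many elements; the last one holds the rest.
        System.out.println(fieldCount(input, 2));   // 2

        System.out.println(Arrays.toString(input.split("\t", -1))); // [a, b, c, , ]
    }
}
```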
The conceptually correct way to do this in your case is to split first, and only then trim the first and last elements:
String[] array = str.split("\t", -1); // -1 keeps a trailing empty field
array[0] = array[0].trim();
int last = array.length -1;
if (last > 0) {
array[last] = array[last].trim();
}
Also, if you know up front how many fields there are supposed to be, you should use that knowledge; otherwise you can still end up with an invalid number of fields:
int fieldsCount = getExpectedFieldsCount();
String[] array = str.split("\t", fieldsCount);
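As a rough sketch of using the expected count, the snippet below validates the number of fields before trimming. getExpectedFieldsCount() is not defined in this thread, so the sketch assumes a constant of 8 fields (taken from the sample record in the question); the RecordParser class and parse method are likewise hypothetical. It deliberately splits with -1 and checks the length, rather than passing the count as the limit, so that surplus fields show up as a mismatch instead of being folded into the last element.

```java
class RecordParser {

    // Assumption: the record layout in the question has 8 fields.
    static final int EXPECTED_FIELDS = 8;

    // Hypothetical helper: splits on tabs, validates the field count,
    // then trims only the first and last fields.
    static String[] parse(String line) {
        String[] fields = line.split("\t", -1);
        if (fields.length != EXPECTED_FIELDS) {
            throw new IllegalArgumentException(
                    "Expected " + EXPECTED_FIELDS + " fields but got " + fields.length);
        }
        fields[0] = fields[0].trim();
        int last = fields.length - 1;
        fields[last] = fields[last].trim();
        return fields;
    }

    public static void main(String[] args) {
        String line = " acc123\tdpId123\t2011-01-01\t2022-01-01\thello@xyz.com\tIN\t1233\t";
        String[] fields = parse(line);
        System.out.println(fields.length); // 8
    }
}
```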
Lastly, I advise you not to use whitespace as the data separator. Use something else. For example, see the CSV format; it is a lot better for these things.
Try this (the result array is in the variable resultArray):
String str1 = "acc123\tdpId123\t2011-01-01\t2022-01-01\thello@xyz.com\tIN\t1233\t";
String[] strArray = str1.split("\t");
String regex = ".*\\t$";
String[] resultArray;
if (str1.matches(regex)) {
resultArray = new String[strArray.length + 1];
resultArray[strArray.length] = "";
} else {
resultArray = new String[strArray.length];
}
for (int i= 0; i < strArray.length; i++) {
resultArray[i] = strArray[i];
}
System.out.println(resultArray.length);
System.out.println(Arrays.toString(resultArray));