Parsing comma delimited text in Java

If I have an ArrayList that has lines of data that could look like:

bob,      jones,    123-333-1111
    james, lee,  234-333-2222

How do I delete the extra whitespace and get the same data back? I thought you could maybe split the string by "," and then use trim(), but I don't know what the syntax would be or how to implement it, assuming that is an OK way to do it, because I'd want to put each field in an array. So in this case I'd have a [2][3] array, and then put it back in the ArrayList after removing the whitespace. But that seems like a funny way to do it, and not scalable if my list changed, like having an email on the end. Any thoughts? Thanks.

Edit: A dumber question: I'm still not sure how I can process the data, because I can't do this, right?

for (String s : myList) {
    String st[] = s.split(",\\s*");
}

since st[] will go out of scope after the foreach loop. And if I declare String st[] beforehand, I wouldn't know how big to make the array, right? Thanks.

You could just scan through the entire string and build a new string, skipping any whitespace that occurs after a comma. This should be more efficient than splitting and rejoining, since it makes a single pass and creates no intermediate array. Something like this should work:

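String str = "bob,      jones,    123-333-1111"; // your original string from the list
StringBuilder sb = new StringBuilder();
boolean skip = true; // true at the start so leading whitespace is dropped too

for (int i = 0; i < str.length(); i++) {
  char ch = str.charAt(i);

  // drop whitespace at the start of the line or right after a comma
  if (skip && Character.isWhitespace(ch))
    continue;

  sb.append(ch);

  // keep skipping only while we are directly behind a comma
  skip = (ch == ',');
}

String result = sb.toString(); // "bob,jones,123-333-1111"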

If you use a regex for your split, you can specify a comma followed by optional whitespace (which includes spaces and tabs, just in case).

String[] fields = mystring.split(",\\s*");

Depending on whether you want to parse each line separately or not, you may first want to create an array by splitting on the line break:

String[] lines = mystring.split("\\n");
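The two splits above can be combined, and collecting the per-line results in a growable list also sidesteps the scoping and array-size concern raised in the question. This is only a sketch: the class name `RowCollector` and the helper `parseRows` are made up for illustration, and using `\\s*,\\s*` together with `trim()` (so that leading whitespace on a line is handled too) is my own choice rather than something stated in the answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RowCollector {
    // Hypothetical helper: splits text into lines, then each line into trimmed fields.
    static List<String[]> parseRows(String text) {
        List<String[]> rows = new ArrayList<>(); // grows as needed, declared outside the loop
        for (String line : text.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines
            }
            // the regex consumes any whitespace on either side of each comma
            rows.add(trimmed.split("\\s*,\\s*"));
        }
        return rows;
    }

    public static void main(String[] args) {
        String text = "bob,      jones,    123-333-1111\n    james, lee,  234-333-2222";
        for (String[] row : parseRows(text)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Because each row is its own array, a line with an extra field (such as an email) simply produces a longer array; nothing needs to be sized up front.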

Just split() each line with the delimiter set to "," to get an array of Strings that still carry the extra whitespace, and then call trim() on the elements of that array, either as they are being used or in advance. Remember that trim() gives you back a new String object (String objects are immutable).
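A minimal sketch of that approach, showing why the result of trim() has to be assigned back. The class name `TrimInPlace` and the helper `splitAndTrim` are hypothetical names used only for this example.

```java
import java.util.Arrays;

class TrimInPlace {
    // Hypothetical helper: splits on commas and trims every field up front.
    static String[] splitAndTrim(String line) {
        String[] fields = line.split(",");
        for (int i = 0; i < fields.length; i++) {
            // Calling fields[i].trim() without assigning it would change nothing,
            // because trim() returns a new String instead of modifying the old one.
            fields[i] = fields[i].trim();
        }
        return fields;
    }

    public static void main(String[] args) {
        String[] fields = splitAndTrim("    james, lee,  234-333-2222");
        System.out.println(Arrays.toString(fields)); // [james, lee, 234-333-2222]
    }
}
```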

If I understood your problem, here is a solution:

    ArrayList<String> tmp = new ArrayList<String>();
    tmp.add("bob,      jones,    123-333-1111");
    tmp.add("    james, lee,  234-333-2222");

    ArrayList<String> fixedStrings = new ArrayList<String>();

    for (String i : tmp)    {
        System.out.println(i);
        String[] data = i.split(",");

        String result = "";
        for (int j = 0; j < data.length - 1; ++j)   {
            result += data[j].trim() + ", ";
        }

        result += data[data.length - 1].trim();

        fixedStrings.add(result);
    }

    System.out.println(fixedStrings.get(0));
    System.out.println(fixedStrings.get(1));

I guess it could be changed so that it does not create a second ArrayList. But it's scalable, so if you get lines in the future like "bob, jones , bobjones@gmail.com , 123-333-1111 " it will still work.
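One possible way to avoid the second list, assuming Java 8 or later, is to rewrite the elements in place with List.replaceAll. The class name `InPlaceCleanup` and the helper `cleanInPlace` are invented for this sketch, and the stream-based joining is a substitute for the manual loop above, not the answer's own code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class InPlaceCleanup {
    // Hypothetical helper: normalizes every line of the list to "a, b, c" in place.
    static void cleanInPlace(List<String> lines) {
        // replaceAll swaps each element for the value returned by the lambda
        lines.replaceAll(line -> Arrays.stream(line.split(","))
                .map(String::trim)
                .collect(Collectors.joining(", ")));
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>(Arrays.asList(
                "bob,      jones,    123-333-1111",
                "bob, jones , bobjones@gmail.com , 123-333-1111 "));
        cleanInPlace(lines);
        for (String line : lines) {
            System.out.println(line);
        }
    }
}
```

The list passed in must support element replacement; an unmodifiable list would throw UnsupportedOperationException.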

I have had great success using this library.

You can use the String.split() method in Java, or you can use the split() method from the Google Guava library's Splitter class, as shown below:

static final Splitter MY_SPLITTER = Splitter.on(',').trimResults().omitEmptyStrings();

Could be a bit more elegant, but it works...

ArrayList<String> strings = new ArrayList<String>();
strings.add("bob,      jones,    123-333-1111");
strings.add("james, lee,  234-333-2222");

for(int i = 0; i < strings.size(); i++) {
  StringBuilder builder = new StringBuilder();
  for(String str: strings.get(i).split(",\\s*")) {
    builder.append(str).append(" ");
  }
  strings.set(i, builder.toString().trim());
}

System.out.println("strings = " + strings);
