
How to extract data from a text file and write it into a CSV file in Java

I have a text file containing reference, name, address, amount, dateTo, dateFrom and mandatory columns, in the following format:

"120030125 J Blog  23, SOME HOUSE,                 259.44  21-OCT-2013  17-NOV-2013"
"                  SQUARE, STREET, LEICESTER,"
                   LE1 2BB

"120030318 R Mxx   37, WOOD CLOSE, BIRMINGHAM,     121.96  16-OCT-2013  17-NOV-2013  Y"                      
"                  STREET, NN18 8DF"

"120012174 JE xx   25, SOME HOUSE, QUEENS          259.44  21-OCT-2013  17-NOV-2013"
"                  SQUARE, STREET, LEICESTER,"
                   LE1 2BB

"100154992 DL x    23, SOME HOUSE, QUEENS          270.44  21-OCT-2013  17-NOV-2013  Y"             
"                  SQUARE, STREET, LEICESTER,"
                   LE1 2BC

I am only interested in the first line of each record. I want to extract the data in the reference, name, amount, dateTo and dateFrom columns and write it into a CSV file. So far I have only managed to write the following code, which extracts the first lines and strips the leading and trailing double quotes. The input file contains runs of whitespace, and so does the output file.

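import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReadTxt {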
    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader("C:/Users/me/Desktop/input.txt"));
        String pattern = "\"\\d\\d\\d\\d";

        // Create a Pattern object
        Pattern r = Pattern.compile(pattern);
        int i;
        ArrayList<String> list = new ArrayList<String>();

        boolean a = true;
        PrintWriter out = new PrintWriter(new PrintWriter("C:/Users/me/Desktop/Output.txt"), a);

        try {
            String line = br.readLine();

            while (line != null) {
                Matcher m = r.matcher(line);

                if (m.find()) {
                    String temp;
                    temp = line.substring(1, line.length() - 1);
                    list.add(temp);
                }
                else {
                // do nothing
                }

                line = br.readLine();
            }
        }
        finally {
            br.close();
        }

        for (i = 0; i < list.size(); i++) {
            out.println(list.get(i));
        }

        out.flush();
        out.close();
    }
}

The code above creates a text file with the following content:

120030125  J Blog   23, SOME HOUSE, QUEENS       259.44  21-OCT-2013  17-NOV-2013
120030318  R Mxx    37, WOOD CLOSE, BIRMINGHAM,  121.96  16-OCT-2013  17-NOV-2013  Y                      
120012174  JE xx    25, SOME HOUSE, QUEENS       259.44  21-OCT-2013  17-NOV-2013
100154992  DL x     23, SOME HOUSE, QUEENS       270.44  21-OCT-2013  17-NOV-2013  Y

My expected output is as follows, but written to a CSV file:

120030125  J Blog  259.44  21-OCT-2013  17-NOV-2013
120030318  R Mxx   121.96  16-OCT-2013  17-NOV-2013                        
120012174  JE xx   259.44  21-OCT-2013  17-NOV-2013
100154992  DL x    270.44  21-OCT-2013  17-NOV-2013

Any suggestions, links to tutorials or other help would be greatly appreciated, as I am not an expert in Java. I did try looking for tutorials on the internet, but could not find any that were useful in my case.

Here, test this out. I just used an array, but you can adapt the relevant code into yours. I changed some addresses (look at the 2nd and 3rd entries in the array) to have spaces, or no spaces, in different places for testing.

public class SplitData {

    public static void main(String[] args) {
        String[] array = {"120030125  J Blog   23, SOME HOUSE, QUEENS       259.44  21-OCT-2013  17-NOV-2013",
            "120030318  R Mxx    37,WOODCLOSE,BIRMINGHAM,  121.96  16-OCT-2013  17-NOV-2013  Y 0",
            "120012174  JE xx    25, SOME HOUSE,QUEENS       259.44  21-OCT-2013  17-NOV-2013",
            "100154992  DL x     23, SOME HOUSE, QUEENS       259.44  21-OCT-2013  17-NOV-2013  Y"  
        };

        String s1 = null;
        String s2 = null;
        String s3 = null;
        String s4 = null;
        String s5 = null;
        for (String s : array) {
            String[] split = s.split("\\s+");
            s1 = split[0];
            s2 = split[1] + " " + split[2];
            for (String string: split) {
                if (string.matches("\\d+\\.\\d{2}")) {
                    s3 = string;
                    break;
                }
            }
            String[] newArray = s.substring(s.indexOf(s3)).split("\\s+");
            s4 = newArray[1];
            s5 = newArray[2];

            System.out.printf("%s\t%s\t%s\t%s\t%s\n", s1, s2, s3, s4, s5);
        }
    }  
}

Output

120030125   J Blog  259.44  21-OCT-2013 17-NOV-2013
120030318   R Mxx   121.96  16-OCT-2013 17-NOV-2013
120012174   JE xx   259.44  21-OCT-2013 17-NOV-2013
100154992   DL x    259.44  21-OCT-2013 17-NOV-2013
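
Since the question asks for a CSV file rather than tab-separated console output, the same token-splitting idea can be pointed at a file. The following is only a minimal sketch, not part of the answer above: the class name `SplitToCsv`, the helper `toCsvRow` and the output file name `output.csv` are assumptions made for illustration. It also assumes that the name is always exactly two tokens and that none of the extracted fields contains a comma.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Arrays;
import java.util.List;

class SplitToCsv {

    // Hypothetical helper: turns one cleaned record line into a CSV row.
    // Assumes token 0 is the reference, tokens 1 and 2 form the name, and the
    // first later token shaped like "123.45" is the amount, followed by two dates.
    static String toCsvRow(String line) {
        String[] tokens = line.trim().split("\\s+");
        for (int i = 3; i + 2 < tokens.length; i++) {
            if (tokens[i].matches("\\d+\\.\\d{2}")) {
                return String.join(",",
                        tokens[0],
                        tokens[1] + " " + tokens[2],
                        tokens[i], tokens[i + 1], tokens[i + 2]);
            }
        }
        throw new IllegalArgumentException("No amount found in: " + line);
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = Arrays.asList(
                "120030125  J Blog   23, SOME HOUSE, QUEENS       259.44  21-OCT-2013  17-NOV-2013",
                "120030318  R Mxx    37,WOODCLOSE,BIRMINGHAM,  121.96  16-OCT-2013  17-NOV-2013  Y 0");

        // try-with-resources closes (and flushes) the writer even if a row fails
        try (PrintWriter out = new PrintWriter("output.csv")) {
            for (String line : lines) {
                out.println(toCsvRow(line));
            }
        }
    }
}
```

If a field could ever contain a comma or a double quote, the row would need proper CSV quoting, which this sketch deliberately leaves out.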
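
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReadTxt {
    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader("D:/input.txt"));

        // A record line contains a double quote followed by four digits
        Pattern r = Pattern.compile("\"\\d\\d\\d\\d");
        ArrayList<String> list = new ArrayList<String>();

        PrintWriter out = new PrintWriter("D:/Output.csv");

        try {
            String line = br.readLine();
            while (line != null) {
                // Trim inside the loop: every line is cleaned, and an empty
                // file no longer causes a NullPointerException
                line = line.trim();
                Matcher m = r.matcher(line);
                if (m.find()) {
                    // Fixed-width layout: columns 0-18 hold the reference and name,
                    // the amount starts at column 51; the address in between is skipped
                    String temp = line.substring(0, 19) + " "
                            + line.substring(51, line.length() - 1);
                    // Collapse runs of spaces and drop the double quotes
                    temp = temp.replaceAll("[ ]+", " ").replace("\"", "");
                    String[] array = temp.split("[ ]");
                    // reference, name (two tokens), amount, first date, second date
                    temp = array[0] + "," + array[1] + " " + array[2] + ","
                            + array[3] + "," + array[4] + "," + array[5];
                    list.add(temp);
                }
                line = br.readLine();
            }
        } finally {
            br.close();
        }

        for (int i = 0; i < list.size(); i++) {
            out.println(list.get(i));
        }

        out.flush();
        out.close();
    }
}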

Output

120030125,J Blog,259.44,21-OCT-2013,17-NOV-2013
120030318,R Mxx,121.96,16-OCT-2013,17-NOV-2013
120012174,JE xx,259.44,21-OCT-2013,17-NOV-2013
100154992,DL x,270.44,21-OCT-2013,17-NOV-2013
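
The fixed column offsets above (19 and 51) only work while every record uses exactly the same layout. As a possible alternative, here is a rough sketch that pulls the same five fields out of the raw quoted line with regular-expression capture groups. The class name `RecordLineParser` and the method `toCsvRow` are made up for this example. It assumes that a record line starts with a double quote followed by the numeric reference, that the name is separated from the address by at least two spaces, and that the dates always look like `21-OCT-2013`.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RecordLineParser {

    // Assumed layout of a record line: opening quote, numeric reference, name,
    // a gap of two or more spaces, the address, then the amount and two dates.
    private static final Pattern RECORD = Pattern.compile(
            "^\"(\\d+)\\s+(\\S+(?: \\S+)*?)\\s{2,}.*?"
            + "(\\d+\\.\\d{2})\\s+(\\d{2}-[A-Z]{3}-\\d{4})\\s+(\\d{2}-[A-Z]{3}-\\d{4})");

    // Returns a CSV row for a record line, or empty for continuation and blank lines.
    static Optional<String> toCsvRow(String rawLine) {
        Matcher m = RECORD.matcher(rawLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(String.join(",",
                m.group(1), m.group(2), m.group(3), m.group(4), m.group(5)));
    }

    public static void main(String[] args) {
        String[] sample = {
            "\"120030125 J Blog  23, SOME HOUSE,                 259.44  21-OCT-2013  17-NOV-2013\"",
            "\"                  SQUARE, STREET, LEICESTER,\"",
            "                   LE1 2BB"
        };
        for (String line : sample) {
            // Only the first sample line is a record; it prints
            // 120030125,J Blog,259.44,21-OCT-2013,17-NOV-2013
            toCsvRow(line).ifPresent(System.out::println);
        }
    }
}
```

Because non-matching lines simply produce an empty `Optional`, the same method can be applied to every line read from the file without a separate pre-filtering pattern.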
