
How to make a multiline file into a single line file delimited with a control character

I am trying to turn a file like this:

textfile.txt

_=1406048396605
bh=1244
bw=1711
c=24
c19=DashboardScreen
c2=2014-07-22T10:00:00-0700
c4=64144090210294
c40=3#undefined#0#0#a=-2#512#-1#0
c41=14060470498427c3e4ed
c46=Green|firefox|Firefox|30|macx|Mac OS X
c5=NONFFA
c6=HGKhjgj
c7=OFF_SEASON|h:PARTIAL|
ch=YHtgsfT
g=https://google.hello.com
h5=77dbf90c-5794-4a40-b1ab-fe1c82440c68-1406048401346
k=true
p=Shockwave Flash;QuickTime Plug-in 7.7.3;Default Browser Helper;SharePoint Browser Plug-in;Java Applet Plug-in;Silverlight Plug-In
pageName=DashboardScreen - Loading...
pageType= 
pe=lnk_o
pev2=pageDetail
s=2432x1520
server=1.1 pqalmttws301.ie.google.net:81
t=22/06/2014 10:00:00 2 420
v12=3468337910
v4=0
v9=dat=279333:279364:375870:743798:744035:743802:744033:743805:783950:783797:783949:784088
vid=29E364C5051D2894-400001468000F0EE

into something like this:

_=1406048396605<CONTROL_CHARACTER_HERE>bh=1244<CONTROL_CHARACTER_HERE>bw=1711<CONTROL_CHARACTER_HERE>c=24<CONTROL_CHARACTER_HERE>c19=DashboardScreen<CONTROL_CHARACTER_HERE>c2=2014-07-22T10:00:00-0700.....etc

So I am basically collapsing a multiline file into a single-line file, with each field separated by a CONTROL_CHARACTER.

This is what I currently have:

private String putIntoExpectedFormat() { 

    File f1 = new File("InputFile.txt");
    File f2 = new File("OutputFile.txt"); 

    InputStream in = new FileInputStream(f1);
    OutputStream out = new FileOutputStream(f2); 

    StringBuilder sb = new StringBuilder();

    byte[] buf = new byte[1024];
    int len;

    while( (len=in.read(buf)) > 0) {



        out.write(buf,0,len);
    }

    in.close();
    out.close();

}

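I am not even sure I am doing this right. Does anyone know how to do this?

Since it is a text file, you should use a Reader class to read the character stream. For better performance, use BufferedReader:

"Reads text from a character-input stream, buffering characters so as to provide for the efficient reading of characters, arrays, and lines."

You can use the Java 7 try-with-resources statement.

Sample code: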

try (BufferedReader reader = new BufferedReader(new FileReader(
        new File("InputFile.txt")));
     BufferedWriter writer = new BufferedWriter(new FileWriter(
        new File("OutputFile.txt")))) {
    String line = null;
    while ((line = reader.readLine()) != null) {
        writer.write(line);
        // write your <CONTROL_CHARACTER_HERE> as well
    }
}
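
A fuller sketch of the same BufferedReader/BufferedWriter approach, writing the delimiter only between lines so the output does not end with a stray separator. The class name LineJoiner, the method joinLines, the UTF-8 charset, and the choice of U+0001 (SOH) as the control character are all assumptions for illustration; the question does not say which control character or encoding is needed.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LineJoiner {

    // Assumed delimiter: U+0001 (Start of Heading). Swap in whichever control character you need.
    public static final char DELIMITER = '\u0001';

    /** Copies input to output as a single line, with the delimiter between the original lines. */
    public static void joinLines(Path input, Path output, char delimiter) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) {
                if (!first) {
                    writer.write(delimiter); // only between lines, never after the last one
                }
                writer.write(line);
                first = false;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        joinLines(Paths.get("InputFile.txt"), Paths.get("OutputFile.txt"), DELIMITER);
    }
}
```

Because readLine() strips the line terminator, this works the same way for Windows, UNIX, and old Mac line endings.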

The simplest way is to use Scanner and PrintWriter:

    Scanner in = null;
    PrintWriter out = null;
    try {
        // init input, output
        in = new Scanner(new File("InputFile.txt"));
        out = new PrintWriter(new File("OutputFile.txt"));
        // read input file line by line
        while (in.hasNextLine()) {
            out.print(in.nextLine());
            if (in.hasNextLine()) {
                out.print("<CONTROL_CHARACTER>");
            }
        }
    } finally {
        // close input, output
        if (in != null) {
            in.close();
        }
        if (out != null) {
            out.close();
        }
    }
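
The "<CONTROL_CHARACTER>" string above is a placeholder; in real code it has to be an actual character. A minimal sketch of one way to express it in Java, assuming U+0001 (SOH) is the character wanted (the question does not say). The class ControlCharDemo and its join helper are hypothetical names used only for this illustration.

```java
public class ControlCharDemo {

    // Hypothetical choice: U+0001 (SOH). Other common picks are U+001F (unit separator) or a tab.
    public static final char SEPARATOR = '\u0001';

    /** Joins fields with the separator placed between them, not after the last one. */
    public static String join(String[] fields, char separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(fields[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String joined = join(new String[] {"bh=1244", "bw=1711", "c=24"}, SEPARATOR);
        // Control characters are invisible when printed, so swap in a visible marker for inspection.
        System.out.println(joined.replace(String.valueOf(SEPARATOR), "<SOH>"));
        // Confirms that the chosen character really is a control character.
        System.out.println(Character.isISOControl(SEPARATOR));
    }
}
```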

Here are three pieces of code that read the file, replace every newline with <CONTROL_CHARACTER>, and then write the result to a file.

Reading the file:

public static String readFile(String filePath) {
    String entireFile = "";

    File file = new File(filePath);

    if (file.exists()) {
        BufferedReader br;
        try {
            br = new BufferedReader(new FileReader(file));

            String line;
            while ((line = br.readLine()) != null) {
                entireFile += line + "\n";
            }

            br.close();

        } catch (FileNotFoundException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    } else {
        System.err.println("File " + filePath + " does not exist!");
    }

    return entireFile;
}

Changing the newlines to <Control-Character>:

String text = readFile("Path/To/file.txt");
text = text.replace("\n", <Control-Character-Here>);

Writing to the file:

writeToFile("Path/to/newfile.txt", text);

Here is the method writeToFile():

public static void writeToFile(String filePath, String toWrite) {
    File file  = new File(filePath);
    if (!file.exists()) {
        try {
            file.createNewFile();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            System.err.println(filePath + " does not exist. Failed to create new file");
        }
    }

    try {
        PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(filePath, true)));
        out.println(toWrite);
        out.close();
    } catch (IOException e) {
        System.err.println("Could not write to file: " + filePath);
    }
}
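
Two details of the steps above are worth noting: readFile() appends "\n" after every line including the last, so the replaced text ends with a separator, and writeToFile() opens the file in append mode and uses println(), so it adds to any existing content and ends with a newline. A more compact sketch that avoids both, assuming Java 8 or later and a file small enough to hold in memory. The class name JoinWithNio, the method joinFile, the UTF-8 charset, and the U+0001 delimiter are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class JoinWithNio {

    /** Reads all lines, joins them with the delimiter, and overwrites the output file. */
    public static void joinFile(Path input, Path output, String delimiter) throws IOException {
        // readAllLines strips the line terminators, whatever style the file uses.
        List<String> lines = Files.readAllLines(input, StandardCharsets.UTF_8);
        // String.join puts the delimiter between elements only, so there is no trailing separator.
        String joined = String.join(delimiter, lines);
        // With no open options given, Files.write creates the file or truncates an existing one.
        Files.write(output, joined.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Assumed delimiter: U+0001; the paths mirror the placeholders used above.
        joinFile(Paths.get("Path/To/file.txt"), Paths.get("Path/to/newfile.txt"), "\u0001");
    }
}
```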
Another approach, using the Guava IO library:

    public static void main(String[] args) {
        try {
            // Change the charset to match your input file.
            String content = Files.toString(new File("/home/chandrayya/InputFile.txt"), Charsets.UTF_8);
            // "\r\n" is the Windows line ending; use "\n" for UNIX/OS X or "\r" for old Mac files.
            // "<C>" stands for the control character.
            content = content.replaceAll("\r\n", "<C>");
            Files.write(content, new File("/home/chandrayya/OutputFile.txt.txt"), Charsets.UTF_8);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

  • Uses Guava 17.0.
  • Useful for small files. It has not been tested with large or very large files; judging by the question, I assume the expected input file is small.
  • We are not processing each line here, so there is no need to read line by line.
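
The replaceAll call in the Guava example matches only one line-ending style at a time. A sketch of a variant that handles "\r\n", "\n", and "\r" in a single pass using the \R linebreak matcher available in java.util.regex since Java 8, and that drops trailing line breaks first so the result does not end with a separator. It uses only the JDK, not Guava; the class name LineBreakReplacer, the method replaceLineBreaks, and the U+0001 delimiter are assumptions for illustration.

```java
import java.util.regex.Matcher;

public class LineBreakReplacer {

    /** Replaces every line break with the delimiter, ignoring line breaks at the very end. */
    public static String replaceLineBreaks(String content, String delimiter) {
        // Strip trailing line breaks so the output does not end with a delimiter.
        String trimmed = content.replaceAll("\\R+\\z", "");
        // quoteReplacement guards against delimiters containing '$' or '\'.
        return trimmed.replaceAll("\\R", Matcher.quoteReplacement(delimiter));
    }

    public static void main(String[] args) {
        // Mixed line endings: Windows, UNIX, and old Mac.
        String sample = "k=true\r\nv4=0\nvid=29E364C5051D2894\r";
        String joined = replaceLineBreaks(sample, "\u0001");
        // Show the invisible delimiter as a visible marker.
        // prints k=true<SOH>v4=0<SOH>vid=29E364C5051D2894
        System.out.println(joined.replace("\u0001", "<SOH>"));
    }
}
```

The returned string can then be written out with whichever file API the rest of the code already uses.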

