Convert tab separated string to comma separated string using Java

I have a tab-separated file as a string. The example below has 2 rows. In the first row, the values are split by both tab and newline characters. The second row does not contain newline characters.

The original data also has a header. I want to read the string data, both header and values, and convert it to a CSV string. When I read this data line by line using CSVParser, it does not return the proper values, because some columns are also split by \n (newline).

However, each "row" is terminated by the same string, i.e.

"test2222"

1st row

"abc"   "cde"   "fhg"   "ijk"   "17/01/23 10:09:50 am"  "test111"   "test2" "Individual"    "Enclosure of Work Areas"       "Highlight aluminium personnel lanyarded into the Haulotte boom lift with a spotter. All tools observed to be lanyarded including protection gear. 
Blue glue asset card observed to be attached to the machinery, 10 year inspection of plant not required due to it being only 3yrs old. Last annual inspection august 2022 and logbook was subsequently observed. 
Plant registration was all observed and the weight loads were all abided by."   "test2222"

2nd row

"abc"   "cde"   "fhg"   "ijk"   "17/01/23 10:09:50 am"  "test111"   "test2" "Individual"    "Enclosure of Work Areas"       "1" "0" "Level 79"  "16/01/23 11:12:50 pm"  "Logistics - Construction Personnel & Material Lifts"                   "Schindler lift cages were observed to be free of any loose debris or material that may pose a risk of falling into the lift shaft below. L80 and L79 were observed to be compliant on both sides of the shaft."    "test2222"

Data as String

String test = "\"abc\"\t\"cde\"\t\"fhg\"\t\"ijk\"\t\"17/01/23 10:09:50 am\"\t\"test111\"\t\"test2\"\t\"Individual\"\t\"Enclosure of Work Areas\"\t\t\"Highlight aluminium personnel lanyarded into the Haulotte boom lift with a spotter. All tools observed to be lanyarded including protection gear. \n" +
            "Blue glue asset card observed to be attached to the machinery, 10 year inspection of plant not required due to it being only 3yrs old. Last annual inspection august 2022 and logbook was subsequently observed. \n" +
            "Plant registration was all observed and the weight loads were all abided by.\"\t\"test2222\"\n" +
            "\"abc\"\t\"cde\"\t\"fhg\"\t\"ijk\"\t\"17/01/23 10:09:50 am\"\t\"test111\"\t\"test2\"\t\"Individual\"\t\"Enclosure of Work Areas\"\t\t\"1\"\t\"0\"\t\"Level 79\"\t\"16/01/23 11:12:50 pm\"\t\"Logistics - Construction Personnel & Material Lifts\"\t\t\t\t\t\"Schindler lift cages were observed to be free of any loose debris or material that may pose a risk of falling into the lift shaft below. L80 and L79 were observed to be compliant on both sides of the shaft.\"\t\"test2222\"";

Can anyone please help me resolve this? Thanks in advance!

You can use a Scanner to read the file. Set the delimiter to the string that terminates a "row" in your file. The code below demonstrates.

import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Scanner;

public class Main {

    public static void main(String[] args) {
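        Path source = Paths.get("sampldat.txt");
        // try-with-resources closes the Scanner (and the underlying file)
        try (Scanner reader = new Scanner(source)) {
            // Each "row" ends with this string; useDelimiter treats it as a regex
            reader.useDelimiter("\"test2222\"");
            int counter = 0;
            while (reader.hasNext()) {
                // Everything up to (not including) the next delimiter
                String line = reader.next();
                // Splits on ANY whitespace: tabs, newlines and spaces inside fields
                String[] fields = line.split("\\s+");
                System.out.println("Row: " + ++counter);
                System.out.println(String.join(",", fields));
            }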
        }
        catch (IOException xIo) {
            xIo.printStackTrace();
        }
    }
}

The contents of file sampldat.txt are the two sample "row"s from your question, i.e.

"abc"   "cde"   "fhg"   "ijk"   "17/01/23 10:09:50 am"  "test111"   "test2" "Individual"    "Enclosure of Work Areas"       "Highlight aluminium personnel lanyarded into the Haulotte boom lift with a spotter. All tools observed to be lanyarded including protection gear. 
Blue glue asset card observed to be attached to the machinery, 10 year inspection of plant not required due to it being only 3yrs old. Last annual inspection august 2022 and logbook was subsequently observed. 
Plant registration was all observed and the weight loads were all abided by."   "test2222"
"abc"   "cde"   "fhg"   "ijk"   "17/01/23 10:09:50 am"  "test111"   "test2" "Individual"    "Enclosure of Work Areas"       "1" "0" "Level 79"  "16/01/23 11:12:50 pm"  "Logistics - Construction Personnel & Material Lifts"                   "Schindler lift cages were observed to be free of any loose debris or material that may pose a risk of falling into the lift shaft below. L80 and L79 were observed to be compliant on both sides of the shaft."    "test2222"

I use try-with-resources to make sure that the file is closed.

Method next reads up to the next occurrence of the delimiter, i.e. the string "test2222".
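A minimal sketch of this delimiter behaviour, assuming an in-memory string and a made-up terminator "END" instead of the file above. The class and method names (DelimiterDemo, readRows) are hypothetical. Because useDelimiter interprets its argument as a regular expression, the sketch quotes the terminator with Pattern.quote so that it is matched literally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

final class DelimiterDemo {

    // Returns the chunks of text found between occurrences of `terminator`.
    static List<String> readRows(String data, String terminator) {
        List<String> rows = new ArrayList<>();
        try (Scanner scanner = new Scanner(data)) {
            // Quote the terminator so regex metacharacters are taken literally.
            scanner.useDelimiter(Pattern.quote(terminator));
            while (scanner.hasNext()) {
                // next() returns everything up to, but not including, the delimiter.
                rows.add(scanner.next());
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Two rows; the first contains a line break inside a quoted field.
        String data = "\"a\"\t\"b\nc\"\t\"END\"\n\"d\"\t\"e\"\t\"END\"";
        for (String row : readRows(data, "\"END\"")) {
            System.out.println("[" + row + "]");
        }
    }
}
```

In this sketch the second chunk begins with the line break that followed the first terminator, so a row read this way can start with an empty field once it is split on whitespace.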

I then split the value returned by method next on [any] whitespace, i.e. spaces, tabs, newlines, etc.
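To illustrate what splitting on any whitespace means for a field that itself contains spaces, the sketch below compares it with splitting on the tab character only. The sample row and the names SplitDemo, splitOnWhitespace and splitOnTabs are made up for this comparison.

```java
import java.util.Arrays;

final class SplitDemo {

    // Splits on runs of any whitespace, including spaces inside a field.
    static String[] splitOnWhitespace(String row) {
        return row.split("\\s+");
    }

    // Splits on single tab characters; the limit -1 keeps empty columns.
    static String[] splitOnTabs(String row) {
        return row.split("\t", -1);
    }

    public static void main(String[] args) {
        // Four tab-separated columns; the third one is empty.
        String row = "\"abc\"\t\"Enclosure of Work Areas\"\t\t\"1\"";
        System.out.println(Arrays.toString(splitOnWhitespace(row)));
        System.out.println(Arrays.toString(splitOnTabs(row)));
    }
}
```

With this sample row, the whitespace split yields 6 pieces, because the quoted text is cut at each space and the empty column vanishes. The tab-only split yields the 4 original columns.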

I then call [static] method join (of class java.lang.String) to create a comma-separated list as you requested.

I added to the printout a header indicating which "row" is printed.

This is the output I get:

Row: 1
"abc","cde","fhg","ijk","17/01/23,10:09:50,am","test111","test2","Individual","Enclosure,of,Work,Areas","Highlight,aluminium,personnel,lanyarded,into,the,Haulotte,boom,lift,with,a,spotter.,All,tools,observed,to,be,lanyarded,including,protection,gear.,Blue,glue,asset,card,observed,to,be,attached,to,the,machinery,,10,year,inspection,of,plant,not,required,due,to,it,being,only,3yrs,old.,Last,annual,inspection,august,2022,and,logbook,was,subsequently,observed.,Plant,registration,was,all,observed,and,the,weight,loads,were,all,abided,by."
Row: 2
,"abc","cde","fhg","ijk","17/01/23,10:09:50,am","test111","test2","Individual","Enclosure,of,Work,Areas","1","0","Level,79","16/01/23,11:12:50,pm","Logistics,-,Construction,Personnel,&,Material,Lifts","Schindler,lift,cages,were,observed,to,be,free,of,any,loose,debris,or,material,that,may,pose,a,risk,of,falling,into,the,lift,shaft,below.,L80,and,L79,were,observed,to,be,compliant,on,both,sides,of,the,shaft."
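In the output above, fields that contain spaces (for example "Enclosure of Work Areas") are broken apart, empty columns disappear, and the terminating "test2222" column is dropped, because the split is on any whitespace. If the fields need to stay intact, one possible variation is to split each row on the tab character only and to collapse line breaks inside a field into a single space. The sketch below is only that, a sketch. It assumes that every row ends with the terminator, that the terminator never occurs inside a field, and that fields contain no embedded double quotes. The names TsvToCsv and toCsvLines and the short sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

final class TsvToCsv {

    // Returns one CSV line per row; rows are assumed to end with `terminator`.
    static List<String> toCsvLines(String data, String terminator) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(data)) {
            // useDelimiter expects a regex, so quote the literal terminator.
            scanner.useDelimiter(Pattern.quote(terminator));
            while (scanner.hasNext()) {
                // Drop the line break left over from the end of the previous row.
                String row = scanner.next().replaceFirst("^[\\r\\n]+", "");
                if (row.isEmpty()) {
                    continue; // e.g. a trailing newline after the last row
                }
                // Put the terminator column back, then split on tabs only.
                // The limit -1 keeps empty columns.
                String[] fields = (row + terminator).split("\t", -1);
                for (int i = 0; i < fields.length; i++) {
                    // Collapse line breaks inside a field into one space.
                    fields[i] = fields[i].replaceAll("\\s*[\\r\\n]+\\s*", " ");
                }
                lines.add(String.join(",", fields));
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Two rows: an empty third column and a line break inside the fourth.
        String data = "\"a\"\t\"b c\"\t\t\"x \ny\"\t\"END\"\n"
                + "\"d\"\t\"e\"\t\"END\"";
        toCsvLines(data, "\"END\"").forEach(System.out::println);
    }
}
```

For the sample data in main, this prints:

"a","b c",,"x y","END"
"d","e","END"

If the line breaks inside a field must be preserved instead, the replaceAll step can be left out, since a quoted CSV field may legitimately span several lines.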
