
Convert tab separated string to comma separated string using Java

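I have a tab-separated file as a String. In the example below I have 2 rows. In the first row, the values are separated by tabs and by newlines. The second row does not contain any newlines.

My raw data also has a header. I want to read the header and the values from the string data and convert them to a CSV string. When I read this data with CSVParser, it does not give the correct values, because some columns also contain \n (newline) characters.

However, every "row" ends with the same string, namely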

"test2222"

First row

"abc"   "cde"   "fhg"   "ijk"   "17/01/23 10:09:50 am"  "test111"   "test2" "Individual"    "Enclosure of Work Areas"       "Highlight aluminium personnel lanyarded into the Haulotte boom lift with a spotter. All tools observed to be lanyarded including protection gear. 
Blue glue asset card observed to be attached to the machinery, 10 year inspection of plant not required due to it being only 3yrs old. Last annual inspection august 2022 and logbook was subsequently observed. 
Plant registration was all observed and the weight loads were all abided by."   "test2222"

Second row

"abc"   "cde"   "fhg"   "ijk"   "17/01/23 10:09:50 am"  "test111"   "test2" "Individual"    "Enclosure of Work Areas"       "1" "0" "Level 79"  "16/01/23 11:12:50 pm"  "Logistics - Construction Personnel & Material Lifts"                   "Schindler lift cages were observed to be free of any loose debris or material that may pose a risk of falling into the lift shaft below. L80 and L79 were observed to be compliant on both sides of the shaft."    "test2222"

The data in String form

String test = "\"abc\"\t\"cde\"\t\"fhg\"\t\"ijk\"\t\"17/01/23 10:09:50 am\"\t\"test111\"\t\"test2\"\t\"Individual\"\t\"Enclosure of Work Areas\"\t\t\"Highlight aluminium personnel lanyarded into the Haulotte boom lift with a spotter. All tools observed to be lanyarded including protection gear. \n" +
            "Blue glue asset card observed to be attached to the machinery, 10 year inspection of plant not required due to it being only 3yrs old. Last annual inspection august 2022 and logbook was subsequently observed. \n" +
            "Plant registration was all observed and the weight loads were all abided by.\"\t\"test2222\"\n" +
            "\"abc\"\t\"cde\"\t\"fhg\"\t\"ijk\"\t\"17/01/23 10:09:50 am\"\t\"test111\"\t\"test2\"\t\"Individual\"\t\"Enclosure of Work Areas\"\t\t\"1\"\t\"0\"\t\"Level 79\"\t\"16/01/23 11:12:50 pm\"\t\"Logistics - Construction Personnel & Material Lifts\"\t\t\t\t\t\"Schindler lift cages were observed to be free of any loose debris or material that may pose a risk of falling into the lift shaft below. L80 and L79 were observed to be compliant on both sides of the shaft.\"\t\"test2222\"";

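Can anyone help me solve this? Thanks in advance!

You can use a Scanner to read the file. Set the delimiter to the string that terminates each "row" in the file. The code below demonstrates this.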

import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Scanner;

public class Main {

    public static void main(String[] args) {
        Path source = Paths.get("sampldat.txt");
        // try-with-resources closes the Scanner (and the file) automatically.
        try (Scanner reader = new Scanner(source)) {
            // Every "row" ends with the quoted string "test2222".
            reader.useDelimiter("\"test2222\"");
            int counter = 0;
            while (reader.hasNext()) {
                // Everything up to the next delimiter is one "row".
                String line = reader.next();
                // Split on any run of whitespace: spaces, tabs, newlines.
                String[] fields = line.split("\\s+");
                System.out.println("Row: " + ++counter);
                System.out.println(String.join(",", fields));
            }
        }
        catch (IOException xIo) {
            xIo.printStackTrace();
        }
    }
}

The file sampldat.txt contains the two sample "rows" from your question, i.e.

"abc"   "cde"   "fhg"   "ijk"   "17/01/23 10:09:50 am"  "test111"   "test2" "Individual"    "Enclosure of Work Areas"       "Highlight aluminium personnel lanyarded into the Haulotte boom lift with a spotter. All tools observed to be lanyarded including protection gear. 
Blue glue asset card observed to be attached to the machinery, 10 year inspection of plant not required due to it being only 3yrs old. Last annual inspection august 2022 and logbook was subsequently observed. 
Plant registration was all observed and the weight loads were all abided by."   "test2222"
"abc"   "cde"   "fhg"   "ijk"   "17/01/23 10:09:50 am"  "test111"   "test2" "Individual"    "Enclosure of Work Areas"       "1" "0" "Level 79"  "16/01/23 11:12:50 pm"  "Logistics - Construction Personnel & Material Lifts"                   "Schindler lift cages were observed to be free of any loose debris or material that may pose a risk of falling into the lift shaft below. L80 and L79 were observed to be compliant on both sides of the shaft."    "test2222"

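I use try-with-resources to make sure the file is closed.

The next method reads up to the next occurrence of the delimiter, i.e. the string "test2222" (including its double quotes).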
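Note that useDelimiter(String) interprets its argument as a regular expression, so a terminator containing regex metacharacters would have to be quoted. The following is a minimal sketch of the same idea applied to data that is already in memory as a String, as in the question. The class name SentinelSplit and the method records are hypothetical, and trimming each record is a simplifying assumption (it would also drop empty columns at the very start or end of a record).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

final class SentinelSplit {

    // Splits the input into records, each terminated by the given sentinel.
    // Pattern.quote makes the sentinel match literally, because
    // Scanner.useDelimiter(String) treats its argument as a regex.
    static List<String> records(String data, String sentinel) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(data)) {
            scanner.useDelimiter(Pattern.quote(sentinel));
            while (scanner.hasNext()) {
                // Assumption: whitespace around a record is not significant.
                String record = scanner.next().trim();
                if (!record.isEmpty()) {
                    result.add(record);
                }
            }
        }
        return result;
    }
}
```

For the data in the question, the call would look like SentinelSplit.records(test, "\"test2222\"").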

I then split the value returned by next on [any] whitespace, i.e. spaces, tabs, newlines and so on.
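Splitting on any whitespace also breaks up values that themselves contain spaces, which is why "17/01/23 10:09:50 am" shows up as three fields in the output further down. One possible alternative is to split on tabs only. The sketch below is not part of the original answer. It assumes that columns are separated by exactly one tab, that values never contain a tab, and that line breaks inside a value may be replaced by a single space. The names TabRecordToCsv and toCsvLine are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class TabRecordToCsv {

    // Converts one tab-separated record into one CSV line.
    // Assumptions: single-tab column separator, optional surrounding
    // double quotes, embedded line breaks replaced by one space.
    static String toCsvLine(String record) {
        // A limit of -1 keeps trailing empty columns.
        String[] columns = record.split("\t", -1);
        List<String> out = new ArrayList<>();
        for (String column : columns) {
            String value = column.trim();
            // Remove the surrounding quotes, if present.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            // Collapse each embedded line break (plus adjacent whitespace) to one space.
            value = value.replaceAll("\\s*[\\r\\n]+\\s*", " ");
            // Quote every value and double any embedded quotes.
            out.add("\"" + value.replace("\"", "\"\"") + "\"");
        }
        return String.join(",", out);
    }
}
```

With this approach, empty columns are kept as empty quoted values instead of disappearing from the row.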

I then call the [static] method join (of class java.lang.String) to create a comma-separated list as you requested.

I added a header to the printout indicating which "row" is being printed.

This is the output I get:

Row: 1
"abc","cde","fhg","ijk","17/01/23,10:09:50,am","test111","test2","Individual","Enclosure,of,Work,Areas","Highlight,aluminium,personnel,lanyarded,into,the,Haulotte,boom,lift,with,a,spotter.,All,tools,observed,to,be,lanyarded,including,protection,gear.,Blue,glue,asset,card,observed,to,be,attached,to,the,machinery,,10,year,inspection,of,plant,not,required,due,to,it,being,only,3yrs,old.,Last,annual,inspection,august,2022,and,logbook,was,subsequently,observed.,Plant,registration,was,all,observed,and,the,weight,loads,were,all,abided,by."
Row: 2
,"abc","cde","fhg","ijk","17/01/23,10:09:50,am","test111","test2","Individual","Enclosure,of,Work,Areas","1","0","Level,79","16/01/23,11:12:50,pm","Logistics,-,Construction,Personnel,&,Material,Lifts","Schindler,lift,cages,were,observed,to,be,free,of,any,loose,debris,or,material,that,may,pose,a,risk,of,falling,into,the,lift,shaft,below.,L80,and,L79,were,observed,to,be,compliant,on,both,sides,of,the,shaft."
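As the output above shows, the column boundaries inside multi-word values are lost, and row 2 starts with an empty field because of the line break that follows the terminator. If the terminating column cannot be relied on, another option is to track whether the current character is inside a quoted value. The following is a minimal sketch, not part of the original answer, assuming that double quotes are never escaped or doubled inside a value. QuotedTsvParser and parse are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class QuotedTsvParser {

    // Parses tab-separated text into records of unquoted values.
    // Inside double quotes, tabs and line breaks belong to the value;
    // outside quotes, a tab ends a column and a line break ends a record.
    static List<List<String>> parse(String text) {
        List<List<String>> records = new ArrayList<>();
        List<String> current = new ArrayList<>();
        StringBuilder value = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;          // opening or closing quote, not copied
            } else if (c == '\t' && !inQuotes) {
                current.add(value.toString()); // end of column
                value.setLength(0);
            } else if ((c == '\n' || c == '\r') && !inQuotes) {
                // End of record; blank lines (and the second half of "\r\n") are skipped.
                if (!current.isEmpty() || value.length() > 0) {
                    current.add(value.toString());
                    records.add(current);
                    current = new ArrayList<>();
                    value.setLength(0);
                }
            } else {
                value.append(c);
            }
        }
        // Last record when the text does not end with a line break.
        if (!current.isEmpty() || value.length() > 0) {
            current.add(value.toString());
            records.add(current);
        }
        return records;
    }
}
```

Each returned record can then be written as a CSV line by quoting its values and joining them with String.join(",", ...).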

