How to handle incorrect data from a CSV in Informatica

Question

I have a source file (CSV) that I need to load into a target (Oracle), but I get this error:

FR_3065 ROW[4],Filed [Student_rollnumber]:Invalid Number:[.].The row will be skipped

CSV table

Student_rollnumber,Studnet_Name,Marks,Subjects
10,'Revanth',70,"Maths",
11,'Satish',85,Science
12,'Anil',75,"Java
",
13,'Surya',90,"C++",
14,'Ramana',85,"python",
15,'Sudheer'70,"Informatica
",
16,'Prakash',85,"SQL"

I found that on line 4 the closing quote and comma (",) are on the next line. How can I join them ("Java",) so that the value ends up in a single column (Subjects)?

Answer 1

MatchQuotesPastEndOfLine, mentioned by Koushik, should work.

Alternatively, you can use sed with the pattern below to replace every newline followed by " with just a ", which removes the line break at the end of a quoted string.

sed ':a;N;$!ba;s/\n"/"/g'
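If sed is not available where the file is pre-processed, the same substitution can be sketched in Python. This is only an illustration of the idea, not part of the original answer; the function name `join_trailing_quote` and the file names in the usage note are hypothetical.

```python
def join_trailing_quote(text: str) -> str:
    # Same substitution as the sed command: every newline that is
    # immediately followed by a double quote becomes just the double quote.
    return text.replace('\n"', '"')


if __name__ == "__main__":
    # Two rows modelled on the question's CSV; the first one is broken.
    sample = '12,\'Anil\',75,"Java\n",\n13,\'Surya\',90,"C++",\n'
    print(join_trailing_quote(sample), end="")
```

To clean a whole file, read it as one string (for example `open("students.csv").read()`), pass it through the function, and write the result to a new file before the session reads it. Like the sed version, this would also merge two rows if a row ever started with a double-quoted first field; that does not happen in the sample, where the first column is numeric.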

Feel free to test this gist.

However, this only removes a line break that sits right before the closing quote; it will not help if the line break is anywhere in the middle of the value. As said, the MatchQuotesPastEndOfLine option mentioned by Koushik is the best possible solution.
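For the case where the line break is in the middle of a quoted value, a pre-processing step can track whether it is currently inside double quotes and drop line breaks only there. The sketch below is a hedged illustration rather than anything from the original answer: `strip_newlines_inside_quotes` is a hypothetical helper, and it assumes that double quotes in the file are balanced and that single quotes (as in 'Anil') are ordinary characters.

```python
def strip_newlines_inside_quotes(text: str) -> str:
    out = []
    in_quotes = False  # True while we are between an opening and closing "
    for ch in text:
        if ch == '"':
            # Every double quote toggles the state; a doubled quote ("")
            # toggles twice and therefore leaves the state unchanged.
            in_quotes = not in_quotes
            out.append(ch)
        elif ch in "\r\n" and in_quotes:
            # Line break inside a quoted field: drop it.
            continue
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    sample = '12,\'Anil\',75,"Ja\nva\n",\n13,\'Surya\',90,"C++",\n'
    print(strip_newlines_inside_quotes(sample), end="")
```

Line breaks between rows are kept because they occur outside quotes. If the file has an unbalanced quote, everything after it would be treated as quoted, so this is only safe on files where quoting is otherwise consistent.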

The above is based on this question.
