How do I parse a .csv file with quotes and commas within the quotes in Node.js?

I have a solution that parses a CSV file, but it does not account for data that has a comma within a quoted field (for example "not, accounted","normal").

const fs = require('fs');
const Path = require('path');

let filePath = Path.resolve(__dirname, `./MyData.csv`);

let data = fs.readFileSync(filePath, 'utf-8');
data = data.replace(/"/g, '');     // strips every quote, so quoted commas can no longer be told apart
data = data.split(/\r?\n/);

for (let i in data) {
    data[i] = data[i].split(",");  // also splits on commas that were inside quotes
}

data.forEach(async customerEmailToAdd => {
    if (customerEmailToAdd[0] != 'id') {  // skip the header row
        const sql = `
            UPDATE customers
            SET contactEmail = '${customerEmailToAdd[4]}',
            contactName = '${customerEmailToAdd[3]}'
            WHERE Id = '${customerEmailToAdd[0]}';
        `;
        await queryInterface.sequelize.query(sql);
    }
});

Your issue is that you are trying to use split and replace to parse a .csv file, and these two functions are not well suited to the job: they break on many special cases, such as a stray comma inside a value. You should consider reading the file character by character, using a state machine to keep track of what you are reading, because you can also run into something like this: "not, \"accounted\""
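A character-by-character reader of that kind could look roughly like the sketch below. It is only an illustration, not a complete CSV implementation: the function name parseCsv is made up for this example, and it assumes comma-separated fields, \n or \r\n line endings, and that a quote inside a quoted field is escaped either as "" or as \".

```javascript
// Minimal state-machine CSV parser (illustrative sketch, hypothetical helper name).
// State: are we currently inside a quoted field or not?
function parseCsv(text) {
    const rows = [];
    let row = [];
    let field = '';
    let inQuotes = false;

    for (let i = 0; i < text.length; i++) {
        const ch = text[i];
        const next = text[i + 1];

        if (inQuotes) {
            if ((ch === '"' || ch === '\\') && next === '"') {
                field += '"';      // escaped quote ("" or \"): keep one literal quote
                i++;               // and skip the second character of the escape
            } else if (ch === '"') {
                inQuotes = false;  // closing quote
            } else {
                field += ch;       // commas and newlines are ordinary data here
            }
        } else if (ch === '"') {
            inQuotes = true;       // opening quote
        } else if (ch === ',') {
            row.push(field);       // field separator
            field = '';
        } else if (ch === '\n' || ch === '\r') {
            if (ch === '\r' && next === '\n') i++;  // treat \r\n as one line break
            row.push(field);
            rows.push(row);
            row = [];
            field = '';
        } else {
            field += ch;
        }
    }

    // Flush the last row when the input does not end with a line break.
    if (field !== '' || row.length > 0) {
        row.push(field);
        rows.push(row);
    }
    return rows;
}
```

With the file contents read as in the question (fs.readFileSync(filePath, 'utf-8')), parseCsv(data) would return an array of rows, each an array of field strings, so the existing forEach loop could stay as it is. Accepting \" as an escape is a choice made here to match the example above; it means a quoted field that really ends in a backslash would be misread.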

But if you want to keep your current method, you can replace the commas that sit between two quotes with a temporary placeholder, something like ###COMMA###. Just make sure that this placeholder can never appear in the real data.

You can use the following code for this: data = data.replace(/"(.*?)"/g, (str) => str.replaceAll(',', '###COMMA###'));

Then use split and replace to parse the CSV file as before, and finally turn the placeholder back into real commas. Because data is an array of arrays after the split, do this for each field: field = field.replaceAll('###COMMA###', ',');
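Put together, the placeholder approach might look like the sketch below. The function name parseWithPlaceholder and the PLACEHOLDER constant are made up for this example, and String.prototype.replaceAll needs Node.js 15 or later. Dropping empty lines is an extra step added here so that a trailing newline does not produce an empty row.

```javascript
// Placeholder-based parsing (illustrative sketch, hypothetical helper names).
// Assumes the placeholder text never occurs in the real data.
const PLACEHOLDER = '###COMMA###';

function parseWithPlaceholder(text) {
    // 1. Hide every comma that sits between two quotes.
    const masked = text.replace(/"(.*?)"/g, (quoted) => quoted.replaceAll(',', PLACEHOLDER));

    // 2. Parse as in the question: strip quotes, split into lines, split into fields.
    // 3. Restore the real commas inside each field.
    return masked
        .replace(/"/g, '')
        .split(/\r?\n/)
        .filter((line) => line !== '')  // extra step: ignore empty lines
        .map((line) => line.split(',').map((field) => field.replaceAll(PLACEHOLDER, ',')));
}
```

This still inherits the limits of the regex: quoted fields that contain escaped quotes or line breaks are not handled, which is where a character-by-character parser is the safer choice.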
