Import CSV into a PostgreSQL table, ignoring duplicates - Amazon AWS/RDS

Question

I have a PostgreSQL database hosted on AWS (RDS). I've created a couple of tables and imported some .csv files into them using the "Import/Export" tool in pgAdmin 4.

Every month I'll need to update the data in my tables, and I'll do that by uploading .csv files.

The issue I'm facing right now is this: I am trying to insert new data into a table from a .csv file, but I need to ignore the duplicate values.

I have found a way to do that here (code below), but the COPY command does not work in pgAdmin. Copying only works if I use the Import/Export tool.

CREATE TEMP TABLE tmp_table
ON COMMIT DROP
AS
SELECT *
FROM "indice-id-cnpj"
WITH NO DATA;

COPY tmp_table FROM 'C:/Users/Win10/Desktop/Dados/ID-CNPJ.csv';

INSERT INTO "indice-id-cnpj"
SELECT *
FROM tmp_table
ON CONFLICT DO NOTHING;

This is my first experience with PostgreSQL (apart from a course at university). I can deal with the issue by using Excel and doing a little manual work, but I'm looking for a "long-term" solution for updating the tables from the .csv files while always ignoring the duplicates.

Thanks in advance.

Answer 1

So, I've found a solution.

As Adrian mentioned, I had to use psql.

CREATE TEMP TABLE tmp_table AS SELECT * FROM "table-name" WITH NO DATA;
\copy tmp_table FROM 'C:/Users/Win10/folder/filename.csv' DELIMITER ',' CSV ENCODING 'UTF8' ;

INSERT INTO "table-name" SELECT * FROM tmp_table ON CONFLICT DO NOTHING;
DROP TABLE tmp_table;

Since I'm using psql, it's necessary to use the \copy command instead of COPY. Also, every SQL statement must end with a ";", and the tmp_table has to be dropped at the end.
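
One caveat on the steps above: ON CONFLICT DO NOTHING only skips rows that would violate a unique constraint, unique index, or exclusion constraint, so the target table needs one on whatever columns define a "duplicate". Here is a minimal sketch of the same workflow with that made explicit. It assumes a hypothetical key column named id and a hypothetical constraint name table_name_id_key, neither of which is in the original post; "table-name" and the file path are the placeholders used above.

```sql
-- One-time setup. Assumption: "id" is a hypothetical column that identifies a duplicate.
-- Without a primary key, unique constraint, or unique index on the target table,
-- ON CONFLICT DO NOTHING has nothing to conflict with and every row gets inserted.
-- This statement fails if the table already contains duplicate id values.
ALTER TABLE "table-name"
    ADD CONSTRAINT table_name_id_key UNIQUE (id);

-- Monthly load. Staging table: same columns as the target, no rows.
CREATE TEMP TABLE tmp_table AS SELECT * FROM "table-name" WITH NO DATA;

-- psql meta-command: reads the file from the client machine and must fit on one line.
\copy tmp_table FROM 'C:/Users/Win10/folder/filename.csv' WITH (FORMAT csv, ENCODING 'UTF8')

-- Rows whose id already exists in the target are skipped instead of raising an error.
INSERT INTO "table-name"
SELECT * FROM tmp_table
ON CONFLICT (id) DO NOTHING;

DROP TABLE tmp_table;
```

Naming the conflict target (id) is optional, but it makes the intent visible: only a clash on that column is silently ignored, while a violation of any other constraint still raises an error.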

Solution 1
0 Accepted 2021-10-05 12:27:21
