
Inserting a huge batch of data from multiple CSV files into distinct tables with PostgreSQL

I have a folder with multiple CSV files; they all have the same column attributes.

My goal is to turn every CSV file into a distinct PostgreSQL table named after the file, but as there are 1k+ of them, doing it manually would be a pretty long process.

I've been searching for a solution the whole day, but the closest I've come to solving the problem is this code:

for filename in select pg_ls_dir2 ('/directory_name/') loop
    if (filename ~ '.csv$') THEN create table filename as fn
        copy '/fullpath/' || filename to table fn
    end if;
END loop;

The logic behind this code is to select every filename inside the folder, create a table named after the file, and import the file's content into that table.

The issue is that I have no idea how to put that into practice. For instance, where should I execute this code, since neither "for" nor "pg_ls_dir2" is an SQL instruction?

If you use DBeaver, there is a recently added feature in the software that addresses this exact issue. (On Windows) Right-click the "Tables" section inside your schema (not your target table), then select "Import data", and you can select all the .csv files you want at the same time, creating a new table for each file as you mentioned.

Normally, I don't like giving the answer directly, but I think you will need to change a few things at least.

Based on the example from here, I prepared a small example using a bash script. Let's assume you are in the directory where your files are kept.

postgres@213b483d0f5c:/home$ ls -ltr
total 8
-rwxrwxrwx 1 root root 146 Jul 25 13:58 file1.csv
-rwxrwxrwx 1 root root 146 Jul 25 14:16 file2.csv

In the same directory you can run:

# Loop over every CSV file in the current directory
for i in *.csv
do
  [ -e "$i" ] || continue   # skip the literal pattern when no file matches
  # The table name is the file name without its .csv extension
  table_name="${i%.csv}"
  psql -d test -c "CREATE TABLE $table_name(emp_id SERIAL, first_name VARCHAR(50), last_name VARCHAR(50), dob DATE, city VARCHAR(40), PRIMARY KEY(emp_id));"
  psql -d test -c "\copy $table_name(emp_id,first_name,last_name,dob,city) FROM './$i' DELIMITER ',' CSV HEADER"
done
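
The loop above starts two psql sessions per file, which can add up with 1k+ files. One possible variation, sketched here under some assumptions, prints all the statements into a single script first, so it can be reviewed and then run in one session. The helper name gen_load_sql, the output file name load.sql, and the rule for turning file names into table names are illustrative assumptions, not part of the original answer; the column list and the "test" database are simply copied from the example above.

```shell
#!/usr/bin/env bash
# Sketch: print one CREATE TABLE + \copy pair per CSV file found in a directory.
# gen_load_sql is a hypothetical helper name; the columns come from the example above.

gen_load_sql() {
  local dir="$1" f base table
  for f in "$dir"/*.csv; do
    [ -e "$f" ] || continue   # the glob matched nothing
    base="$(basename "$f" .csv)"
    # Assumption: lowercase the name and map anything outside [a-z0-9_] to "_"
    # so that it is usable as an unquoted identifier.
    table="$(printf '%s' "$base" | tr '[:upper:]' '[:lower:]' | tr -c 'a-z0-9_' '_')"
    # Assumption: prefix names that start with a digit, which unquoted identifiers may not do.
    case "$table" in
      [0-9]*) table="t_$table" ;;
    esac
    printf '%s\n' "CREATE TABLE $table (emp_id SERIAL, first_name VARCHAR(50), last_name VARCHAR(50), dob DATE, city VARCHAR(40), PRIMARY KEY(emp_id));"
    # \copy is a psql meta-command and must stay on a single line.
    printf '%s\n' "\\copy $table(emp_id,first_name,last_name,dob,city) FROM '$f' DELIMITER ',' CSV HEADER"
  done
}

# Possible usage (not executed here):
#   gen_load_sql /home > load.sql
#   psql -d test -v ON_ERROR_STOP=1 --single-transaction -f load.sql
```

With --single-transaction, either every table is created and loaded or none is, which makes a failed run easier to repeat. Note that this sketch does not handle file paths containing a single quote, and two files whose names sanitize to the same table name would collide.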
