
Efficient data import into a PostgreSQL DB

I just designed a PostgreSQL database and need to choose a way of populating it with data. The data consists of txt and csv files, but in general it can be any type of file containing characters with delimiters. I'm programming in Java to bring the data into a common structure (there are many different kinds of files, and I need to work out what each column of a file represents so I can associate it with a column of my DB). I thought of two ways:

  • Convert the files into one common format (JSON) and then have the DB regularly check the JSON file and import its content.

  • Connect directly to the database via JDBC and send the strings to the DB. (I still need to create a backup file containing what was inserted into the DB, so in both cases a file is created and written to.)

Which would you go with in terms of time efficiency? I'm somewhat tempted to use the first one, as it would be easier to handle a JSON file in the DB. Any other suggestions would also be welcome!

JSON or CSV

If you are free to convert your data to either CSV or JSON, CSV is the one to choose. You will then be able to use COPY FROM to bulk load large amounts of data into PostgreSQL at once.

COPY supports CSV but not JSON.
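As a rough illustration of the conversion step, the sketch below turns one delimited input line into a CSV line that COPY can read, and builds a matching COPY statement. The class name CsvForCopy, the table name measurements and its columns are hypothetical, and the sketch assumes the input fields never contain the delimiter themselves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class CsvForCopy {

    /** Quotes a field only when CSV requires it (comma, quote or line break inside). */
    static String quote(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            // Note: an unquoted empty field is read as NULL by COPY's csv format by default.
            return field;
        }
        // Embedded double quotes are escaped by doubling them.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Converts one line using an arbitrary delimiter (assumption: no escaping in the input). */
    static String toCsvLine(String rawLine, String delimiter) {
        // limit -1 keeps trailing empty fields so the column count stays stable
        String[] fields = rawLine.split(Pattern.quote(delimiter), -1);
        List<String> quoted = new ArrayList<>();
        for (String f : fields) {
            quoted.add(quote(f));
        }
        return String.join(",", quoted);
    }

    /** Builds a COPY statement that reads CSV from the client connection. */
    static String copySql(String table, List<String> columns) {
        // Assumption: table and column names come from trusted configuration, not user input.
        return "COPY " + table + " (" + String.join(", ", columns) + ") FROM STDIN WITH (FORMAT csv)";
    }
}
```

With the PostgreSQL JDBC driver on the classpath, a statement like the one from copySql can be passed together with a Reader over the converted file to CopyManager.copyIn(sql, reader), which streams the data from the Java side. A plain COPY ... FROM 'file' instead reads a file that sits on the database server.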

Directly inserting values

This is the approach to take if you only need to insert a few (or maybe even a few thousand) records. It is not suited to a large number of records because it will be slow.
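If the direct route is taken anyway, grouping the inserts into JDBC batches inside one transaction usually cuts down on round trips. The following is only a sketch under assumptions: the class BatchInsert, the table and column names, and the batch size are all hypothetical, and every value is sent as a string.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

public class BatchInsert {

    /** Builds "INSERT INTO t (a, b) VALUES (?, ?)" for the given columns. */
    static String insertSql(String table, List<String> columns) {
        // Assumption: table and column names come from trusted configuration.
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table + " (" + String.join(", ", columns)
                + ") VALUES (" + placeholders + ")";
    }

    /** Inserts all rows in batches of batchSize within a single transaction. */
    static void insertAll(Connection conn, String table, List<String> columns,
                          List<List<String>> rows, int batchSize) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(insertSql(table, columns))) {
            int pending = 0;
            for (List<String> row : rows) {
                for (int i = 0; i < row.size(); i++) {
                    // setString assumes text columns; typed columns need setInt, setObject, etc.
                    ps.setString(i + 1, row.get(i));
                }
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch(); // send the accumulated rows in one go
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch(); // flush the last partial batch
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback(); // leave the table unchanged if any batch fails
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

A call might look like insertAll(conn, "measurements", List.of("sensor_id", "value"), rows, 1000), where conn is an open JDBC connection and rows holds the parsed file lines. For very large files, COPY is still likely to be faster.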

If you choose this approach, you can create the backup using COPY TO. However, if you feel that you need to create the backup file with your Java code, choosing CSV as the format means you will be able to bulk load it as discussed above.
