
How to import a CSV file with empty values into Postgres?

I am trying to import a CSV file into Postgres that contains age values. Some of the values are empty, since not all ages are known. I would like to import the columns as real, since they contain ages with decimals such as 98.45. The empty values for people whose age is unknown are apparently treated as strings, but I would still like to import the age values as numbers. How can I import real values when some cells in the CSV are empty and Postgres therefore treats them as strings?

To create the table I used the following code, since I am dealing with decimal values:

Create table psychosocial.age (
  respnr integer Primary key,
  fage real,
  gage real,
  hage real);

After importing the CSV file, I get the following error:

ERROR:  invalid input syntax for integer: "11455, , , "

CONTEXT:  COPY age, line 2, column respnr: "11455, , , "

One problem is that you're trying to import whitespace into numeric fields, so you first have to pre-process your CSV file before importing it.

Below is an example of how you can solve it using awk. From your console, execute the following command:

$ cat file.csv | awk '{sub(/^ +/,""); gsub(/, /,",")}1' | psql db -c "COPY psychosocial.age FROM STDIN WITH CSV HEADER"
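To see what the awk filter does before wiring it to psql, here is a minimal sketch that runs it on a few made-up rows. The sample data is hypothetical and only mirrors the layout in the question. Once the spaces are gone, unknown ages are empty unquoted fields, which COPY in CSV format reads as NULL by default.

```shell
# Hypothetical sample rows (not from the original file): a header line,
# a row with no known ages, and a row with two known ages.
cleaned=$(printf '%s\n' \
  'respnr,fage,gage,hage' \
  '11455, , , ' \
  '11456, 98.45, , 71.2' |
  awk '{sub(/^ +/,""); gsub(/, /,",")}1')  # drop leading spaces and the space after each comma

# Show the cleaned rows; this is what would be piped to COPY ... FROM STDIN.
printf '%s\n' "$cleaned"
```

The second row comes out as 11455,,, and the third as 11456,98.45,,71.2, so every unknown age is an empty field rather than a single space.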

In case you're wondering how to pipe commands, take a look at these answers. Here is a more detailed example of how to use COPY with STDIN.

You also have to take into account that having quotation marks on integer fields can be problematic, e.g.:

"11455, , , "

This will result in an error, since Postgres will treat everything between the quotation marks as a single value and try to store it in the integer field respnr, which will fail. Instead, format your CSV file like this:

11455, , , 

or even

11455,,,

You can achieve this also using awk from your console: 您也可以从控制台使用awk实现此目的:

$ awk '{gsub(/\"/,"")};1' file.csv
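The quote removal and the whitespace cleanup can also be done in one awk pass. The sketch below assumes the row is wrapped in double quotes exactly as it appears in the error message. The input line is a made-up example. Removing every double quote is only safe when no field legitimately needs quoting, for instance text that contains a comma.

```shell
# Hypothetical input line: one row wrapped in double quotes.
cleaned_row=$(printf '%s\n' '"11455, , , "' |
  awk '{gsub(/"/,""); sub(/^ +/,""); gsub(/, /,",")}1')  # strip quotes, then the stray spaces

# Show the cleaned row.
printf '%s\n' "$cleaned_row"
```

This prints 11455,,, which COPY in CSV format can load as one integer followed by three NULL values.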
