datestyle ignore format postgresql

I am trying to ignore an invalid date in a CSV file that I am loading into PostgreSQL from the command line:

Error: date/time field value out of range:"199999999"

The problem is, I cannot change the data in the csv file, so I have to find a way of importing this bad date as is.

Use an intermediate table (loaded_data) to store the data you get from your CSV. Make sure all the columns in that table are of type text, so that PostgreSQL will accept virtually anything (unless you have rows with the incorrect number of columns).

Once you have all your data in that table, sanitize all the columns so that when their values are incorrect you either set them to NULL, discard them (DELETE them) or set those columns to a default value. What you actually do will depend on your particular application.

The simplest (although probably not the fastest) way to sanitize your data is to use a function that CASTs your text to the appropriate type, and handles exceptions if the input is not well formatted. For the case of a date type, you can use the following function:

-- Create a function to get good dates... and return NULL if they're not
CREATE FUNCTION good_date(date_as_text text) 
    RETURNS DATE        /* This is the type of the returned data */
    STABLE STRICT       /* STABLE: result depends on datestyle. STRICT: if you pass a NULL, you'll get a NULL */
    LANGUAGE PLPGSQL    /* Language used to define the function */
AS
$$
BEGIN
    RETURN CAST(date_as_text AS DATE) ;
EXCEPTION WHEN OTHERS THEN  /* If something is wrong... */
    RETURN NULL ;
END
$$ ;

Note that this function's behaviour will depend on your settings for datestyle. However, it will always work with texts like January 8, 1999, and will return NULL for dates such as 2017-02-30 or February 30, 2017.
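To make the datestyle dependence concrete, here is a small sketch that assumes the good_date function above has already been created. The sample strings are made up for illustration; they are not taken from the question's CSV.

```sql
-- With month-day-year ordering, 01/02/2017 is read as January 2nd
SET datestyle = 'ISO, MDY';
SELECT good_date('01/02/2017');   -- 2017-01-02
SELECT good_date('13/01/2017');   -- NULL (there is no month 13)

-- With day-month-year ordering, the same texts are read differently
SET datestyle = 'ISO, DMY';
SELECT good_date('01/02/2017');   -- 2017-02-01
SELECT good_date('13/01/2017');   -- 2017-01-13
```

If your CSV contains ambiguous formats like these, it is probably worth setting datestyle explicitly in the session that does the conversion.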

You'll do the equivalent for a good_integer function.
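A minimal sketch of what good_integer could look like, following the same pattern as good_date. The parameter name integer_as_text is just an illustrative choice.

```sql
-- Return the integer contained in the text... or NULL if it is not a valid integer
CREATE FUNCTION good_integer(integer_as_text text)
    RETURNS INTEGER     /* This is the type of the returned data */
    IMMUTABLE STRICT    /* If you pass a NULL, you'll get a NULL */
    LANGUAGE PLPGSQL
AS
$$
BEGIN
    RETURN CAST(integer_as_text AS INTEGER) ;
EXCEPTION WHEN OTHERS THEN  /* Not an integer, or out of range */
    RETURN NULL ;
END
$$ ;
```

With this definition, texts such as '42' give 42, while texts such as '12abc' or '1.5' give NULL.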


Let's assume you have this input data:

CREATE TABLE loaded_data
(
    some_id text,
    some_date text
) ;

-- Let's assume this is the equivalent of loading the CSV...
INSERT INTO loaded_data
    (some_id, some_date)
VALUES
    (1, '20170101'),
    (2, '19999999'),
    (3, 'January 1, 1999'),
    (4, 'February 29, 2001'),
    (5, '20170230');
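In a real load, the rows would come from the CSV file rather than from an INSERT. One way to do that from the command line is psql's \copy meta-command. This is only a sketch: the file name data.csv, the column order, and the presence of a header row are assumptions about your file.

```sql
-- Run inside psql. \copy reads the file on the client side and must be on one line.
-- 'data.csv' and HEADER true are assumptions; adjust them to your actual file.
\copy loaded_data (some_id, some_date) FROM 'data.csv' WITH (FORMAT csv, HEADER true)
```

Because every column of loaded_data is text, values like 19999999 are stored as they are and no date error is raised at this stage.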

... and that you want to store this information in the following table:

CREATE TABLE destination_table
( 
    id integer PRIMARY KEY,
    a_date date
) ;

... you'd use:

INSERT INTO destination_table
    (id, a_date)
SELECT
    good_integer(some_id) AS id, good_date(some_date) AS a_date
FROM
    loaded_data ;

And you'd get:

SELECT * FROM destination_table;
id | a_date    
-: | :---------
 1 | 2017-01-01
 2 | null      
 3 | 1999-01-01
 4 | null      
 5 | null

Check all the setup at dbfiddle here
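If storing NULL is not what your application needs, the other two options mentioned earlier (discarding rows, or using a default value) can be sketched with the same functions. These are alternatives to the INSERT shown above, each meant to run against an empty destination_table. The placeholder date 1900-01-01 is an arbitrary, hypothetical choice.

```sql
-- Option A: discard the rows whose id or date cannot be converted
INSERT INTO destination_table
    (id, a_date)
SELECT
    good_integer(some_id), good_date(some_date)
FROM
    loaded_data
WHERE
    good_integer(some_id) IS NOT NULL
    AND good_date(some_date) IS NOT NULL ;

-- Option B: keep every row that has a valid id, replacing bad dates with a default
INSERT INTO destination_table
    (id, a_date)
SELECT
    good_integer(some_id), COALESCE(good_date(some_date), DATE '1900-01-01')
FROM
    loaded_data
WHERE
    good_integer(some_id) IS NOT NULL ;
```

Both variants filter on good_integer(some_id) because id is the primary key of destination_table, so a row whose id cannot be converted would otherwise make the whole INSERT fail.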


Alternative: use some ETL tool that can perform equivalent functionality. The scenario I presented is, in a sense, a very simple LTE (load, transform, extract) equivalent.


 