How to split a string on the first whitespace occurrence in PostgreSQL

I have a column in a PostgreSQL DB and I would like to write a Flyway migration that splits the full name into first_name and last_name at the first occurrence of whitespace. There are some rogue rows that contain either only one word or more than two. E.g.:

| full_name        |
| ---------------- |
| Admin            |
| John Doe         |
| Vincent Van Gogh |

and I want to migrate that column into:

| first_name | last_name |
| -----------|---------- |
| Admin      |           |
| John       | Doe       |
| Vincent    | Van Gogh  |

Incorrect solution: I have tried several regular expressions to split the string on the first whitespace occurrence, but none of them worked. Can anybody help me find the proper regex? Or is there a better way that uses a PostgreSQL function other than regexp_split_to_array()?

UPDATE users
    SET first_name = (regexp_split_to_array(users.full_name, '\s+'))[1],
        last_name  = (regexp_split_to_array(users.full_name, '\s+'))[2];

With the '\s+' regex, the array for Vincent Van Gogh has 3 elements, and only the 2nd element is assigned to last_name:

| first_name | last_name |
| -----------|---------- |
| Admin      |           |
| John       | Doe       |
| Vincent    | Van       | <- Missing Gogh surname
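
The truncation can be reproduced on a literal value, without touching the table. A minimal sketch, assuming the default standard_conforming_strings = on so that '\s+' is passed to the regex engine unchanged:

```sql
-- Splitting on every whitespace run yields one array element per word,
-- so index [2] holds only the second word, not the rest of the name.
SELECT regexp_split_to_array('Vincent Van Gogh', '\s+') AS parts;
-- parts: {Vincent,Van,Gogh}

SELECT (regexp_split_to_array('Vincent Van Gogh', '\s+'))[2] AS second_part;
-- second_part: Van
```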

You may use substring:

UPDATE users
    SET first_name = substring(users.full_name, '^\S+'),
        last_name  = substring(users.full_name, '^\S+\s+(.*)');

The ^\S+ pattern matches one or more non-whitespace characters at the start of the string.

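The ^\S+\s+(.*) pattern matches one or more non-whitespace characters at the start of the string, then one or more whitespace characters, and then captures the rest of the string (zero or more characters) into Group 1. The parenthesized part, the capturing group, is what substring returns. When there is no whitespace after the first word, as in Admin, the pattern does not match and substring returns NULL.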

The PostgreSQL documentation describes this behavior as follows: if the pattern contains any parentheses, the portion of the text that matched the first parenthesized subexpression (the one whose left parenthesis comes first) is returned.
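
Before running the migration, the two patterns can be checked against the sample names from the question. The query below is only a sketch: the VALUES list stands in for the users table, and the COALESCE wrapper is an optional addition (not part of the answer above) for the case where an empty string is preferred over NULL in last_name.

```sql
-- Sketch only: the VALUES list stands in for the users table.
SELECT full_name,
       substring(full_name, '^\S+')                      AS first_name,
       -- COALESCE turns the NULL produced by a non-match into ''.
       COALESCE(substring(full_name, '^\S+\s+(.*)'), '') AS last_name
FROM (VALUES ('Admin'), ('John Doe'), ('Vincent Van Gogh')) AS t(full_name);
-- Admin            | Admin   |
-- John Doe         | John    | Doe
-- Vincent Van Gogh | Vincent | Van Gogh
```

Note that both patterns are anchored with ^, so a value that begins with whitespace would not match either of them and would yield NULL for first_name; trimming the value first is one way to handle such rows, if they exist.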
