MariaDB: convert string to int when importing from CSV, while removing spaces in number

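I have a "large" CSV file (about 1 GB of data, 3M rows) to import into a MariaDB table.

The problem is that almost every field of every row is treated as a string, so I have to convert "1 337" (a string) to 1337 (an integer).

Here is the script I use to import into the table: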

LOAD DATA LOW_PRIORITY LOCAL
    INFILE 'data.txt'
    INTO TABLE `test`.`test_import`
    CHARACTER SET utf8
    FIELDS TERMINATED BY ';'
    OPTIONALLY ENCLOSED BY '"'
    ESCAPED BY '"'
    LINES TERMINATED BY '\r\n'
    (`id`,
        `data`,
        @NumberOne,
        @NumberTwo,
        @NumberThree,
        @NumberFour)
        SET `Number One` = REPLACE(@NumberOne, ' ', ''),
            `Number Two` = REPLACE(@NumberTwo, ' ', ''),
            `Number Three` = REPLACE(@NumberThree, ' ', ''),
            `Number Four` = REPLACE(@NumberFour, ' ', '');

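With this script, numbers up to 999 import without any problem. But from 1000 onward (written as "1 000" in my CSV), all I get is a warning (Truncated incorrect INTEGER value: '1 000') and the value 1 in my database.

The "funny" thing is what happens when I try this: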

        SET `Number One` = REPLACE(@NumberOne, '1', 'k'),
            `Number Two` = REPLACE(@NumberTwo, '1', 'k'),
            `Number Three` = REPLACE(@NumberThree, '1', 'k'),
            `Number Four` = REPLACE(@NumberFour, '1', 'k')

-> REPLACE() works, and "1 000" becomes "k 000".

So, how can I use REPLACE() to remove the spaces in a number? Or, how can I make CAST()/CONVERT() work correctly on a string like "1 337"?
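
One possible approach, assuming the separator in the file is not a plain ASCII space (a no-break space, for example), is to strip every non-digit character instead of one specific character. The sketch below uses REGEXP_REPLACE(), which is available in MariaDB 10.0.5 and later. The column aliases are made up for illustration, and nothing here confirms that this is the cause of the problem above.

```sql
-- Sketch: keep only the digits, whatever the separator character turns out to be.
SELECT
    REGEXP_REPLACE('1 337', '[^0-9]', '')                   AS digits_only,
    CAST(REGEXP_REPLACE('1 337', '[^0-9]', '') AS UNSIGNED) AS as_integer;

-- The same expression could be used in the SET clause of LOAD DATA, e.g.:
--     SET `Number One` = NULLIF(REGEXP_REPLACE(@NumberOne, '[^0-9]', ''), '')
-- NULLIF turns a field that contains no digits at all into NULL
-- instead of passing an empty string to the integer column.
```

Note that this pattern also discards minus signs and decimal points, so it only suits unsigned integer columns like the ones used here.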


Some more information.

Here is a fresh test table:

CREATE OR REPLACE TABLE test_spaces_extr (
    `Identifier`   tinytext,
    `First name`   tinytext,
    `Last name`    tinytext,
    `Number One`   int unsigned,
    `Number Two`   int unsigned,
    `Number Three` int unsigned,
    `Number Four`  int unsigned,
    `Number Five`  int unsigned,
    `Number Six`   int unsigned,
    `Number Seven` int unsigned
);

Here is the script that imports the CSV:

LOAD DATA LOW_PRIORITY LOCAL
    INFILE 'some_data.txt'
    INTO TABLE `test`.`test_spaces_extr`
    CHARACTER SET utf8
    FIELDS TERMINATED BY ';'
    OPTIONALLY ENCLOSED BY '"'
    ESCAPED BY '"'
    LINES TERMINATED BY '\r\n'
    (`Identifier`,
        `First name`,
        `Last name`,
        @NumberOne,
        @NumberTwo,
        @NumberThree,
        @NumberFour,
        @NumberFive,
        @NumberSix,
        @NumberSeven)
        SET `Number One` = REPLACE(@NumberOne, ' ', ''),
            `Number Two` = REPLACE(@NumberTwo, ' ', ''),
            `Number Three` = REPLACE(@NumberThree, ' ', ''),
            `Number Four` = REPLACE(@NumberFour, ' ', ''),
            `Number Five` = REPLACE(@NumberFive, ' ', ''),
            `Number Six` = REPLACE(@NumberSix, ' ', ''),
            `Number Seven` = REPLACE(@NumberSeven, ' ', '');

Here is the entire content of some_data.txt:

"3efa639b3a";"Censored";"Censored";"7 896";"3 468";"3 854";"5 000";"1 234";"9 654";"1 337"

(One line, yes.)

Here is the result:

"Identifier"    "First name"    "Last name" "Number One"    "Number Two"    "Number Three"  "Number Four"   "Number Five"   "Number Six"    "Number Seven"
"3efa639b3a"    "Censored"  "Censored"  "7896"  "3468"  "3854"  "5000"  "1234"  "9654"  "0"

The "Number" fields do become integers here. All of them except the last one ("Number Seven" -> "0").

It keeps getting weirder...
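
One thing that might explain a lone 0 in the last column, offered as a guess and not a diagnosis: if the line ending in the file does not match LINES TERMINATED BY '\r\n', the final field may reach the SET clause with extra characters still attached, such as the enclosing quotes. A string that starts with a non-digit converts to 0. The literals below are invented to illustrate that conversion rule only.

```sql
-- Illustration with made-up literals: what the integer conversion does
-- when the enclosing quotes are still part of the value.
SELECT
    CAST(REPLACE('1 337', ' ', '') AS UNSIGNED)   AS clean_field,
    CAST(REPLACE('"1 337"', ' ', '') AS UNSIGNED) AS quotes_left_in;
-- clean_field is 1337; quotes_left_in is 0, with a truncation warning
```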

I can't reproduce the problem:

$ mysql -u user -p --column-type-info
MariaDB [(none)]> SELECT VERSION();
Field   1:  `VERSION()`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       VAR_STRING
Collation:  utf8_general_ci (33)
Length:     72
Max_length: 24
Decimals:   31
Flags:      NOT_NULL 


+-----------------+
| VERSION()       |
+-----------------+
| 10.0.31-MariaDB |
+-----------------+
1 row in set (0.00 sec)

MariaDB [(none)]> SELECT CAST(REPLACE('1 337', ' ', '') AS UNSIGNED);
Field   1:  `CAST(REPLACE('1 337', ' ', '') AS UNSIGNED)`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       LONGLONG
Collation:  binary (63)
Length:     5
Max_length: 4
Decimals:   0
Flags:      NOT_NULL UNSIGNED BINARY NUM 


+---------------------------------------------+
| CAST(REPLACE('1 337', ' ', '') AS UNSIGNED) |
+---------------------------------------------+
|                                        1337 |
+---------------------------------------------+
1 row in set (0.00 sec)
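
Since the conversion works on a literal, the difference presumably lies in the bytes that actually arrive from the file. Here is a sketch of how one might inspect them. It assumes the statements run in the same session right after the LOAD DATA statement from the question, where the user variables should still hold the values of the last row read.

```sql
-- Run immediately after LOAD DATA, before any other statement clears the warnings.
SHOW WARNINGS;

-- Look at the raw bytes of one of the numeric fields.
SELECT
    @NumberOne              AS raw_value,
    HEX(@NumberOne)         AS raw_bytes,
    CHAR_LENGTH(@NumberOne) AS chars,
    LENGTH(@NumberOne)      AS bytes;
-- In raw_bytes, 20 is an ASCII space. C2A0 is a UTF-8 no-break space and
-- E280AF a narrow no-break space; REPLACE(..., ' ', '') would not match either.
```

If chars and bytes differ, the value contains at least one multi-byte character, which would point to a separator other than a plain space.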

Update

File: /path/to/data.csv

"3efa639b3a";"Censored";"Censored";"7 896";"3 468";"3 854";"5 000";"1 234";"9 654";"1 337"
MariaDB [_]> SELECT VERSION();
+-----------------+
| VERSION()       |
+-----------------+
| 10.0.31-MariaDB |
+-----------------+
1 row in set (0.00 sec)

MariaDB [_]> DROP TABLE IF EXISTS `test_spaces_extr`;
Query OK, 0 rows affected (0.07 sec)

MariaDB [_]> CREATE OR REPLACE TABLE `test_spaces_extr` (
    ->     `Identifier`   tinytext,
    ->     `First name`   tinytext,
    ->     `Last name`    tinytext,
    ->     `Number One`   int unsigned,
    ->     `Number Two`   int unsigned,
    ->     `Number Three` int unsigned,
    ->     `Number Four`  int unsigned,
    ->     `Number Five`  int unsigned,
    ->     `Number Six`   int unsigned,
    ->     `Number Seven` int unsigned
    -> );
Query OK, 0 rows affected (0.00 sec)

MariaDB [_]> LOAD DATA LOW_PRIORITY LOCAL INFILE '/path/to/data.csv'
    ->   INTO TABLE `test_spaces_extr`
    ->   CHARACTER SET utf8
    ->   FIELDS TERMINATED BY ';'
    ->   OPTIONALLY ENCLOSED BY '"'
    ->   ESCAPED BY '"'
    ->   LINES TERMINATED BY '\r\n'
    ->   (
    ->     `Identifier`,
    ->     `First name`,
    ->     `Last name`,
    ->     @`NumberOne`,
    ->     @`NumberTwo`,
    ->     @`NumberThree`,
    ->     @`NumberFour`,
    ->     @`NumberFive`,
    ->     @`NumberSix`,
    ->     @`NumberSeven`
    ->   )
    ->   SET
    ->   `Number One` = REPLACE(@`NumberOne`, ' ', ''),
    ->   `Number Two` = REPLACE(@`NumberTwo`, ' ', ''),
    ->   `Number Three` = REPLACE(@`NumberThree`, ' ', ''),
    ->   `Number Four` = REPLACE(@`NumberFour`, ' ', ''),
    ->   `Number Five` = REPLACE(@`NumberFive`, ' ', ''),
    ->   `Number Six` = REPLACE(@`NumberSix`, ' ', ''),
    ->   `Number Seven` = REPLACE(@`NumberSeven`, ' ', '');
Query OK, 1 row affected (0.00 sec)                  
Records: 1  Deleted: 0  Skipped: 0  Warnings: 0

MariaDB [_]> SELECT
    ->   `Identifier`,
    ->   `First name`,
    ->   `Last name`,
    ->   `Number One`,
    ->   `Number Two`,
    ->   `Number Three`,
    ->   `Number Four`,
    ->   `Number Five`,
    ->   `Number Six`,
    ->   `Number Seven`
    -> FROM
    ->   `test_spaces_extr`;
+------------+------------+-----------+------------+------------+--------------+-------------+-------------+------------+--------------+
| Identifier | First name | Last name | Number One | Number Two | Number Three | Number Four | Number Five | Number Six | Number Seven |
+------------+------------+-----------+------------+------------+--------------+-------------+-------------+------------+--------------+
| 3efa639b3a | Censored   | Censored  |       7896 |       3468 |         3854 |        5000 |        1234 |       9654 |         1337 |
+------------+------------+-----------+------------+------------+--------------+-------------+-------------+------------+--------------+
1 row in set (0.00 sec)
