MariaDB: convert string to int when importing from CSV, while removing spaces in number
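I have a "large" CSV file (about 1 GB of data, 3M rows) to import into a MariaDB table.
The problem is that almost every field in every row is treated as a string. So I have to convert "1 337" (string) into 1337 (integer).
Here is the script used to import the table: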
LOAD DATA LOW_PRIORITY LOCAL
INFILE 'data.txt'
INTO TABLE `test`.`test_import`
CHARACTER SET utf8
FIELDS TERMINATED BY ';'
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '"'
LINES TERMINATED BY '\r\n'
(`id`,
`data`,
@NumberOne,
@NumberTwo,
@NumberThree,
@NumberFour)
SET `Number One` = REPLACE(@NumberOne, ' ', ''),
`Number Two` = REPLACE(@NumberTwo, ' ', ''),
`Number Three` = REPLACE(@NumberThree, ' ', ''),
`Number Four` = REPLACE(@NumberFour, ' ', '');
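With this script, numbers below 999 import without any problem. But starting from 1000 (written as "1 000" in my CSV), all I get is a warning (Truncated incorrect INTEGER value: '1 000') and the value 1 in my database.
The "funny" thing is what happens when I try this: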
SET `Number One` = REPLACE(@NumberOne, '1', 'k'),
`Number Two` = REPLACE(@NumberTwo, '1', 'k'),
`Number Three` = REPLACE(@NumberThree, '1', 'k'),
`Number Four` = REPLACE(@NumberFour, '1', 'k')
-> REPLACE() works: "1 000" becomes "k 000".
So, how can I use REPLACE() to remove the spaces inside the numbers? Or, how can I make CAST()/CONVERT() work properly on strings like "1 337"?
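One thing worth checking, since the warning still shows '1 000' after REPLACE() has run: the separator may not be an ASCII space (0x20) at all. Thousands separators in exported files are sometimes a non-breaking space (U+00A0, bytes C2 A0 in UTF-8). This is an assumption, not something the question confirms. A minimal sketch for inspecting the bytes, using hypothetical user variables (@nbsp, @ascii_sample, @nbsp_sample) that are not part of the original script:

```sql
-- Hypothetical samples: one with an ASCII space, one with a UTF-8 non-breaking space.
SET @nbsp = CONVERT(UNHEX('C2A0') USING utf8);
SET @ascii_sample = '1 000';
SET @nbsp_sample = CONCAT('1', @nbsp, '000');

-- HEX() exposes the actual bytes: 20 is an ASCII space, C2A0 is a non-breaking space.
SELECT HEX(@ascii_sample) AS ascii_bytes, HEX(@nbsp_sample) AS nbsp_bytes;

-- Compare removing only ' ' with removing both kinds of space.
SELECT REPLACE(@nbsp_sample, ' ', '') AS ascii_space_removed,
       REPLACE(REPLACE(@nbsp_sample, ' ', ''), @nbsp, '') AS both_removed;
```

If HEX() on the real data (for example after loading the raw text into a VARCHAR staging column) shows C2A0 between the digits, the nested REPLACE() expression could be used in the SET clause of LOAD DATA in place of the single REPLACE().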
Some more information.
Here is a fresh test table:
CREATE OR REPLACE TABLE test_spaces_extr (
`Identifier` tinytext,
`First name` tinytext,
`Last name` tinytext,
`Number One` int unsigned,
`Number Two` int unsigned,
`Number Three` int unsigned,
`Number Four` int unsigned,
`Number Five` int unsigned,
`Number Six` int unsigned,
`Number Seven` int unsigned
);
Here is the script that imports the CSV:
LOAD DATA LOW_PRIORITY LOCAL
INFILE 'some_data.txt'
INTO TABLE `test`.`test_spaces_extr`
CHARACTER SET utf8
FIELDS TERMINATED BY ';'
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '"'
LINES TERMINATED BY '\r\n'
(`Identifier`,
`First name`,
`Last name`,
@NumberOne,
@NumberTwo,
@NumberThree,
@NumberFour,
@NumberFive,
@NumberSix,
@NumberSeven)
SET `Number One` = REPLACE(@NumberOne, ' ', ''),
`Number Two` = REPLACE(@NumberTwo, ' ', ''),
`Number Three` = REPLACE(@NumberThree, ' ', ''),
`Number Four` = REPLACE(@NumberFour, ' ', ''),
`Number Five` = REPLACE(@NumberFive, ' ', ''),
`Number Six` = REPLACE(@NumberSix, ' ', ''),
`Number Seven` = REPLACE(@NumberSeven, ' ', '');
Here is the full content of some_data.txt:
"3efa639b3a";"Censored";"Censored";"7 896";"3 468";"3 854";"5 000";"1 234";"9 654";"1 337"
(One line, yes.)
The result is as follows:
"Identifier" "First name" "Last name" "Number One" "Number Two" "Number Three" "Number Four" "Number Five" "Number Six" "Number Seven"
"3efa639b3a" "Censored" "Censored" "7896" "3468" "3854" "5000" "1234" "9654" "0"
The "number" fields do in fact become integers here. All of them except the last one ("Number Seven" -> "0").
It is getting weirder and weirder...
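That only the last field is affected hints at the line terminator rather than at REPLACE(). If the file does not end each line with exactly '\r\n' (an assumption; the question does not show the raw bytes), the last field may arrive with its quotes or a stray line-ending character still attached, and a string that starts with a quote converts to 0. A hedged sketch of that effect, and of a more tolerant cleanup using REGEXP_REPLACE() (available in MariaDB 10.0.5 and later):

```sql
-- A value that still carries its enclosing quotes converts to 0 (with a warning).
SELECT CAST('"1337"' AS UNSIGNED) AS quoted_value;

-- Sketch: keep only the digits, whatever surrounds or separates them.
-- NULLIF() turns an empty result into NULL instead of letting it be coerced to 0.
SELECT CAST(NULLIF(REGEXP_REPLACE('"1 337"', '[^0-9]', ''), '') AS UNSIGNED) AS cleaned_value;
```

In the import script, the same idea would read `Number Seven` = NULLIF(REGEXP_REPLACE(@NumberSeven, '[^0-9]', ''), '') in the SET clause. It hides the symptom rather than fixing the cause, so checking the file's real line endings against LINES TERMINATED BY is still worthwhile.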
我無法重現該問題:
$ mysql -u user -p --column-type-info
MariaDB [(none)]> SELECT VERSION();
Field 1: `VERSION()`
Catalog: `def`
Database: ``
Table: ``
Org_table: ``
Type: VAR_STRING
Collation: utf8_general_ci (33)
Length: 72
Max_length: 24
Decimals: 31
Flags: NOT_NULL
+-----------------+
| VERSION() |
+-----------------+
| 10.0.31-MariaDB |
+-----------------+
1 row in set (0.00 sec)
MariaDB [(none)]> SELECT CAST(REPLACE('1 337', ' ', '') AS UNSIGNED);
Field 1: `CAST(REPLACE('1 337', ' ', '') AS UNSIGNED)`
Catalog: `def`
Database: ``
Table: ``
Org_table: ``
Type: LONGLONG
Collation: binary (63)
Length: 5
Max_length: 4
Decimals: 0
Flags: NOT_NULL UNSIGNED BINARY NUM
+---------------------------------------------+
| CAST(REPLACE('1 337', ' ', '') AS UNSIGNED) |
+---------------------------------------------+
| 1337 |
+---------------------------------------------+
1 row in set (0.00 sec)
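For contrast, a small sketch of the same cast without REPLACE(). It reproduces the symptom from the question: the conversion stops at the first non-digit character, so only the leading 1 survives and a "Truncated incorrect INTEGER value" warning is raised. This suggests that in the failing import the REPLACE() call is not matching the separator actually present in the data.

```sql
-- Without removing the space, conversion stops at the first non-digit character.
SELECT CAST('1 337' AS UNSIGNED) AS truncated_value;

-- Lists the warning raised by the previous statement.
SHOW WARNINGS;
```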
UPDATE
File: /path/to/data.csv
"3efa639b3a";"Censored";"Censored";"7 896";"3 468";"3 854";"5 000";"1 234";"9 654";"1 337"
MariaDB [_]> SELECT VERSION();
+-----------------+
| VERSION() |
+-----------------+
| 10.0.31-MariaDB |
+-----------------+
1 row in set (0.00 sec)
MariaDB [_]> DROP TABLE IF EXISTS `test_spaces_extr`;
Query OK, 0 rows affected (0.07 sec)
MariaDB [_]> CREATE OR REPLACE TABLE `test_spaces_extr` (
-> `Identifier` tinytext,
-> `First name` tinytext,
-> `Last name` tinytext,
-> `Number One` int unsigned,
-> `Number Two` int unsigned,
-> `Number Three` int unsigned,
-> `Number Four` int unsigned,
-> `Number Five` int unsigned,
-> `Number Six` int unsigned,
-> `Number Seven` int unsigned
-> );
Query OK, 0 rows affected (0.00 sec)
MariaDB [_]> LOAD DATA LOW_PRIORITY LOCAL INFILE '/path/to/data.csv'
-> INTO TABLE `test_spaces_extr`
-> CHARACTER SET utf8
-> FIELDS TERMINATED BY ';'
-> OPTIONALLY ENCLOSED BY '"'
-> ESCAPED BY '"'
-> LINES TERMINATED BY '\r\n'
-> (
-> `Identifier`,
-> `First name`,
-> `Last name`,
-> @`NumberOne`,
-> @`NumberTwo`,
-> @`NumberThree`,
-> @`NumberFour`,
-> @`NumberFive`,
-> @`NumberSix`,
-> @`NumberSeven`
-> )
-> SET
-> `Number One` = REPLACE(@`NumberOne`, ' ', ''),
-> `Number Two` = REPLACE(@`NumberTwo`, ' ', ''),
-> `Number Three` = REPLACE(@`NumberThree`, ' ', ''),
-> `Number Four` = REPLACE(@`NumberFour`, ' ', ''),
-> `Number Five` = REPLACE(@`NumberFive`, ' ', ''),
-> `Number Six` = REPLACE(@`NumberSix`, ' ', ''),
-> `Number Seven` = REPLACE(@`NumberSeven`, ' ', '');
Query OK, 1 row affected (0.00 sec)
Records: 1 Deleted: 0 Skipped: 0 Warnings: 0
MariaDB [_]> SELECT
-> `Identifier`,
-> `First name`,
-> `Last name`,
-> `Number One`,
-> `Number Two`,
-> `Number Three`,
-> `Number Four`,
-> `Number Five`,
-> `Number Six`,
-> `Number Seven`
-> FROM
-> `test_spaces_extr`;
+------------+------------+-----------+------------+------------+--------------+-------------+-------------+------------+--------------+
| Identifier | First name | Last name | Number One | Number Two | Number Three | Number Four | Number Five | Number Six | Number Seven |
+------------+------------+-----------+------------+------------+--------------+-------------+-------------+------------+--------------+
| 3efa639b3a | Censored | Censored | 7896 | 3468 | 3854 | 5000 | 1234 | 9654 | 1337 |
+------------+------------+-----------+------------+------------+--------------+-------------+-------------+------------+--------------+
1 row in set (0.00 sec)