Import multiple CSV files to mysql database and create tables for them
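I have a folder containing hundreds of CSV files. Each file is named after a date, because new data is written to my directory every day, e.g. 2020-01-15.csv, 2020-01-16.csv, 2020-01-17.csv and so on. I am looking for the best way to import these files into a MySQL database every day and create a table for each file (if a table named after the file already exists, it does not need to be created).
So far I have used mysqlimport to import files into MySQL, but only for a single file at a time, and it looks like I do not know the tool well enough for this case. This is what I have tried so far in bash: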
mysqlimport -h localhost -umyusername -pmypassword database_name /path/to/my/data/*.csv
I get the error:
mysqlimport: Error: 1146, Table 'database_name.2020-01-15' doesn't exist, when using table: 2020-01-15
Can someone help me solve this? Would there be an easier way to do it in Python? Thanks in advance.
Structure of a single file:
['date,id,name,gsmCount,userCount,regionCount\n',
'2020-01-25,g45ddf-54fdfd4,GammaY,22142,3212,132\n',
'2020-01-25,g412ddf-54re321d4,BetaT,351871,734,67\n',
'2020-01-25,fsdsf579hhh-fgd4,LambdaD,367,41,7\n']
So this is my current script:
#!/bin/bash
# show commands being executed, per debug
set -x
# define database connectivity
_db="mydatabasename"
_db_user="myusername"
_db_password="mypassword"
# define directory containing CSV files
_csv_directory="/path/to/my/data"
# go into directory
cd $_csv_directory || exit
# edit file name
rename "s/ //g" *.csv
rename "s/^/tp/g" *.csv
# get a list of CSV files in directory
_csv_files=`ls -1 *.csv`
# loop through csv files
for _csv_file in ${_csv_files[@]}
do
# remove file extension
_csv_file_extensionless=`echo "$_csv_file" | sed 's/\(.*\)\..*/\1/'`
# define table name
_table_name="${_csv_file_extensionless}"
# get header columns from CSV file
_header_columns=`head -1 $_csv_directory/$_csv_file | tr ',' '\n' | sed 's/"//' | sed 's/ /_/g'`
_header_columns_string=`head -1 $_csv_directory/$_csv_file | sed 's/ /_/g' | sed 's/"//g' | sed 's/(//g' | sed 's/)//g'`
# ensure table exists
mysql -u $_db_user -p$_db_password $_db << eof
CREATE TABLE IF NOT EXISTS \`$_table_name\` ENGINE=MyISAM DEFAULT CHARSET=utf8
eof
# loop through header columns
for _header in "${_header_columns[@]}"
do
# add column
mysql -u $_db_user -p$_db_password $_db --execute="alter table \`$_table_name\` add column IF NOT EXISTS \`$_header\` text"
done
# import csv into mysql
mysqlimport --fields-enclosed-by='"' --fields-terminated-by=',' --lines-terminated-by="\n" --columns=$_header_columns_string -u $_db_user -p$_db_password $_db $_csv_directory/$_csv_file
done
exit
This is the error I get when running the above:
myserver:~ user_name$ bash -c -l "/path/to/my/script/uploadmysql.sh"
+ _db=mydatabasename
+ _db_user=myusername
+ _db_password=mypassword
+ _csv_directory=/path/to/my/data
+ cd /path/to/my/data
+ rename 's/ //g' 2020-01-25.csv 2020-01-26.csv 2020-01-27.csv
/path/to/my/script/uploadmysql.sh: line 19: rename: command not found
+ rename 's/^/tp/g' 2020-01-25.csv 2020-01-26.csv 2020-01-27.csv
/path/to/my/script/uploadmysql.sh: line 20: rename: command not found
++ ls -1 2020-01-25.csv 2020-01-26.csv 2020-01-27.csv
+ _csv_files='2020-01-25.csv
2020-01-26.csv
2020-01-27.csv'
+ for _csv_file in '${_csv_files[@]}'
++ echo 2020-01-25.csv
++ sed 's/\(.*\)\..*/\1/'
+ _csv_file_extensionless=2020-01-25
+ _table_name=2020-01-25
++ head -1 /path/to/my/data/2020-01-25.csv
++ tr , '\n'
++ sed 's/"//'
++ sed 's/ /_/g'
+ _header_columns='date
id
Name
gsmCount
userCount
regionCount'
++ head -1 /path/to/my/data/2020-01-25.csv
++ sed 's/ /_/g'
++ sed 's/"//g'
++ sed 's/(//g'
++ sed 's/)//g'
+ _header_columns_string=date,id,Name,gsmCount,userCount,regionCount
+ mysql -u myusername -pmypassword mydatabase
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1113 (42000) at line 1: A table must have at least 1 column
+ for _header in '"${_header_columns[@]}"'
+ mysql -u myusername -pmypassword mydatabase '--execute=alter table `2020-01-25` add column IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text'
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text' at line 1
+ mysqlimport '--fields-enclosed-by="' --fields-terminated-by=, '--lines-terminated-by=\n' --columns=date,id,Name,gsmCount,userCount,regionCount -u myusername -pmypassword mydatabase /path/to/my/data/2020-01-25.csv
mysqlimport: [Warning] Using a password on the command line interface can be insecure.
mysqlimport: Error: 1146, Table 'mydatabase.2020-01-25' doesn't exist, when using table: 2020-01-25
+ for _csv_file in '${_csv_files[@]}'
++ echo 2020-01-26.csv
++ sed 's/\(.*\)\..*/\1/'
+ _csv_file_extensionless=2020-01-26
+ _table_name=2020-01-26
++ head -1 /path/to/my/data/2020-01-26.csv
++ tr , '\n'
++ sed 's/"//'
++ sed 's/ /_/g'
+ _header_columns='date
id
Name
gsmCount
userCount
regionCount'
++ head -1 /path/to/my/data/2020-01-26.csv
++ sed 's/ /_/g'
++ sed 's/"//g'
++ sed 's/(//g'
++ sed 's/)//g'
+ _header_columns_string=date,id,Name,gsmCount,userCount,regionCount
+ mysql -u myusername -pmypassword mydatabase
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1113 (42000) at line 1: A table must have at least 1 column
+ for _header in '"${_header_columns[@]}"'
+ mysql -u myusername -pmypassword mydatabase '--execute=alter table `2020-01-26` add column IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text'
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text' at line 1
+ mysqlimport '--fields-enclosed-by="' --fields-terminated-by=, '--lines-terminated-by=\n' --columns=date,id,Name,gsmCount,userCount,regionCount -u myusername -pmypassword mydatabase /path/to/my/data/2020-01-26.csv
mysqlimport: [Warning] Using a password on the command line interface can be insecure.
mysqlimport: Error: 1146, Table 'mydatabase.2020-01-26' doesn't exist, when using table: 2020-01-26
+ for _csv_file in '${_csv_files[@]}'
++ echo 2020-01-27.csv
++ sed 's/\(.*\)\..*/\1/'
+ _csv_file_extensionless=2020-01-27
+ _table_name=2020-01-27
++ head -1 /path/to/my/data/2020-01-27.csv
++ tr , '\n'
++ sed 's/"//'
++ sed 's/ /_/g'
+ _header_columns='date
id
Name
gsmCount
userCount
regionCount'
++ head -1 /path/to/my/data/2020-01-27.csv
++ sed 's/ /_/g'
++ sed 's/"//g'
++ sed 's/(//g'
++ sed 's/)//g'
+ _header_columns_string=date,id,Name,gsmCount,userCount,regionCount
+ mysql -u myusername -pmypassword mydatabase
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1113 (42000) at line 1: A table must have at least 1 column
+ for _header in '"${_header_columns[@]}"'
+ mysql -u myusername -pmypassword mydatabase '--execute=alter table `2020-01-27` add column IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text'
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text' at line 1
+ mysqlimport '--fields-enclosed-by="' --fields-terminated-by=, '--lines-terminated-by=\n' --columns=date,id,Name,gsmCount,userCount,regionCount -u myusername -pmypassword mydatabase /path/to/my/data/2020-01-27.csv
mysqlimport: [Warning] Using a password on the command line interface can be insecure.
mysqlimport: Error: 1146, Table 'mydatabase.2020-01-27' doesn't exist, when using table: 2020-01-27
+ exit
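I am still trying to get rid of these errors and import multiple CSV files into MySQL as tables. Can someone give me a hint on how to solve these problems? Thanks in advance.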
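Reading the trace, three things seem to go wrong. The CREATE TABLE statement lists no columns, which gives ERROR 1113. `_header_columns` is a plain string rather than a bash array, so the inner loop runs only once, with all six names joined by newlines. And `ADD COLUMN IF NOT EXISTS` is MariaDB syntax that MySQL rejects, which gives ERROR 1064. Because the table is never created, mysqlimport then fails with error 1146. A minimal sketch of one way around this, assuming every column can be stored as TEXT as in the script above, is to split the header into a real array and emit a single CREATE TABLE statement. The function name `build_create_table_sql` is hypothetical.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print a CREATE TABLE statement for one CSV file.
# $1 = table name, $2 = path to a CSV file whose first line is the header.
# Assumption: every column is created as TEXT, as in the original script.
build_create_table_sql() {
    local table_name=$1 csv_file=$2
    local header name defs=""
    local columns=()

    # Read only the header line and drop a trailing carriage return, if any.
    IFS= read -r header < "$csv_file"
    header=${header%$'\r'}

    # Split on commas into a real bash array (one element per column).
    IFS=',' read -r -a columns <<< "$header"

    for name in "${columns[@]}"; do
        name=${name//\"/}      # strip double quotes
        name=${name// /_}      # spaces become underscores
        defs+="${defs:+, }\`${name}\` TEXT"
    done

    printf 'CREATE TABLE IF NOT EXISTS `%s` (%s);\n' "$table_name" "$defs"
}
```

The printed statement can be piped into the mysql client in place of the heredoc and the ALTER TABLE loop, so the table already has all of its columns before mysqlimport runs.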
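This example from execsql.py ( https://pypi.org/project/execsql/ ) shows how to get all of the file names in a directory, iterate over them, and import each file into its own table: http://execsql.osdn.io/examples.html#example-13-import-all-the-csv-files-in-a-directory . The example is written for Postgres rather than MySQL, and it puts the tables in a staging schema (named "staging"), but it could easily be modified to work with MySQL.
Disclaimer: I wrote execsql.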
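For comparison, the same pattern (list the files, loop over them, load each one into its own table) can be sketched in bash against MySQL directly. This is only a sketch under several assumptions: the header line holds plain, unquoted column names as in the sample file, all columns are stored as TEXT, credentials come from a client option file such as ~/.my.cnf rather than the command line, and both server and client allow LOCAL data loading. The function name `import_csv_dir` is hypothetical.

```bash
#!/usr/bin/env bash

# Hypothetical sketch: create one table per CSV file in a directory and load it.
# $1 = directory with files such as 2020-01-15.csv, $2 = database name.
# Assumption: user and password are read from the [client] section of ~/.my.cnf.
import_csv_dir() {
    local csv_dir=$1 db=$2
    local csv_file table_name header column_defs
    local q='`'    # literal backtick used to quote identifiers

    for csv_file in "$csv_dir"/*.csv; do
        [ -e "$csv_file" ] || continue    # the glob matched nothing

        # mysqlimport names the table after the file, minus its extension.
        table_name=$(basename "$csv_file" .csv)

        # The first line holds the column names, e.g. date,id,name
        IFS= read -r header < "$csv_file"
        header=${header%$'\r'}

        # Turn "a,b,c" into "`a` TEXT, `b` TEXT, `c` TEXT".
        column_defs="${q}${header//,/${q} TEXT, ${q}}${q} TEXT"

        # Create the table with all of its columns in a single statement.
        mysql "$db" --execute="CREATE TABLE IF NOT EXISTS ${q}${table_name}${q} (${column_defs})" || return 1

        # Load the rows; --ignore-lines=1 skips the header line.
        mysqlimport --local --ignore-lines=1 \
            --fields-terminated-by=',' \
            --fields-optionally-enclosed-by='"' \
            --lines-terminated-by='\n' \
            --columns="$header" \
            "$db" "$csv_file" || return 1
    done
}
```

It would be called as `import_csv_dir /path/to/my/data mydatabasename`. Note that running it again loads every file a second time, so for a daily job the processed files would probably need to be moved elsewhere or otherwise tracked.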