
How to Import Multiple csv files into a MySQL Database

Is there a way to import multiple csv files at the same time into a MySQL database? Some sort of batch import?

I'm on Mac OSX running a MAMP server.

I have 185 csv files that I need to import into a MySQL table. I can import them individually using phpMyAdmin's import tab, but it would take a long time. Does anyone know if there is a better way?

Use a shell script like this:

#!/usr/bin/env bash
cd yourdirectory
for f in *.csv
do
        mysql --local-infile=1 -e "USE yourDatabase; LOAD DATA LOCAL INFILE '$f' INTO TABLE yourtable"
done

Here's a little PHP script for you:

#!/usr/bin/php
<?php
// Note: the mysql_* functions used here were removed in PHP 7;
// on modern installs use mysqli or PDO instead.
mysql_connect('localhost', 'root', 'root'); // MAMP defaults
mysql_select_db('yourdatabase');
$files = glob('*.csv');
foreach ($files as $file) {
    mysql_query("LOAD DATA INFILE '" . $file . "' INTO TABLE yourtable");
}

See the MySQL Manual for LOAD DATA INFILE options which fit your documents.
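If you build the statement programmatically, it helps to keep the option clauses in one place. Here is a minimal Python sketch, where the table name, delimiters, and header handling are placeholders to be matched against your own files:

```python
def build_load_statement(csv_path, table="yourtable"):
    """Build a LOAD DATA LOCAL INFILE statement for one CSV file.
    The FIELDS/LINES options below are placeholders -- adjust them to
    your files using the LOAD DATA page of the MySQL manual."""
    safe = csv_path.replace("'", "\\'")  # escape quotes inside the SQL string literal
    return (
        f"LOAD DATA LOCAL INFILE '{safe}' "
        f"INTO TABLE {table} "
        "FIELDS TERMINATED BY ',' ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n' "
        "IGNORE 1 LINES"
    )
```

The resulting string can then be passed to whatever client library you use.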

You could use a shell script to loop through the files (this one assumes they're in the current directory):

#!/bin/bash

for f in *.csv
do
    mysql -e "load data infile '$f' into table my_table" -u username --password=your_password my_database
done

I've modified Tom's script to solve a few issues that I faced:

#!/bin/bash

for f in *.csv
do
    mysql -e "load data local infile '$f' into table myTable fields TERMINATED BY ',' LINES TERMINATED BY '\n'" -u myUser --password=myPassword fmeter --local-infile
done
  1. load data local infile instead of load data infile: with LOCAL, the file is read by the client and sent to the server, so it doesn't have to sit on the MySQL server's own filesystem.
  2. Added delimiter switches to match my data.
  3. --local-infile to enable local data load mode on the client.
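If you prefer to drive the same loop from Python, here is a minimal sketch (the user, password, database, and table names are placeholders) that builds one mysql command line per CSV file, mirroring the shell loop above:

```python
import glob
import subprocess  # each returned command can be run with subprocess.run(cmd)

def mysql_import_commands(pattern="*.csv", user="myUser",
                          password="myPassword", database="fmeter",
                          table="myTable"):
    """Build one `mysql` CLI invocation per CSV file matching `pattern`.
    --local-infile enables LOAD DATA LOCAL on the client side."""
    commands = []
    for f in sorted(glob.glob(pattern)):
        sql = (f"load data local infile '{f}' into table {table} "
               "fields TERMINATED BY ',' LINES TERMINATED BY '\\n'")
        commands.append(["mysql", "-e", sql, "-u", user,
                         f"--password={password}", database, "--local-infile"])
    return commands
```

Each list can then be handed to subprocess.run(cmd, check=True) so a failed import stops the batch.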

For Windows users, use this batch file:

@echo off
setlocal enabledelayedexpansion
FOR %%f IN ("*.csv") DO (
  set old=%%~dpnxf
  set new=!old:\=\\!
  mysql -e "load data local infile '!new!' IGNORE into table email_us.business COLUMNS TERMINATED BY ','" -u root
  echo %%~nxf DONE
)
  • email_us -> DB
  • business -> Table
  • IGNORE -> skip duplicate rows and keep going when a row errors
  • ~dpnxf -> d for drive letter, p for path to file, n for filename, x for extension and f is file variable

Steps:
  1. Put the batch file in the directory where all the CSV files are and name it something.bat.
  2. Run cmd.exe as administrator, call that something.bat file, and enjoy importing...
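The !old:\=\\! substitution above exists because MySQL treats the backslash as an escape character inside string literals, so Windows paths need their backslashes doubled before going into the LOAD DATA statement. A minimal Python sketch of the same transformation:

```python
def escape_windows_path(path):
    """Double each backslash so a Windows path survives inside a MySQL
    single-quoted string literal (what !old:\=\\! does in the batch file)."""
    return path.replace("\\", "\\\\")
```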

In Python you can use d6tstack, which makes this simple:

import d6tstack
import glob

c = d6tstack.combine_csv.CombinerCSV(glob.glob('*.csv'))
c.to_mysql_combine('mysql+mysqlconnector://usr:pwd@localhost/db', 'tablename')

It also deals with data schema changes, creates the table, and allows you to preprocess the data.

I had the same task to do with a lot of CSV files, creating one table per CSV, so here is the script I use locally under XAMPP.

<?php
ini_set('display_errors',1);
echo '### Begin Importation<br>';

$mysqli  =  new mysqli(
"localhost",
"root",
"",
"mydatabase",
3306
);

if (mysqli_connect_errno()) {
    printf("Connect failed: %s\n", mysqli_connect_error());
    exit();
}

$files = glob('C:\\xampp\\mysql\\data\\mev2\\*.csv');

foreach($files as $file){

    //clean names if needed
    $filename = explode('\\',$file);
    $filename2clean = str_replace('.csv','', $filename[5]);//because my file is under 5 folders on my PC
    $n = strtolower(str_replace('fileprefix_','', $filename2clean));

    echo '<br>Create table <b>'.$n.'</b><hr>';

    $sql = "CREATE TABLE IF NOT EXISTS `mydatabase`.`".$n."` (`email` varchar(60), `lastname` varchar(60), `firstname` varchar(60), `country` varchar(19)) DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci;";

    if (!($stmt = $mysqli->query($sql))) {
        echo "\nQuery execute failed: ERRNO: (" . $mysqli->errno . ") " . $mysqli->error;
    };

    echo '<br>Import data from <b>'.$n.'</b><hr>';

    $sql = "LOAD DATA INFILE '".basename($file)."' INTO TABLE `mydatabase`.`".$n."`  
        FIELDS TERMINATED BY ';'
        LINES TERMINATED BY '\r'  
        IGNORE 1 LINES";

    if (!($stmt = $mysqli->query($sql))) {
        echo "\nQuery execute failed: ERRNO: (" . $mysqli->errno . ") " . $mysqli->error;
    };

}

echo '### Import finished !<br>';

@hlosukwakha you want to use mysqlimport. It looks for a table named like the file. Use mysqlimport --help to find the correct parameters, but they're basically identical to those of mysql.
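As a sketch of mysqlimport's naming rule (the file name, minus directory and extension, determines the target table), here is the mapping in Python:

```python
from pathlib import Path

def mysqlimport_table_name(csv_path):
    """mysqlimport loads each file into the table whose name matches
    the file name with directory and extension stripped -- e.g.
    /data/customers.csv goes into the `customers` table."""
    return Path(csv_path).stem
```

So before running mysqlimport, rename each CSV to match the table it should land in.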

Use the following shell script:

for file in /directory/*.csv
do
    echo "Importing file $file"
    chown mysql "$file"
    mysql Fortinet -u user -p'password' <<EOF
LOAD DATA LOCAL INFILE '$file'
IGNORE
INTO TABLE tablename
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;
EOF
    echo "Completed importing '$file'"
done

I used Python and d6tstack like @citynorman, but because I had 24 million lines in 200 CSV files, that approach was killing my development database server.

This approach gives you a lot of control and performance in two or three lines of code. It pulled the 24 million rows into a five-index-column MySQL table, and added data clean-ups, in around 2 minutes. The CSV import tool in MySQL Workbench was taking days to do the same thing.

Here's what I made work:

import glob
import d6tstack.combine_csv as d6tc
import pymysql  # This approach also supports other MySQL connectors
from sqlalchemy import create_engine
engine = create_engine("mysql+pymysql://usr:pass@host:3306/db")

# For testing just pull in one or two csv files - and then take all
# My data had a ; semicolon separator, so change this to your use case if needed
df = d6tc.CombinerCSV(glob.glob('C:/Users/user/Downloads/csvfiles/*.csv'), sep=';').to_pandas()

# Remove Filepath and Filename 
df.drop(columns=["filepath","filename"],inplace=True, axis=1)

# I created Indexes in my database file during testing, so this line
# makes sure there are no null index values in the CSVs
df = df[df['country'].notna()]

# chunksize throttles your database updates so as not to overwhelm any buffers
# NEVER use "if_exists=replace", unless you want to blank your table 100%
df.to_sql(name='table', con=engine, if_exists='append', index=False, chunksize=200)  
