What is the best way to dump MySQL table data to csv and convert character encoding?

I have a table with about 200 columns. I need to take a dump of the daily transaction data for ETL purposes. It's a MySQL DB. I tried this in Python, both with a pandas DataFrame and with a basic write-to-CSV-file approach. I also looked for a way to do the same thing with a shell script; I saw one for an Oracle database using sqlplus. Here is my Python code for the two approaches:

Using pandas:

import MySQLdb as mdb
import pandas as pd

host = ""
user = ''
pass_ = ''
db = ''

query = 'SELECT * FROM TABLE1'

conn = mdb.connect(host=host,
                   user=user, passwd=pass_,
                   db=db)

df = pd.read_sql(query, con=conn)
df.to_csv('resume_bank.csv', sep=',')

Using a basic Python file write:

import MySQLdb
import csv
import datetime

currentDate = datetime.datetime.now().date()

host = ""
user = ''
pass_ = ''
db = ''
table = ''

con = MySQLdb.connect(user=user, passwd=pass_, host=host, db=db, charset='utf8')
cursor = con.cursor()

query = "SELECT * FROM %s;" % table
cursor.execute(query)

with open('Data_on_%s.csv' % currentDate, 'w') as f:
    writer = csv.writer(f)
    for row in cursor.fetchall():
        writer.writerow(row)

print('Done')

The table has about 300,000 records. Both Python scripts take too long to run.

Also, there's an encoding issue. The DB result set has some latin-1 characters, for which I'm getting errors like: UnicodeEncodeError: 'ascii' codec can't encode character '\x96' in position 1078: ordinal not in range(128).

I need to save the CSV in Unicode format. Can you please help me with the best approach for this task?

A Unix-based or Python-based solution will work for me. The script needs to run daily to dump the daily data.

You can achieve that using MySQL alone. For example:

SELECT * FROM your_table WHERE ...
INTO OUTFILE 'your_file.csv'
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\n';

If you need to schedule the query, put it in a file (e.g., csv_dump.sql) and create a cron task like this one:

00 00 * * * mysql -h your_host -u user -ppassword < /foo/bar/csv_dump.sql
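If the exported file turns out not to be in the encoding the ETL step expects, it can be re-encoded after the dump. The following is a minimal sketch, not part of the original answer: the function name convert_encoding, its parameters, and the assumption that the export is in cp1252 (the encoding MySQL calls latin1) are illustrative and should be checked against the real server settings.

```python
def convert_encoding(src_path, dst_path,
                     src_encoding="cp1252", dst_encoding="utf-8",
                     chunk_size=1024 * 1024):
    """Re-encode a text file in chunks so it never has to fit in memory.

    src_encoding is an assumption about how the dump was written;
    decoding raises UnicodeDecodeError if the file contains bytes that
    are not valid in that encoding.
    """
    # newline="" disables newline translation, so line endings are
    # copied through exactly as they appear in the source file.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        while True:
            chunk = src.read(chunk_size)  # up to chunk_size characters
            if not chunk:
                break
            dst.write(chunk)
```

A call such as convert_encoding('your_file.csv', 'your_file_utf8.csv') could run right after the mysql command in the same cron job.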

For strings, this uses the default character encoding, which happens to be ASCII, and it fails when the data contains non-ASCII characters. You want unicode instead of str.

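# Python 2: encode every field as UTF-8 before handing it to csv
rows = cursor.fetchall()
f = open('Data_on_%s.csv' % currentDate, 'w')
myFile = csv.writer(f)
for row in rows:
    # one CSV line per database row
    myFile.writerow([unicode(s).encode("utf-8") for s in row])
f.close()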
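The snippet above relies on the Python 2 unicode type. On Python 3 the same idea can be expressed by opening the output file with an explicit encoding, so the csv module never falls back to ASCII. This is a sketch under assumptions rather than the answerer's code: the helper name dump_cursor_to_csv, the batch_size parameter, and the added header row are illustrative. It accepts any DB-API cursor on which execute() has already been called.

```python
import csv


def dump_cursor_to_csv(cursor, path, batch_size=10000):
    """Write the rows of an already-executed DB-API cursor to a UTF-8 CSV.

    Returns the number of data rows written (the header is not counted).
    """
    # encoding="utf-8" avoids the ASCII default; newline="" is what the
    # csv module expects so that it controls line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        # cursor.description has one entry per column; item 0 is its name.
        writer.writerow([col[0] for col in cursor.description])
        row_count = 0
        while True:
            # Fetch in batches instead of building one huge list
            # with fetchall().
            rows = cursor.fetchmany(batch_size)
            if not rows:
                break
            writer.writerows(rows)
            row_count += len(rows)
    return row_count
```

With the question's variables this would be called as dump_cursor_to_csv(cursor, 'Data_on_%s.csv' % currentDate) after cursor.execute(query). Whether batching helps depends on the driver, since a driver's default cursor may already buffer the whole result set on the client.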

You can use mysqldump for this task. (Source for command)

mysqldump -u username -p --tab=/path/to/directory --fields-terminated-by=',' dbname table_name

The arguments are as follows:

  • -u username for the username
  • -p to indicate that a password should be used (mysqldump prompts for it)
  • -ppassword to give the password on the command line instead (no space after -p)
  • --tab=/path/to/directory (short form -T) to produce tab-separated data files in that directory
  • --fields-terminated-by=',' to separate fields with commas instead of tabs

For more command line switches see https://dev.mysql.com/doc/refman/5.5/en/mysqldump.html

To run it on a regular basis, create a cron task as described in the other answers.
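For a Python-driven daily job, the same mysqldump call can be assembled and launched with subprocess. The sketch below is illustrative only: build_mysqldump_command, run_dump, and all of their parameters are hypothetical names, and it assumes credentials live in an option file passed through --defaults-extra-file, because an interactive -p prompt cannot be answered from cron. The --default-character-set value mirrors the charset='utf8' used in the question and may need adjusting.

```python
import subprocess


def build_mysqldump_command(user, database, table, out_dir,
                            charset="utf8", defaults_file=None):
    """Return the mysqldump argument list for a comma-separated table dump."""
    cmd = ["mysqldump"]
    if defaults_file is not None:
        # Must be the first option; the file is expected to hold the password.
        cmd.append("--defaults-extra-file=" + defaults_file)
    cmd += [
        "-u", user,
        "--default-character-set=" + charset,
        "--tab=" + out_dir,  # the data file is written into this directory
        "--fields-terminated-by=,",
        '--fields-optionally-enclosed-by="',
        database,
        table,
    ]
    return cmd


def run_dump(user, database, table, out_dir, **kwargs):
    """Run mysqldump; raises CalledProcessError on a non-zero exit."""
    cmd = build_mysqldump_command(user, database, table, out_dir, **kwargs)
    # A list with no shell=True means no shell quoting is needed.
    subprocess.run(cmd, check=True)
```

Keeping the command builder separate from the function that runs it makes it possible to inspect or log the exact arguments without touching the database.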
