Write Pandas DataFrame into an existing MySQL Database Table
I created a database named **test** using phpMyAdmin, containing a table named **client_info**. The table in this database is empty (as shown in the attached image).

On the other hand, I wrote code in Python that reads several CSV files and then extracts specific columns into a DataFrame called **Client_Table1**. This DataFrame contains several rows and 3 columns.

So far, I have written the following code:
import pandas as pd
import glob

path = r'D:\SWAM\ERP_Data'  # path of the data
all_files = glob.glob(path + "/*.csv")

li = []
for filename in all_files:
    df = pd.read_csv(filename, sep=';', index_col=None, header=0, encoding='latin-1')
    #df = pd.read_csv(filename, sep='\t', index_col=None, header=0)
    li.append(df)

ERP_Data = pd.concat(li, axis=0, ignore_index=True)

# rename the columns
ERP_Data.columns = ['Client_ID', 'Client_Name', 'FORME_JURIDIQUE_CLIENT', 'CODE_ACTIVITE_CLIENT', 'LIB_ACTIVITE_CLIENT', 'NACE',
                    'Company_Type', 'Number_of_Collected_Bins', 'STATUT_TI', 'TYPE_TI', 'HEURE_PASSAGE_SOUHAITE', 'FAMILLE_AFFAIRE',
                    'CODE_AFFAIRE_MOUVEMENT', 'TYPE_MOUVEMENT_PIECE', 'Freq_Collection', 'Waste_Type', 'CDNO', 'CDQTE',
                    'BLNO', 'Collection_Date', 'Weight_Ton', 'Bin_Capacity', 'REF_SS_REF_CONTENANT_BL', 'REF_DECHET_PREVU_TI',
                    'Site_ID', 'Site_Name', 'Street', 'ADRCPL1_SITE', 'ADRCPL2_SITE', 'Post_Code',
                    'City', 'Country', 'ZONE_POLYGONE_SITE', 'OBSERVATION_SITE', 'OBSERVATION1_SITE', 'HEURE_DEBUT_INTER_MATIN_SITE',
                    'HEURE_FIN_INTER_MATIN_SITE', 'HEURE_DEBUT_INTER_APREM_SITE', 'HEURE_DEBUT_INTER_APREM_SITE', 'JOUR_PASSAGE_INTERDIT', 'PERIODE_PASSAGE_INTERDIT', 'JOUR_PASSAGE_IMPERATIF',
                    'PERIODE_PASSAGE_IMPERATIF']

# extract specific columns
Client_Table = ERP_Data[['Client_ID', 'Client_Name', 'NACE']].copy()

# remove duplicate rows
Client_Table1 = Client_Table.drop_duplicates(subset=["Client_ID", "Client_Name", "NACE"])
I want to write the Pandas DataFrame (i.e. **Client_Table1**) into the existing MySQL database (i.e. **test**), specifically into the table **client_info**.

The expected output in the MySQL database (i.e. **test**) would be:

writing the **Client_ID** column (i.e., the values of the **Client_ID** column) into the MySQL database column **code**

writing the **Client_Name** column into the MySQL database column **name**

writing the **NACE** column into the MySQL database column **nac**
Ideally, for any database operation you need something like the following. This is just a concept:
import pymysql

# Connect to the database.
connection = pymysql.connect(host='localhost',
                             user='<user>',
                             password='<pass>',
                             db='<db_name>')

# Create a cursor.
cursor = connection.cursor()

# Insert the DataFrame records one by one.
sql = "INSERT INTO client_info(code, name, nac) VALUES(%s, %s, %s)"
for i, row in Client_Table1.iterrows():
    cursor.execute(sql, tuple(row))

# The connection is not autocommitted by default,
# so we must commit to save our changes.
connection.commit()
connection.close()
That is just a concept. I could not test the code I wrote, so there may be some errors and you may need to debug it. For example, the data types may not match, because I treated every value as a string with %s. Please read up on this in detail.
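One way to sidestep those type issues is to convert each row to native Python values before inserting, and to send them in a single batch. The sketch below uses hypothetical sample data in place of the real `Client_Table1`; the actual `executemany` call is shown commented out since it needs a live connection:

```python
import pandas as pd

# Hypothetical sample data standing in for Client_Table1.
Client_Table1 = pd.DataFrame({
    'Client_ID': [101, 102],
    'Client_Name': ['Acme', 'Globex'],
    'NACE': ['3811', '3821'],
})

# Convert each row to native Python strings so the MySQL driver
# receives plain values rather than NumPy scalars.
rows = [(str(r.Client_ID), str(r.Client_Name), str(r.NACE))
        for r in Client_Table1.itertuples(index=False)]

sql = "INSERT INTO client_info(code, name, nac) VALUES (%s, %s, %s)"

# With a live pymysql connection this would be executed as:
# with connection.cursor() as cursor:
#     cursor.executemany(sql, rows)
# connection.commit()
```

`executemany` sends all rows in one call, which is usually faster than calling `execute` inside a loop.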
You can create a separate method with an SQL statement for each table and then run them all at the end. Again, this is just a concept and can be generalized further.
def insert_into_client_info():
    # Create a cursor.
    cursor = connection.cursor()

    # Insert the DataFrame records one by one.
    sql = "INSERT INTO client_info(code, name, nac) VALUES(%s, %s, %s)"
    for i, row in Client_Table1.iterrows():
        cursor.execute(sql, tuple(row))

    # The connection is not autocommitted by default,
    # so we must commit to save our changes.
    connection.commit()
    cursor.close()

def insert_into_any_table():
    "a_cursor"
    "a_sql"
    "a_for_loop"
    connection.commit()
    cursor.close()

## Run all the functions one after another.
insert_into_client_info()
insert_into_any_table()

# Close the connection at the end.
connection.close()
I wrote this answer for another user earlier this morning and thought it might help you as well.

This code reads from a CSV file and uses pandas and sqlalchemy. Let me know if you need any adjustments to help you more specifically.

The code below does the following:

An engine (connection) is created.

The CSV data is read and edited into a DataFrame.

The DataFrame is used to store the data into MySQL. import csv
import pandas as pd
from sqlalchemy import create_engine
# Set database credentials.
creds = {'usr': 'admin',
'pwd': '1tsaSecr3t',
'hst': '127.0.0.1',
'prt': 3306,
'dbn': 'playground'}
# MySQL connection string.
connstr = 'mysql+mysqlconnector://{usr}:{pwd}@{hst}:{prt}/{dbn}'
# Create sqlalchemy engine for MySQL connection.
engine = create_engine(connstr.format(**creds))
# Read addresses from the CSV file.
text = list(csv.reader(open('comma_test.csv'), skipinitialspace=True))
# Replace all commas which are not used as field separators.
# Remove additional whitespace.
for idx, row in enumerate(text):
text[idx] = [i.strip().replace(',', '') for i in row]
# Store data into a DataFrame.
df = pd.DataFrame(data=text, columns=['number', 'address'])
# Write DataFrame to MySQL using the engine (connection) created above.
df.to_sql(name='commatest', con=engine, if_exists='append', index=False)
Original file (comma_test.csv):

"12345" , "123 abc street, Unit 345"
"10101" , "111 abc street, Unit 111"
"20202" , "222 abc street, Unit 222"
"30303" , "333 abc street, Unit 333"
"40404" , "444 abc street, Unit 444"
"50505" , "abc DR, UNIT# 123 UNIT 123"

Raw rows as read by csv.reader:

['12345 ', '123 abc street, Unit 345']
['10101 ', '111 abc street, Unit 111']
['20202 ', '222 abc street, Unit 222']
['30303 ', '333 abc street, Unit 333']
['40404 ', '444 abc street, Unit 444']
['50505 ', 'abc DR, UNIT# 123 UNIT 123']

Rows after stripping whitespace and removing embedded commas:

['12345', '123 abc street Unit 345']
['10101', '111 abc street Unit 111']
['20202', '222 abc street Unit 222']
['30303', '333 abc street Unit 333']
['40404', '444 abc street Unit 444']
['50505', 'abc DR UNIT# 123 UNIT 123']

Resulting DataFrame, as written to MySQL:

number  address
12345   123 abc street Unit 345
10101   111 abc street Unit 111
20202   222 abc street Unit 222
30303   333 abc street Unit 333
40404   444 abc street Unit 444
50505   abc DR UNIT# 123 UNIT 123
This is a long-winded approach. However, each step is intentionally broken out to clearly show what is involved.
Try this documentation: you have to create a connection first and then write the data into your database.
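A minimal sketch of that approach, assuming the column mapping from the question (Client_ID → code, Client_Name → name, NACE → nac) and hypothetical sample data in place of the real `Client_Table1`; the connection and write are shown commented out since they need a running MySQL server:

```python
import pandas as pd

# Hypothetical sample data standing in for Client_Table1.
Client_Table1 = pd.DataFrame({
    'Client_ID': [101],
    'Client_Name': ['Acme'],
    'NACE': ['3811'],
})

# Rename the DataFrame columns to match the client_info table.
out = Client_Table1.rename(columns={'Client_ID': 'code',
                                    'Client_Name': 'name',
                                    'NACE': 'nac'})

# With sqlalchemy installed, the write itself would then be:
# from sqlalchemy import create_engine
# engine = create_engine('mysql+pymysql://<user>:<pass>@localhost/test')
# out.to_sql('client_info', con=engine, if_exists='append', index=False)
```

`if_exists='append'` keeps the existing table and adds the rows; `index=False` prevents the DataFrame index from being written as an extra column.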