
Importing data and table structure from CSV into MySQL using Python

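I have a CSV file with headers, and my task is to use this file to create the schema and table and import the data into a MySQL/MSSQL database. I found a good article on how to create a schema dynamically from a CSV, but I ran into two problems:

  1. The function detects boolean values as varchar(5):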
cm_mac varchar(0),
partnerid varchar(0),
version varchar(0),
accountid varchar(0),
securityedgeenabled varchar(5)); <-- should be boolean
  2. When trying to import the data, I get:
"Exception has occurred: AttributeError
'Cursor' object has no attribute 'cursor'"
"Exception has occurred: ProgrammingError
not all arguments converted during bytes formatting"

Can anyone help me solve these problems? Link to the article I mentioned

The code I am trying to run:

from flask import Flask, request, jsonify
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy import Integer, Enum
from flask_marshmallow import Marshmallow
import os
import enum
import csv
import ast
from sqlalchemy.sql.sqltypes import Boolean
import mysql.connector
import MySQLdb

f = open(r'c:\Projects\python\splunk\secedge072120.csv', 'r')
reader = csv.reader(f)

longest = []
type_list = []
headers = []


def dataType(val, current_type):
    try:
        # Evaluates numbers to an appropriate type, and strings an error
        ###!!!Here needs to add boolean!!!###
        t = ast.literal_eval(val)
    except ValueError:
        return 'varchar'
    except SyntaxError:
        return 'varchar'
    if type(t) in [int, float]:
        if (type(t) in [int]) and current_type not in ['float', 'varchar']:
            # Use smallest possible int type
            if (-32768 < t < 32767) and current_type not in ['int', 'bigint']:
                return 'smallint'
            elif (-2147483648 < t < 2147483647) and current_type not in ['bigint']:
                return 'int'
            else:
                return 'bigint'
        if type(t) is float and current_type not in ['varchar']:
            return 'decimal'
    elif (type(t) is Boolean or bool):
        return 'boolean'
    else:
        return 'varchar'


for row in reader:
    if len(headers) == 0:
        headers = row
        for col in row:
            longest.append(0)
            type_list.append('')
    else:
        for i in range(len(row)):
            # NA is the csv null value
            if type_list[i] == 'varchar' or row[i] == 'NA':
                pass
            else:
                var_type = dataType(row[i], type_list[i])
                type_list[i] = var_type
        if len(row[i]) > longest[i]:
            longest[i] = len(row[i])
f.close()


insert_headers = (tuple(headers))
insert_values = ()
insert_values = tuple('?' for header in headers)
statement = 'create table if not exists stack_overflow_survey ('

for i in range(len(headers)):
    if type_list[i] == 'varchar':
        statement = (
            statement + '\n{} varchar({}),').format(headers[i].lower(), str(longest[i]))
    else:
        statement = (statement + '\n' + '{} {}' +
                     ',').format(headers[i].lower(), type_list[i])

statement = statement[:-1] + ');'


print(statement)


mydb = MySQLdb.connect(user='root', password='1234',
                       host='127.0.0.1',
                       database='employees')


cur = mydb.cursor()

csv_data = open(r'c:\Projects\python\splunk\secedge072120.csv', 'r')
reader = csv.reader(csv_data)
print(type(reader))
for row in reader:
    print(row)
    ###!!!Here error is happening !!!###
    cur.execute(f'INSERT INTO stack_overflow_survey({headers})'  # <-- error
                f'VALUES({insert_values})', row)
mydb.commit()
mydb.close()


# cur.execute(statement)
# csv_data = csv.reader('c:\Projects\python\splunk\secedge072120.csv')

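Thanks for any help.

  • For type detection, the function identifies types by trying to evaluate each value as a Python constant; for booleans, this means it only accepts "True" and "False", not "true" or "TRUE". You may want to add an extra clause at the top that recognizes booleans regardless of case: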

     if val.lower() in ('true', 'false'): return 'boolean'
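  • I'm not sure about the errors you pasted, but the INSERT statement definitely has formatting problems where it uses ({headers}) and ({insert_values}); print it and adjust it so that there are no extra brackets and quotes. For example, you could use ({", ".join(headers)}) and ({", ".join(insert_values)}) (it gets more complicated if your headers can contain spaces or otherwise need quoting in SQL).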
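
A minimal sketch of the suggestion above, assuming the table and column names come from a trusted CSV header. The helper names quote_identifier and build_insert are hypothetical and not part of the original code. The sketch also uses %s instead of ? as the placeholder, because MySQLdb and mysql.connector both expect the %s parameter style; a statement without any %s slot is a likely cause of the "not all arguments converted during bytes formatting" error.

```python
def quote_identifier(name):
    """Wrap a MySQL identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def build_insert(table, headers):
    """Return an INSERT statement with one %s placeholder per column."""
    # Lowercase the names to match the CREATE TABLE statement in the question.
    columns = ", ".join(quote_identifier(h.lower()) for h in headers)
    placeholders = ", ".join(["%s"] * len(headers))
    return f"INSERT INTO {quote_identifier(table)} ({columns}) VALUES ({placeholders})"


headers = ["cm_mac", "PartnerId", "Version", "AccountId", "SecurityEdgeEnabled"]
statement = build_insert("stack_overflow_survey", headers)
print(statement)
```

The resulting string can then be passed together with a row, as in cur.execute(statement, row), so the driver handles the quoting of the values.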

sabik, thanks. After adding this code, almost all of my fields are boolean:

    try:
        # Evaluates numbers to an appropriate type, and strings an error
        ###!!!Here needs to add boolean!!!###
        t = ast.literal_eval(val)
    except ValueError:
        if val.lower() in ('true', 'false') or ('True', 'False'):
            return 'boolean'
        else:
            return 'varchar'
    except SyntaxError:
        if val.lower() in ('true', 'false') or ('True', 'False'):
            return 'boolean'
        else:
            return 'varchar'
    if type(t) in ('true', 'false') or ('True', 'False'):
        return 'boolean'
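
Nearly every column comes out as boolean here because Python parses val.lower() in ('true', 'false') or ('True', 'False') as (val.lower() in ('true', 'false')) or ('True', 'False'), and a non-empty tuple is always truthy. The following is a sketch of one way to restructure the check, assuming boolean cells are spelled true/false in any letter case. The name detect_type is hypothetical; the integer ranges follow the original function but use inclusive bounds, and a column that mixes booleans with other values falls back to varchar.

```python
import ast

# Integer types ordered from narrowest to widest.
INT_TYPES = ["smallint", "int", "bigint"]


def detect_type(val, current_type):
    """Return the column type after seeing one more cell value."""
    if current_type == "varchar":
        return "varchar"  # varchar is the catch-all; never narrow it again
    # Check booleans first, regardless of letter case.
    if val.strip().lower() in ("true", "false"):
        return "boolean" if current_type in ("", "boolean") else "varchar"
    if current_type == "boolean":
        return "varchar"  # a boolean column met a non-boolean value
    try:
        t = ast.literal_eval(val)
    except (ValueError, SyntaxError, TypeError):
        return "varchar"
    if type(t) is int:
        if current_type == "decimal":
            return "decimal"
        if -32768 <= t <= 32767:
            size = "smallint"
        elif -2147483648 <= t <= 2147483647:
            size = "int"
        else:
            size = "bigint"
        # Keep the widest integer type seen so far for this column.
        if current_type in INT_TYPES:
            return max(size, current_type, key=INT_TYPES.index)
        return size
    if type(t) is float:
        return "decimal"
    return "varchar"


print(detect_type("TRUE", ""), detect_type("42", "int"), detect_type("abc", ""))
```

Returning early for varchar mirrors the loop in the question, which already skips columns that have been marked as varchar.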

Checking my query with

query_result = (f'INSERT INTO stack_overflow_survey {insert_headers}' f'VALUES {insert_values}')

I get this:

INSERT INTO stack_overflow_survey ('cm_mac', 'PartnerId', 'Version', 'AccountId', 'SecurityEdgeEnabled')VALUES ('?', '?', '?', '?', '?')

The field names should be without quotes. How can I achieve that with a tuple or a list? Thanks.
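
The quotes appear because formatting a tuple into an f-string inserts the tuple's repr; joining the names into one string, as the answer suggests, avoids that. Separately, the insert loop in the question also sends the header row and the raw NA/true/false strings to the database. Below is a sketch of preparing the rows first, assuming NA marks a null (as the comment in the original code says) and that a list of detected column types is available. The names clean_value, load_rows and RecordingCursor are hypothetical; RecordingCursor only stands in for a real DB-API cursor so that the sketch runs without a server.

```python
import csv
import io


def clean_value(cell, sql_type):
    """Map one CSV cell to a Python value the driver can bind."""
    if cell == "NA":
        return None  # bound as SQL NULL
    if sql_type == "boolean":
        return int(cell.strip().lower() == "true")  # 1 or 0
    return cell


def load_rows(cursor, statement, csv_file, type_list):
    """Insert every data row of csv_file with a parameterised statement."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row
    rows = [
        tuple(clean_value(cell, t) for cell, t in zip(row, type_list))
        for row in reader
    ]
    cursor.executemany(statement, rows)
    return len(rows)


class RecordingCursor:
    """Stand-in cursor that records calls instead of talking to MySQL."""

    def __init__(self):
        self.calls = []

    def executemany(self, statement, rows):
        self.calls.append((statement, rows))


sample = io.StringIO("cm_mac,SecurityEdgeEnabled\nAA:BB,true\nNA,False\n")
cur = RecordingCursor()
count = load_rows(
    cur,
    "INSERT INTO t (cm_mac, securityedgeenabled) VALUES (%s, %s)",
    sample,
    ["varchar", "boolean"],
)
print(count, cur.calls[0][1])
```

With a real connection, the cursor from mydb.cursor() would take the place of RecordingCursor, followed by mydb.commit().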

