
I'm getting an error when scraping real-time data and loading it into a MySQL database with Python (TypeError: not enough arguments for format string)

I have a links.csv file, and I use pd.read_csv to read its links one by one. My CSV file looks like this: https://im.ge/i/1i38PY

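With my first code I can scrape, in real time, all the information from every link that takes me to the site, except the fourth one. The data is saved directly in the MySQL CARFINAL table, like this: https://im.ge/i/1igBaG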

The error I get for the fourth link is: TypeError: not enough arguments for format string. For that link, print(df) looks like this: https://im.ge/i/1i8tgT

Here is my first code:


    cursor = scrap_db.cursor()

    
    # Drop table as per requirement
    # cursor.execute('DROP TABLE IF EXISTS CARFINAL')

    # Create table as per requirement

    sql = """CREATE TABLE CARFINAL(
        brand VARCHAR(120),
        model VARCHAR(120),
        model_version VARCHAR(120),
        location VARCHAR(60),
        price VARCHAR(80),
        dealer VARCHAR(60),
        contact_name VARCHAR(60),
        tel_number VARCHAR(50),
        mileage VARCHAR(50),
        gearbox VARCHAR(60),
        first_registration VARCHAR(30),
        fuel_type VARCHAR(120),
        power VARCHAR(60),
        seller VARCHAR(60),
        body_type VARCHAR(30),
        type VARCHAR(10),
        drivetrain VARCHAR(10),
        seats int(11),
        doors int(11),
        country_version VARCHAR(20),
        offer_number VARCHAR(20),
        model_code int(11),
        production_date int(11),
        general_inspection int(11),
        previous_owner int(11),
        full_service_history VARCHAR(10),
        non_smoker_vehicle VARCHAR(10),
        engine_size VARCHAR(20),
        gears VARCHAR(10),
        cylinders VARCHAR(10),
        fuel_consumption VARCHAR(60),
        CO2_emissions VARCHAR(30),
        energy_efficiency_class VARCHAR(10),
        CO2_efficiency VARCHAR(80),
        emission_class VARCHAR(20),
        emissions_sticker VARCHAR(10),
        colour_and_upholstery VARCHAR(60),
        all_equipment VARCHAR(300),
        vehicle_description VARCHAR(400),
        car_picture_link VARCHAR(200),
        link VARCHAR(200)
        )"""

    cursor.execute(sql)
    
    
    
    #Save data to the table

    #scrap_db = pymysql.connect(host='localhost',user='root',password='****',database='autoscout',charset='utf8mb4',cursorclass=pymysql.cursors.DictCursor)
                                                
    mySql_insert_query = """INSERT INTO CARFINAL
        (brand,
        model,
        model_version,
        location,
        price,
        dealer,
        contact_name,
        tel_number,
        mileage,
        gearbox,
        first_registration,
        fuel_type,
        power,
        seller,
        body_type,
        type,
        drivetrain,
        seats,
        doors,
        country_version,
        offer_number,
        model_code,
        production_date,
        general_inspection,
        previous_owner,
        full_service_history,
        non_smoker_vehicle,
        engine_size,
        gears,
        cylinders,
        fuel_consumption,
        CO2_emissions,
        energy_efficiency_class,
        CO2_efficiency,
        emission_class,
        emissions_sticker,
        colour_and_upholstery,
        all_equipment,
        vehicle_description,
        car_picture_link,
        link
        )
            VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s) """
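    # Insert the scraped rows one at a time; each row must supply exactly
    # 41 values, one per %s placeholder in the query above.
    for row_count in range(0, df.shape[0]):
        chunk = df.iloc[row_count:row_count + 1, :].values.tolist()
        tuple_of_tuples = tuple(tuple(x) for x in chunk)

        cursor = scrap_db.cursor()
        cursor.executemany(mySql_insert_query, tuple_of_tuples)
        scrap_db.commit()
        print(cursor.rowcount, "Record inserted successfully into CARFINAL table")
    # Close the connection once, after all rows have been inserted
    scrap_db.close()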

len_of_links = len(make_model_ads_data_latest)
number = np.arange(4,5)
j = 0
for i in tqdm(number):
    ad_link = make_model_ads_data_latest['ad_link'][i]
    #ad_link = make_model_ads_data_latest['ad_link'][i+1] #BAK
    
    if ad_link not in make_model_ads_data['link'].values:
        data = get_ad_data(ad_link = ad_link, sleep_time = 0)
        j = j + 1
        
print("scraped ", j, " new ads")           

By the way, my second version of the code works. I only replaced the first code with the following: ...

    sql = """CREATE TABLE CAR2(
        brand VARCHAR(120),
        model VARCHAR(120),
        model_version VARCHAR(120),
        location VARCHAR(60),
        price VARCHAR(80),
        dealer VARCHAR(60),
        contact_name VARCHAR(60),
        tel_number VARCHAR(50),
        mileage VARCHAR(50),
        gearbox VARCHAR(60),
        first_registration VARCHAR(30),
        fuel_type VARCHAR(120),
        power VARCHAR(60),
        seller VARCHAR(60),
        body_type VARCHAR(30),
        type VARCHAR(10),
        seats int(11),
        doors int(11),
        country_version VARCHAR(20),
        model_code  VARCHAR(20),
        engine_size VARCHAR(20),
        colour_and_upholstery VARCHAR(30),
        all_equipment VARCHAR(300),
        vehicle_description VARCHAR(400),
        car_picture_link VARCHAR(200),
        link VARCHAR(200)
        )"""

    cursor.execute(sql)
    
    
    
    #Save data to the table

    #scrap_db = pymysql.connect(host='localhost',user='root',password='1234',database='autoscout',charset='utf8mb4',cursorclass=pymysql.cursors.DictCursor)
                                                
    mySql_insert_query = """INSERT INTO CAR2
        (brand,
        model,
        model_version,
        location,
        price,
        dealer,
        contact_name,
        tel_number,
        mileage,
        gearbox,
        first_registration,
        fuel_type,
        power,
        seller,
        body_type,
        type,
        seats,
        doors,
        country_version,  
        model_code,
        engine_size,
        colour_and_upholstery,
        all_equipment,
        vehicle_description,
        car_picture_link,
        link
        )
            VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s) """

But I don't want to change the table structure every time I scrape.

I'm stuck. Please help.

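If I understand the question correctly, the problem is with rows that have fewer items than the query expects. Could you fill such a row, for example with DataFrame.fillna?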
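A minimal sketch of that idea, assuming the scraped `df` is simply missing some of the table's columns for certain ads. Note that `fillna` on its own only fills NaN cells and does not add absent columns, so the sketch first uses `DataFrame.reindex` to force a fixed column set and then converts NaN to `None` so the driver can write NULL. `EXPECTED_COLUMNS` is shortened for illustration and `frame_to_rows` is a hypothetical helper name, not part of the original code.

```python
import pandas as pd

# Shortened for illustration (assumption); in practice, list all 41
# CARFINAL columns in the same order as in the INSERT statement.
EXPECTED_COLUMNS = ["brand", "model", "price", "drivetrain", "link"]

def frame_to_rows(df):
    """Return a list of tuples, each with exactly len(EXPECTED_COLUMNS) values."""
    # reindex adds any missing column (filled with NaN) and drops unexpected ones
    aligned = df.reindex(columns=EXPECTED_COLUMNS)
    # Replace NaN with None so the database driver sends NULL
    return [
        tuple(None if pd.isna(value) else value for value in row)
        for row in aligned.itertuples(index=False, name=None)
    ]

# An ad whose page has no "drivetrain" field
scraped = pd.DataFrame([
    {"brand": "Audi", "model": "A4", "price": "9000", "link": "https://example.com/ad"}
])
rows = frame_to_rows(scraped)
```

The resulting `rows` can then be passed to `cursor.executemany(mySql_insert_query, rows)`, since every tuple now has one value per placeholder.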

In other words, make sure your tuples always have the required size. Another option is to append something like [''] * (desired_len - act_len).
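The padding option could look roughly like the sketch below. `pad_row` is a hypothetical helper, and the three-column query is only a stand-in for the 41-column one. Padding appends values at the end, so it is only correct when the missing fields are the trailing columns; otherwise values shift into the wrong columns and aligning by column name is safer.

```python
def pad_row(row, desired_len, filler=None):
    """Return row as a tuple, right-padded with filler up to desired_len."""
    row = tuple(row)
    if len(row) > desired_len:
        raise ValueError(
            f"row has {len(row)} values, expected at most {desired_len}"
        )
    return row + (filler,) * (desired_len - len(row))

# Stand-in query (assumption); the real one has 41 placeholders
query = "INSERT INTO t (a, b, c) VALUES (%s,%s,%s)"
desired_len = query.count("%s")  # number of values each row must supply

short_row = ["Audi", "A4"]  # one value too few
padded = pad_row(short_row, desired_len)
```

Pass `filler=''` to pad with empty strings as suggested above; `None` is used by default here so that integer columns receive NULL instead. For context, Python's `%` operator raises exactly "not enough arguments for format string" when it is given fewer values than placeholders, which is what happens in drivers such as pymysql that fill `%s` via string formatting.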
