Python scraping and importing as Excel file: TypeError: must be real number, not dict

With this code I'm trying to scrape a website, https://wish2.ma/product-category/maison-cuisine, and save the data to an Excel file. The scraping logic works well: I can loop through the pages and extract the data I want. However, I get stuck at line 70, where the results are written to the worksheet. Here is my code:

import os  # needed for os.path.join / os.getcwd when saving the workbook

import requests
import win32com.client as win32
from bs4 import BeautifulSoup
from requests_html import HTMLSession

s = HTMLSession()

links = []
for x in range(3,4):
    print(x)
    url = f'https://wish2.ma/product-category/maison-cuisine/page/{x}'
    r = s.get(url)
    items = r.html.find('li.product-type-simple')
    for item in items:
        links.append(item.find('a', first=True).attrs['href'])   

def get_productdata(link):
    r = s.get(link)
    #title = r.html.find('h1', first=True)
    price = r.html.find('span.woocommerce-Price-amount.amount bdi')[0].full_text
    price2 = r.html.find('span.woocommerce-Price-amount.amount bdi')[1].full_text
    tag = r.html.find('a[rel=tag]', first=True).full_text
    category =r.html.find('span.ast-woo-product-category')[0].full_text
    r = requests.get(link)
    soup=BeautifulSoup(r.content.decode('utf-8'), 'html.parser')
    title = soup.find('h1', {'class': 'product_title'})
    description = soup.find('div', {'class': 'woocommerce-tabs'}).decode_contents()
    print(title)
    #description = r.html.find('div.woocommerce-tabs')

    product = {
        'title': title.text,
        'price': price.strip(),
        'price2': price2.strip(),
        'tag': tag.strip(),
        'category': category.strip(),
        'description': description
    }
    return product

results = []
#links = get_links()
print(len(links))
ExcelApp = win32.dynamic.Dispatch('Excel.Application')
ExcelApp.Visible = True

for link in links:
    print(link)
    results.append(get_productdata(link))
    break
wb = ExcelApp.Workbooks.Add()
ws = wb.Worksheets(1)

header_labels = ['title','price','price2','tag','category','description']

for indx, val in enumerate(header_labels):
    ws.Cells(1, indx + 1).Value = val
row_tracker = 2
column_size = len(header_labels)
for result in results:

    ws.Range(
        ws.Cells(row_tracker, 1),
        ws.Cells(row_tracker, column_size)
    ).value=result
    row_tracker += 1
wb.SaveAs(os.path.join(os.getcwd(),'hhhh.xlsx'), 51)
wb.Close()
ExcelApp.Quit()

This is the error message I get when running the script:

Traceback (most recent call last):
  File "C:\Users\kamal\Desktop\amina\scrape_wish.py", line 69, in <module>
    ).value=result
  File "C:\Users\kamal\AppData\Local\Programs\Python\Python310\lib\site-packages\win32com\client\dynamic.py", line 698, in __setattr__
    self._oleobj_.Invoke(entry.dispid, 0, invoke_type, 0, value)
TypeError: must be real number, not dict

I can't understand the error or how to fix it. Please help me.

Thanks to Michael, I found the issue in this block of code:

product = {
    'title': title.text,
    'price': price.strip(),
    'price2': price2.strip(),
    'tag': tag.strip(),
    'category': category.strip(),
    'description': description
}

I was trying to assign an object (a dict) to the range here, instead of an array (a list):

ws.Range(
    ws.Cells(row_tracker, 1),
    ws.Cells(row_tracker, column_size)
).value=result

So I built a list like this one instead:

product = [title.text, price.strip(), price2.strip(), tag.strip(), category.strip(), description]
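If you would rather keep the dictionary (for example, to reuse it elsewhere), one option is to convert it to a list just before writing the row. The sketch below assumes the keys match the entries in `header_labels`. The helper name `product_to_row` and the sample values are made up for illustration and are not part of the original script.

```python
def product_to_row(product, header_labels):
    """Return the product's values as a list, ordered like header_labels."""
    # Missing keys become empty strings so every row has the same width.
    return [product.get(label, '') for label in header_labels]


header_labels = ['title', 'price', 'price2', 'tag', 'category', 'description']

# Made-up sample values, for illustration only.
product = {
    'description': '<p>Sample description</p>',
    'title': 'Sample product',
    'price': '100',
    'price2': '80',
    'tag': 'kitchen',
    'category': 'Maison & Cuisine',
}

row = product_to_row(product, header_labels)
print(row)
```

The resulting list could then be assigned to the range in place of the dict, e.g. `.Value = product_to_row(result, header_labels)`, so the column order always follows the header row.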

I also added a short pause to give the Excel app time to save before closing:

time.sleep(2)
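Separately from the pause, it may help to make sure the workbook is closed and Excel quits even when an exception (such as the TypeError above) interrupts the script, since otherwise an Excel process can stay open. The sketch below is one possible way to do that with a context manager. The name `open_workbook` is hypothetical, and the `Fake*` classes are stand-ins so the example runs without Excel; in the real script `app` would be the `ExcelApp` object. I have not confirmed whether this makes the pause unnecessary.

```python
from contextlib import contextmanager


@contextmanager
def open_workbook(app):
    """Yield a new workbook, then close it and quit `app`, even on errors."""
    wb = app.Workbooks.Add()
    try:
        yield wb
    finally:
        wb.Close(False)  # False: close without a save prompt (unsaved changes are discarded)
        app.Quit()


# Stand-in objects (not part of win32com) that only record the call order.
class FakeWorkbook:
    def __init__(self, log):
        self.log = log

    def SaveAs(self, path, file_format):
        self.log.append('SaveAs')

    def Close(self, save_changes):
        self.log.append('Close')


class FakeWorkbooks:
    def __init__(self, log):
        self.log = log

    def Add(self):
        return FakeWorkbook(self.log)


class FakeApp:
    def __init__(self):
        self.log = []
        self.Workbooks = FakeWorkbooks(self.log)

    def Quit(self):
        self.log.append('Quit')


app = FakeApp()
with open_workbook(app) as wb:
    wb.SaveAs('hhhh.xlsx', 51)  # 51 is the .xlsx format code used in the script
print(app.log)
```

With this pattern the save happens inside the `with` block, and the close and quit calls always run afterwards in that order.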

