
Having a syntax error in web scraping code (Python 3)?

from bs4 import BeautifulSoup as soup
from urllib.request import urlopen as uReq

my_url = 'https://www.flipkart.com/search?q=iphone+12&sid=tyy%2C4io&as=on&as-show=on&otracker=AS_QueryStore_OrganicAutoSuggest_1_6_na_na_na&otracker1=AS_QueryStore_OrganicAutoSuggest_1_6_na_na_na&as-pos=1&as-type=HISTORY&suggestionId=iphone+12%7CMobiles&requestId=71ed5a8e-4348-4fef-9af8-43b7be8c4d83'

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")

containers = page_soup.findAll("div", {"class": "_13oc-S"})
# len(containers) gives the number of products on the page
#print(len(containers))

# prettify() prints the first container's HTML in an organised, indented form
#print(soup.prettify(containers[0]))

container=containers[0]
# container.div.img["alt"] holds the name of the product
#print(container.div.img["alt"])

# The div with class "col col-5-12 nlI3QM" holds the price of the product
price=container.findAll("div",{"class":"col col-5-12 nlI3QM"})
#print(price[0].text)

ratings=container.findAll("div",{"class":"gUuXy-"})
#print(ratings[0].text)

#Making a file
filename="products.csv"
f= open(filename, "w")

#Naming the headers
headers="Product_Name,Pricing,Ratings\n"
f.write(headers)

for container in containers:
    product_name = container.div.img["alt"]

    price_container = container.findAll("div", {"class": "col col-5-12 nlI3QM"})
    price = price_container[0].text.strip()

    rating_container = container.findAll("div", {"class":"gUuXy-"})
    rating = rating_container[0].text 

    #print("product_name:" + product_name)
    #print("price:" + price)
    #print("ratings:" + rating)

    #string parsing
    trim_price = ''.join(price.split(','))
    rm_rupee = trim_price.split("&#8377")
    add_rs_price = "Rs." + rm_rupee[0]
    split_price = add_rs_price.split('E')
    final_price = split_price[0]

    split_rating = rating.split(" ")
    final_rating = split_rating[0]

    print(product_name.replace(",", "|") + "," + final_price + "," + final_rating + "\n")
    f.write(product_name.replace(",", "|") + "," + final_price + "," + final_rating + "\n")

f.close()

f.write(product_name.replace(",", "|") + "," + final_price + "," + final_rating + "\n")

I get an error on this specific line. I want to create a .csv file, but the products are not being written to the file. The error is:

Exception has occurred: UnicodeEncodeError
'charmap' codec can't encode character '\u20b9' in position 35: character maps to <undefined>
  File "D:\Visual Code Folder\Python\Scraping_Flipkart.py", line 61, in <module>
    f.write(product_name.replace(",", "|") + "," + final_price + "," + final_rating + "\n")

Replace this:

f= open(filename, "w")

with this:

import io
f = io.open(filename, "w", encoding="utf-8")

Using io gives you backward compatibility with Python 2.
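The cause of the error can be reproduced without scraping anything. When open() is called in text mode without an encoding argument, Python falls back to a platform-dependent default, which on Windows is often a legacy code page with no mapping for the rupee sign. The sketch below assumes cp1252 is that code page (the 'charmap' in the error message points to a codec of this kind, but the exact one is a guess) and uses a made-up price string:

```python
# Hypothetical price string containing the rupee sign (U+20B9).
price = "\u20b979,900"

# cp1252 is an assumption standing in for the Windows default encoding.
try:
    price.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    # cp1252 has no byte for U+20B9, so encoding fails.
    cp1252_ok = False

# UTF-8 can encode every Unicode character, including the rupee sign.
utf8_bytes = price.encode("utf-8")

print(cp1252_ok)
print(utf8_bytes)
```

Passing encoding="utf-8" when opening the file makes the write use the second path regardless of the platform default.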

If you only need to support Python 3, you can use the builtin open function instead:

with open(filename, "w", encoding="utf-8") as f:
    f.write(headers)
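As a further option, the rows could be written with the standard csv module instead of joining strings by hand. This is only a sketch, not the asker's code: the row values are hypothetical stand-ins for the scraped data, and the file goes to a temporary directory so the example has no side effects.

```python
import csv
import os
import tempfile

# Hypothetical rows standing in for the scraped values.
rows = [
    ("APPLE iPhone 12 (Black, 64 GB)", "\u20b979,900", "4.6"),
]

# Temporary location, chosen only to keep the sketch self-contained.
path = os.path.join(tempfile.mkdtemp(), "products.csv")

# encoding="utf-8" avoids the UnicodeEncodeError;
# newline="" is what the csv documentation recommends for csv.writer.
with open(path, "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Product_Name", "Pricing", "Ratings"])
    writer.writerows(rows)

# Read the file back to confirm the rupee sign and commas survived.
with open(path, encoding="utf-8", newline="") as f:
    read_back = list(csv.reader(f))

print(read_back)
```

Because csv.writer quotes any field that contains a comma, replacing commas in the product name with "|" is no longer needed in this variant.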
