The parser does not write to JSON; it only writes {}

I created a market parser for my own purposes, and overall it works well!

Initially I ran into a problem when writing the file: it raised a decode error. After I changed something, the error disappeared, but now the parser does not write the parsed data to the JSON file. It simply writes two characters: {}

Here is main.py:

import json
import requests
from bs4 import BeautifulSoup


def get_first_news():
    headers = {
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.101 Safari/537.36"
    }

    url = "https://funpay.ru/lots/700/"
    r = requests.get(url=url, headers=headers)

    soup = BeautifulSoup(r.text, "lxml")
    articles_cards = soup.find_all("a", class_="tc-desc-text")

    news_dict = {}
    for article in articles_cards:
        article_title = article.find("div", class_="tc-desc-text").text.strip()
        article_desc = article.find("div", class_="tc-price").text.strip()
        article_url = f'https://funpay.ru/lots/700/{article.get("href")}'

        article_id = article_url.split("=")[-1]

        # print(f"{article_title} | {article_url} | {article_date_timestamp}")

        news_dict[article_id] = {
            "article_title": article_title,
            "article_url": article_url,
            "article_desc": article_desc
        }

    with open("news_dict.json", "w") as file:
        json.dump(news_dict, file, indent=4, ensure_ascii=False)


def main():
    get_first_news()


if __name__ == '__main__':
    main()

Here is test.py:

# url = "https://www.securitylab.ru/news/520908.php"
#
# article_id = url.split("/")[-1]
# article_id = article_id[:-4]
# print(article_id)
import json

with open("news_dict.json") as file:
    news_dict = json.load(file)

search_id = "520908123"

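if search_id in news_dict:
    # "The news item is already in the dictionary, skipping this iteration"
    print("Новость уже есть в словаре, пропускаем итерацию")
else:
    # "Fresh news item, adding it to the dictionary"
    print("Свежая новость, добавляем в словарь")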

Here is news_dict.json:

{}
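A two-character file is exactly what json.dump produces for an empty dictionary, which suggests that the loop in main.py never added an entry to news_dict. The following is a minimal sketch of that behaviour, using an in-memory buffer (an assumption made for illustration) in place of the real news_dict.json file:

```python
import io
import json

# An empty dictionary, as news_dict would be if the for loop never ran.
news_dict = {}

# In-memory stand-in for the file opened in main.py (illustrative only).
buffer = io.StringIO()
json.dump(news_dict, buffer, indent=4, ensure_ascii=False)

output = buffer.getvalue()
print(repr(output))  # '{}'
```

So the write step itself is probably working; the thing to check is whether articles_cards contains any elements.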

In articles_cards = soup.find_all("a", class_="tc-desc-text"), replace "a" with "div". With "a", find_all presumably matches no elements on the page, so the loop never runs and the empty news_dict is written as {}. Here is what it should look like:

articles_cards = soup.find_all("div", class_="tc-desc-text")
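To see why the tag name matters, here is a small sketch run against hypothetical markup. The HTML below is an assumption loosely modelled on the class names in the question, and the real funpay.ru page may be structured differently. It uses the built-in "html.parser" instead of "lxml" only to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration; not copied from the real site.
SAMPLE_HTML = """
<a class="tc-item" href="https://funpay.ru/lots/offer?id=111">
  <div class="tc-desc-text">First lot</div>
  <div class="tc-price">10 RUB</div>
</a>
<a class="tc-item" href="https://funpay.ru/lots/offer?id=222">
  <div class="tc-desc-text">Second lot</div>
  <div class="tc-price">20 RUB</div>
</a>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The class is on <div> elements, so filtering on <a> finds nothing.
as_anchor = soup.find_all("a", class_="tc-desc-text")
as_div = soup.find_all("div", class_="tc-desc-text")

print(len(as_anchor))  # 0
print(len(as_div))     # 2
print([d.text.strip() for d in as_div])  # ['First lot', 'Second lot']
```

Note that once the outer selector matches the div itself, the inner article.find("div", class_="tc-desc-text") call in the loop looks for a nested element with the same class. If there is none, find returns None and .text raises AttributeError, so the lookups inside the loop may need adjusting as well.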
