
Need guidance inserting data from a Python script into a MySQL database

I have crawled a webpage to extract certain information such as price, header and so on.

Now my goal is to insert that information into a database. I have already set up the database with the fields that are needed.

This is my code:

import json
import sys
import urllib.request

import mysql.connector
from bs4 import BeautifulSoup

# Region and Spider are defined elsewhere in the script (not shown in the post).

def trade_spider(max_pages):
    Language = "Japanese"
    # Assumed to be string literals; the quotes are missing in the post.
    partner = "La"
    location = "Tokyo"
    already_printed = set()
    for reg in Region:
        count = 0
        count1 = 0
        page = -1
        while page <= max_pages:
            page += 1
            response = urllib.request.urlopen("http://www.jsox.de/s/search.json?q=" + str(reg) +"&page=" + str(page))
            jsondata = json.loads(response.read().decode("utf-8"))
            format = (jsondata['activities'])
            g_data = format.strip("'<>()[]\"` ").replace('\'', '\"')
            soup = BeautifulSoup(g_data)

            articles = soup.find_all("article", {"class": "activity-card activity-card-horizontal "})

            try:
                connection = mysql.connector.connect\
                    (host = "localhost", user = "root", passwd ="", db = "crawl")
            except:
                print("No connection to Server")
                sys.exit(0)

            cursor = connection.cursor()

            # Values are concatenated into the SQL text without quoting.
            cursor.execute("DELETE from prices_crawled where Location=" + str(location) + " and Partner=" + str(partner))
            connection.commit()

            for article in articles:
                headers = article.find_all("h3", {"class": "activity"})
                for header in headers:
                    header_initial = header.text.strip()
                    if header_initial not in already_printed:
                        already_printed.add(header_initial)
                        header_final = header_initial


                prices = article.find_all("span", {"class": "price"})
                for price in prices:
                    price_end = price.text.strip().replace(",","")[2:]
                    count1 += 1
                    if count1 > count:
                        pass
                    else:
                        price_final = price_end


                deeplinks = article.find_all("a", {"class": "activity-card"})
                for t in set(t.get("href") for t in deeplinks):
                    deeplink_initial = t
                    if deeplink_initial not in already_printed:
                        already_printed.add(deeplink_initial)
                        deeplink_final = deeplink_initial

                        cursor.execute('''INSERT INTO prices_crawled (price_id, Header, Price, Deeplink, Partner, Location, Language) \
                                VALUES(%s, %s, %s, %s, %s, %s, %s)''', ['None'] + [header_final] + [price_final] + [deeplink_final] + [partner] + [location] + [Language])
                        connection.commit()

            cursor.close()
            connection.close()

trade_spider(int(Spider))

The issue is that the information does not get into the database. Furthermore, I do not get any error message, so I do not know what I'm doing wrong.

Could you help me out? Any feedback is appreciated.

Is the DELETE statement working? I think the problem is the way you pass your variables.
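To make that concrete, here is a small sketch of the SQL text that the concatenated DELETE from the question produces. It assumes `location` and `partner` hold the plain strings "Tokyo" and "La"; nothing is sent to a database.

```python
# Assumed values: the question's `location` and `partner`, taken to be plain strings.
location = "Tokyo"
partner = "La"

# Same concatenation as in the question's cursor.execute() call.
delete_sql = ("DELETE from prices_crawled where Location=" + str(location)
              + " and Partner=" + str(partner))

print(delete_sql)
```

The values end up in the statement without quotes, so MySQL would read `Tokyo` and `La` as column names rather than as string literals.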

Change your syntax like this:

sql_insert_tx = "INSERT INTO euro_currencies (pk,currency,rate,date) values (null,'USD','%s','%s')" % (usd,date)

cursor.execute(sql_insert_tx)
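As an alternative to formatting the values into the string yourself, `cursor.execute()` in mysql.connector also accepts the values as a second argument and fills the `%s` placeholders itself, which takes care of quoting. The sketch below applies that to the question's `prices_crawled` table. `build_delete` and `build_insert` are hypothetical helper names, the sample values are made up, and passing `None` for `price_id` assumes that column is auto-increment or nullable.

```python
def build_delete(location, partner):
    """Return (sql, params) for a parameterized DELETE on prices_crawled."""
    sql = "DELETE FROM prices_crawled WHERE Location = %s AND Partner = %s"
    return sql, (location, partner)


def build_insert(header, price, deeplink, partner, location, language):
    """Return (sql, params) for a parameterized INSERT into prices_crawled."""
    sql = (
        "INSERT INTO prices_crawled "
        "(price_id, Header, Price, Deeplink, Partner, Location, Language) "
        "VALUES (%s, %s, %s, %s, %s, %s, %s)"
    )
    # Python None is sent as SQL NULL; the string 'None' would be sent as text.
    params = (None, header, price, deeplink, partner, location, language)
    return sql, params


# Illustrative values only (not taken from the crawled site).
delete_sql, delete_params = build_delete("Tokyo", "La")
insert_sql, insert_params = build_insert(
    "Some header", "1200", "/activity/1", "La", "Tokyo", "Japanese"
)

# With an open mysql.connector connection and cursor, these would run as:
#     cursor.execute(delete_sql, delete_params)
#     cursor.execute(insert_sql, insert_params)
#     connection.commit()
print(delete_sql, delete_params)
print(insert_sql, insert_params)
```

The question's INSERT already passes its values this way; the DELETE is the statement that builds its SQL by concatenation.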

