
Pandas dataframe: insert every value into a MySQL database

I have a Python script that scrapes domain names. The endpoint returns a JSON response whose body field contains rendered HTML, so I parse the JSON and use pandas to read the HTML in body. When I print the result, I get the correct values. Now I want to save every result in a MySQL database. How can I achieve this?

Here is my script:

import json
import time

import mysql.connector
import pandas as pd

# driver is a Selenium WebDriver created earlier (not shown in the original post)

mydb = mysql.connector.connect(
    host="localhost",
    user="root",
    passwd="",
    database='domainscrape'
)

mycursor = mydb.cursor()
print(mydb)


pageNumber = 0
while True:
    driver.implicitly_wait(3)
    driver.get('https://reversewhois.domaintools.com/?ajax=mReverseWhois&call=ajaxGetPreviewPage&q=%5B%5B%5B%22whois%22%2C%222%22%2C%22VerifiedID%40SG-Mandatory%22%5D%5D%5D&o='+str(pageNumber))
    time.sleep(3)
    # The JSON payload is rendered inside a <pre> element
    pre = driver.find_element_by_tag_name("pre").text
    data = json.loads(pre)
    if data['body']:
        # body holds an HTML table; skip its header row
        table = data['body']
        tables = pd.read_html(table,skiprows=1)
        df = tables[-1]
        print(df.to_string(index=False))
        pageNumber += 1
        continue
    else:
        # An empty body means there are no more pages
        break

I got a result like this:

  0vh-cl0ud.sg  2017-10-12                                 KEY-SYSTEMS GMBH
  0vh-cloud.sg  2017-10-12                                 KEY-SYSTEMS GMBH
  0vhcloud.sg   2017-10-12                                 KEY-SYSTEMS GMB

I tried saving it to a CSV file and got good results:

df.to_csv('Domains.csv', mode='a', sep=',',index=False)

But I don't want to import a CSV into MySQL. I just want to insert the rows directly into an existing MySQL table.

How can I map the values so that 0vh-cl0ud.sg is the domain, 2017-10-12 is the date, and KEY-SYSTEMS GMBH is the company? I did not include the header because it is printed on every iteration and I don't want it.

It should be something like this:

mycursor = mydb.cursor()
mycursor.execute("INSERT INTO table_name (domain, date, company) VALUES ('0vh-cl0ud.sg', '2017-10-12', 'KEY-SYSTEMS GMBH')")
mydb.commit()  # mysql.connector does not autocommit by default

This piece should be placed inside the loop, after the data is scraped. Please go through the links mentioned in the comments for a better understanding of the process.
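
The hard-coded statement can be generalized to every scraped row. The following is only a sketch, not a tested solution for this site: it assumes df has exactly three columns in the order domain, date, company (as in the printed output), and that table_name and its column names match the existing table. insert_rows and INSERT_SQL are hypothetical names introduced here for illustration. It uses a parameterized query with %s placeholders, so the connector handles quoting instead of values being concatenated into the SQL string.

```python
# Hypothetical table and column names; adjust them to the existing table.
INSERT_SQL = "INSERT INTO table_name (domain, date, company) VALUES (%s, %s, %s)"


def insert_rows(cursor, df):
    """Insert every row of df with one parameterized executemany call.

    Assumes df has three columns in the order domain, date, company.
    Returns the number of rows handed to the cursor.
    """
    rows = []
    for row in df.itertuples(index=False, name=None):
        # NaN is the only value that is not equal to itself;
        # replace it with None so it is stored as SQL NULL.
        rows.append(tuple(None if value != value else value for value in row))
    if rows:
        cursor.executemany(INSERT_SQL, rows)
    return len(rows)
```

Inside the loop, this would be called as insert_rows(mycursor, df) right after df = tables[-1], followed by mydb.commit() so that the rows are actually persisted.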

